Thursday, December 8, 2011

Questions from my Inbox: Siblings

I was looking at my site’s traffic, and I noticed that an awful lot of people find my website by googling questions about the DNA shared between siblings.

I thought it might be useful to answer some of these questions and clear a few things up.

“What percent of my DNA is the same as my sister’s?”
“How much of my DNA is shared with a sibling?”
“How much DNA should brothers and sisters share?”

The short answer is that siblings share an average of 50% of their DNA.

Here’s why siblings share 50% of their DNA.

Your mom has two copies of every piece of DNA and your dad also has two copies. To make a child, each parent passes on one copy of their DNA – consequently, the child will also have two copies of that piece of DNA.

I have drawn an example below.

Let’s say that on this particular piece of DNA, you are outcome #3 – ‘green and purple’. You got a ‘green’ copy of DNA from your mom and a ‘purple’ copy from your dad.

Your parents can produce a child with any one of the four combinations that I drew. Take a look at how the four possible outcomes compare to your outcome (green and purple).

There is a one out of four chance (25%) that a new child of your parents would be an identical match to you (that is to say, they would also get a green gene from mom and a purple gene from dad).

There is a two out of four chance that the new child would be only half-identical to you. If the child was outcome #1, they would have an identical gene from your dad (purple) but a different gene from your mom (red instead of green like you). Outcome #4 is the same except reversed. Your sibling would have an identical gene from your mom (green) but a different gene from your dad (blue instead of purple).

There is a one out of four chance that the sibling would inherit a different copy of both genes. This is what happens with outcome #2. This sibling got a red copy of your mom’s DNA, while you got a green copy. He also got a blue copy of your dad’s DNA, while you got a purple copy instead.

To summarize, at any given point in the genome, there is one way to be fully identical with your sibling, two ways to be half-identical, and one way to be non-identical.

Said another way, across the entire genome, you will share about ¼ of your DNA fully identical with your sibling. You will share about ½ of your DNA half-identical with your sibling, and about ¼ of your DNA will be non-identical compared to your sibling.

25% identical DNA (100% match – same from mom AND dad) = .25 * 100 = 25%

50% half-identical DNA (50% match – same from mom OR dad but not both) = .50 * 50 = 25%

25% not identical DNA (0% match – different from both mom and dad) = .25 * 0 = 0%

25% + 25% + 0% = 50%. Siblings share an average of 50% of their DNA.
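For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the same calculation. It lists the four equally likely outcomes from the drawing and averages how well each one matches "you". The color labels come from the example above; the function and variable names are my own, made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Labels follow the drawing: mom carries a red and a green copy,
# dad carries a blue and a purple copy.
MOM_COPIES = ("red", "green")
DAD_COPIES = ("blue", "purple")


def match_level(child_a, child_b):
    """Return 1, 1/2 or 0 for a fully, half or non-identical match."""
    same_from_mom = child_a[0] == child_b[0]
    same_from_dad = child_a[1] == child_b[1]
    return Fraction(same_from_mom + same_from_dad, 2)


you = ("green", "purple")  # outcome #3 in the example
outcomes = list(product(MOM_COPIES, DAD_COPIES))  # four equally likely siblings

# Average the match level over all four possible siblings.
expected_share = sum(match_level(you, sib) for sib in outcomes) / len(outcomes)
print(f"Expected sharing: {float(expected_share):.0%}")
```

This is only the average at a single spot in the genome. Real siblings vary around it because of how the DNA is shuffled before it is passed down.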

Okay, I’ll rephrase this: “Why do I only share 45% with my sister?”

50% is only an average percentage. This percentage can and does vary quite a bit. On the thread that I started on 23andMe about shared DNA percentages with relatives (23andMe members can read that thread here) the highest reported sibling percentage was 58.27% and the lowest was 40.99%. Sharing 45% with a sibling is lower than average, but it's nothing unusual.

"Will my siblings get the same 23andMe results?"

Not really.

Some aspects of your 23andMe results will be the same – for example, your maternal and paternal haplogroups.

Most of your results will not be the same as your sibling’s results, because, like we discussed earlier, you and your sibling each get a unique combination of DNA from your parents. Your health results will most likely be somewhat similar, but they will not be identical. Same with Relative Finder – your results will be similar. Because you share ~50% of your DNA with a sibling, you will share roughly half of your relatives in common with your sibling (but around half will be different for each person).

"Can one sibling have a closer DNA match to the parents than the other one?"

Sort of. It depends on what you mean by 'closer DNA match'. Every child gets a complement of 23 chromosomes from each parent -- this does not vary. However, a father passes along a Y sex chromosome to a son, and an X sex chromosome to a daughter. In terms of size, the X chromosome (left) is a lot bigger than the Y chromosome (right). You could say, based on this size difference, that a father gives more DNA to his daughter compared to his son, and that therefore a daughter has a closer match to her father than her brother does. Some companies, such as 23andMe, will report that a man shares ~47% of DNA with his son and ~50% with his daughter. So in this respect, you could say that one sibling has a closer DNA match to the parent than the other sibling does.

However, something to keep in mind is that the parent-child relationship does not vary like other relationships do. For example, we already talked about the wide range of percentages that siblings can share (on my thread, they share anywhere from about 41% to 58%). Parents and children do not vary like this -- it's always 50% from mom, 50% from dad.

"Can DNA test if you are full sisters without the parents?"
"Can sisters determine if they are true sisters through DNA testing, even if both parents are dead?"

Yes, definitely.

Sibling relationships are very unambiguous, with or without the parents.

On 23andMe, sibling comparisons look like this. The percentage of DNA shared is approximately 50% and there are both fully-identical segments and half-identical segments (as well as some non-identical segments). No parents are needed to identify a sibling relationship on 23andMe.

For what it's worth, here's what a half-sibling comparison looks like.

For the most part, half-siblings share only half-identical segments -- no fully-identical segments. (An exception to this rule would be seen with maternal half-brothers on the X chromosome. Men have only one X chromosome, so any match between two men on the X would show as fully-identical.)

Also, here's what a comparison between two unrelated people looks like.
It's blank, with no shared segments.

You can see that these various outcomes are very distinct from one another. 23andMe can definitively determine whether two people are full siblings or not.
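The three patterns described above can be written down as a rough rule of thumb. This is only a sketch of the reasoning in this post, not how 23andMe actually classifies relationships. The function name and labels are hypothetical, and I deliberately use no numeric cut-offs. Keep in mind that a "half-identical only" pattern fits half-siblings but also other relatives who are related on just one side of the family.

```python
def describe_comparison(fully_identical_pct, half_identical_pct):
    """Name the pattern from this post that a comparison resembles.

    Both arguments are percentages of the genome, as read off a
    comparison view. No numeric cut-offs are applied here.
    """
    if fully_identical_pct > 0 and half_identical_pct > 0:
        return "full-sibling pattern"
    if half_identical_pct > 0:
        return "half-identical only"
    if fully_identical_pct == 0:
        return "no shared segments"
    return "fully-identical only"


# The theoretical sibling split from earlier in the post: 25% full, 50% half.
print(describe_comparison(25, 50))
# Two unrelated people.
print(describe_comparison(0, 0))
```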

I hope this helps answer some of your questions about siblings and their shared DNA! If you've got a question for me, leave a comment or email me at:

Monday, October 3, 2011

Utilizing Your Relatives on 23andMe -- Health Info

Have you ever been curious about which side of the family a particular trait comes from? Ever wondered what traits you and your relatives share in common?

Your 23andMe info can help you find the answers to these sorts of questions.

I will show you how this is done. The first thing you will want to do is download my spreadsheet -- it can be found here. It contains all the info about the location in the genome of the traits and conditions that 23andMe tests for.

The first half of the spreadsheet is in alphabetical order, so that you can quickly find a certain condition that you may be interested in. The second half of the spreadsheet (starting at row 630) is in order by location, so that you can see which traits and conditions fall along a particular segment that you are interested in. After going to the link, you can download the file and save it as you wish so that you can manipulate it to your own preferences.

So how can you use this information?


Let's say you are curious about which side of the family you might have inherited a certain gene from. As an example, I'll choose APOE, the gene that 23andMe tests for regarding Alzheimer's disease.

My niece Mikayla and her great-grandmother Pearl have both been tested on 23andMe. As you can see, they both have an APOE3 allele -- the question I want to answer is, 'Did Mikayla inherit that APOE3 allele from Pearl?'

(all of the images in this post can be enlarged by clicking on them)

The first thing that I need to do is check the spreadsheet, so that I know where the APOE gene is located.

The gene is on chromosome 19, at approximately location 50,000,000.

The next thing I need to know is whether Mikayla shares a segment at that location with Pearl, her maternal great-grandmother. There are several places I can go to find this information -- Family Inheritance or Family Inheritance: Advanced. I'll choose Family Inheritance because it has a feature that allows you to check for a specific gene -- I can enter APOE in here and see what the exact segment looks like:

This comparison shows that Mikayla and Pearl DO share a segment at the location where the APOE gene is. This means that Mikayla DID inherit her APOE3 allele from her maternal great-grandmother Pearl (we can also conclude that she inherited her other APOE allele, the APOE2 allele, from her father).
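If you have your shared segments written down as start and end positions, the same check can be sketched in a few lines of Python. The APOE location is the one from my spreadsheet (chromosome 19, roughly 50,000,000). The segment boundaries below are invented for illustration and are not Mikayla and Pearl's real segments, and the `Segment` type and `covers` function are names I made up.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int  # base-pair position where the shared segment begins
    end: int    # base-pair position where it ends


def covers(segment, chromosome, position):
    """True if the shared segment spans the given location."""
    return segment.chromosome == chromosome and segment.start <= position <= segment.end


# APOE location as given in the post: chromosome 19, roughly 50,000,000.
APOE_CHROMOSOME, APOE_POSITION = 19, 50_000_000

# Hypothetical segment boundaries, for illustration only.
shared_segments = [
    Segment(19, 45_000_000, 55_000_000),
    Segment(3, 10_000_000, 20_000_000),
]

shares_apoe = any(covers(s, APOE_CHROMOSOME, APOE_POSITION) for s in shared_segments)
print(shares_apoe)
```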

I should add that this technique is most useful when you do not have the exact same genotype as your relative. For example, here's what my paternal first cousin and I have for the eye color gene:

We do share a segment at the location where the eye color gene is. But, no real conclusion can be made -- all this tells me is that I inherited either an A or a G at this gene from my dad, which I already knew (everyone gets one allele from their mom and one from their dad).

Your mileage may vary, depending on what question you are asking and what relatives you have available for comparisons, but in general, it's best to use a relative that you only have one allele in common with (for example, in my Alzheimer's example, Mikayla is APOE2/APOE3 and Pearl is APOE3/APOE4. They only have the APOE3 allele in common).


Let's say you want to take a different approach -- rather than having one specific trait in mind that you want to look at, you want to know which traits or conditions fall along a particular segment that you share with your family member.

First, you need to know which segments of DNA you and your relative share. Here is my comparison with my paternal half-sister. I'll choose chromosome two as an easy example to look at, because my half-sister and I happen to share the full length of our dad's chromosome two with one another.

The next step is to go to the spreadsheet and look and see what traits are shared at this location.

For all of these genes, we inherited the same allele (same version of the gene) from our dad. So what does this tell us? Let's look at one of these traits -- I'll choose lactose intolerance.

Let's take a look at what we see. I am AG at this gene, meaning I have one 'lactose tolerant' allele and one 'lactose intolerant' allele. My half-sister is AA, meaning that she has two 'lactose tolerant' alleles. Because I know we inherited the same allele from our dad, and we have only one allele in common, I know that we inherited our A allele from our dad. Hopefully my reasoning is clear.

I can draw several conclusions from this data. One I already mentioned -- my sister and I inherited an 'A' lactose tolerant allele from our dad. Also, because it takes only one lactose tolerant allele to be lactose tolerant, I know that my dad was lactose tolerant.
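The reasoning in the lactose example boils down to "which alleles do the two genotypes have in common?" Here is a minimal sketch of that step, with function names of my own choosing. It only makes sense when you already know the two people share a half-identical segment at that gene. If more than one distinct allele is shared, it returns `None`, because then no conclusion can be drawn (as with the eye color example, assuming both people there are AG).

```python
def shared_alleles(genotype_a, genotype_b):
    """Distinct alleles that appear in both genotypes."""
    return sorted(set(genotype_a) & set(genotype_b))


def inherited_allele(genotype_a, genotype_b):
    """The single allele in common, or None if the answer is ambiguous."""
    common = shared_alleles(genotype_a, genotype_b)
    return common[0] if len(common) == 1 else None


# Lactose example: I am AG, my half-sister is AA.
print(inherited_allele("AG", "AA"))

# Alzheimer's example: Mikayla is APOE2/APOE3, Pearl is APOE3/APOE4.
print(inherited_allele(("APOE2", "APOE3"), ("APOE3", "APOE4")))

# Identical heterozygous genotypes: no conclusion possible.
print(inherited_allele("AG", "AG"))
```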


I'll go through one more example, this time with a more distant relative -- my aunt's second cousin.

Here is my aunt's comparison with her second cousin.

I highlighted the segment I want to focus on -- the shared segment at the end of chromosome nine. I want to find out which traits or conditions they share from this segment.

So my next step is to go to the spreadsheet:

Here is a list of traits that they inherited from that chromosome 9 segment. I'll choose blood type as the trait I want to focus on. Let's take a closer look at what their genotypes are for ABO blood type.

So what does this mean? The fact that they have a shared segment at the ABO blood type gene means that at that gene, they inherited a common allele from their common ancestors. They have only one allele in common -- the A allele.

Here's their family tree:

From the DNA evidence, we know that Jason and Dorotha Woodward had at least one A allele between them. One of them (no way to tell whether it was Jason or Dorotha) passed this same A allele to each son, William and Hermon Woodward. William passed the A allele to his son Bruce, who also passed it on to his son. Bruce's son has two A alleles -- he got one from Bruce and one from his mother.

In turn, Hermon passed on his A allele to his daughter Audra Woodward, who then passed it on to Aletha. We can conclude that Aletha got her A allele from her mother and her O allele from her father.


So, to summarize:

The first step is to decide what question you want to ask.


If you have a specific trait in mind (like with the Alzheimer's example) and you want to know which side of the family you inherited a certain allele from, here's what you want to do:

Step One: Go to the spreadsheet and figure out the relevant information. You need to find out where (which chromosome and what position) that trait is located.

Step Two: Go back to 23andMe and see if you've got a relative that will help you. Most likely, you are going to want a relative that (1) has a different genotype than you -- only one allele in common, (2) shares a segment with you at the relevant location and (3) is related to you on only one side of your family.

Step Three: If you have a relative that fits the criteria (also, keep in mind that the criteria will vary based on what type of questions you are asking) -- you can draw a conclusion about the trait you selected.


On the other hand, if you don't have a specific trait in mind, but you want to look at a specific shared segment and then go from there to figure out which traits were inherited via that segment (like with the second cousin blood type example), here's what you want to do:

Step One: Go to the spreadsheet and figure out the relevant information. You will probably want to go to the bottom half of the spreadsheet and find the segment, so that you can see what traits fall along that segment.

Step Two: Take a look at the genotypes of you and your relative for those traits. Because you have a shared segment with your relative, you know that you and your relative must have at least one allele in common from your shared ancestor for all the relevant traits. The exact conclusions you can make will vary based on your relationship with your relative.
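If you would rather not scroll through the bottom half of the spreadsheet by hand, the lookup in Step One of this second approach can be sketched in Python. The row layout here (trait, chromosome, position) is my own simplification and not the spreadsheet's actual columns. Only the APOE row uses values from this post; the other rows and the segment boundaries are made-up placeholders.

```python
# Each row: (trait, chromosome, position). Only the APOE row uses values
# from the post; the other rows are made-up placeholders.
TRAIT_ROWS = [
    ("Alzheimer's disease (APOE)", 19, 50_000_000),
    ("placeholder trait A", 19, 12_000_000),
    ("placeholder trait B", 9, 130_000_000),
]


def traits_on_segment(rows, chromosome, start, end):
    """Traits whose position lies inside the segment, ordered by position."""
    hits = [row for row in rows if row[1] == chromosome and start <= row[2] <= end]
    return [trait for trait, _, _ in sorted(hits, key=lambda row: row[2])]


# Hypothetical shared segment on chromosome 19.
print(traits_on_segment(TRAIT_ROWS, 19, 45_000_000, 55_000_000))
```

Once you have the list of traits, Step Two is the same genotype comparison as before, done for each trait on the list.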


I think that about covers it!

If you have a question or want me to investigate a specific scenario in your family, please comment on this post or drop me an email at:

Also, if you're going to write about this on your own website, I would love a link back to my website! Thank you!

Wednesday, August 3, 2011

My Close Relatives on 23andMe: Paternal First Cousin Once Removed

I recently got DNA results for my dad’s paternal first cousin. She is the daughter of my dad’s father’s brother, making her my first cousin once removed.

The percentages of DNA that she shares with our relatives are all over the map. I found these results fascinating because they show that recombination is random & incredibly hard to predict.

Leona, my dad’s first cousin, has two first cousins that have also tested their DNA, Dan and Aletha. The theoretical percentage shared by first cousins is 12.5%. Dan and Aletha both deviate from this percentage – Aletha shares more than expected, while Dan shares less. This kind of variation is pretty common and not unexpected.

Things start to get more extreme when I look at the comparisons between Leona and my three sisters and me.

One of my sisters shares more than twice as much DNA with Leona as my other sister.

Leona and my sisters and I are first cousins once removed, and are therefore expected to share only half the amount of DNA as first cousins (6.25%). However, Leona actually shares more DNA with one of my sisters than she does with her true first cousin. This is pretty odd – it just goes to show that recombination is often very hard to predict. 

The comparison between Leona and my sister and her two daughters further demonstrates this fact. My sister and Leona are first cousins once removed – this relationship has an expected theoretical percentage of 6.25%. My sister’s children and Leona are first cousins twice removed – this relationship has an expected theoretical percentage of 3.125%.

My sister has a slightly lower than expected percentage of shared DNA with Leona – 5.28% as compared to the expected 6.25%. A mother passes on 50% of her DNA to her children, so it would be fair to assume that the children should share about 2.64% with Leona. This is not what we see, however. By random chance, one of her daughters happened to inherit much more than 50% of the shared DNA, while the other daughter happened to inherit much less.

Even though her mother shares less than expected with Leona, daughter number one inherited a large percentage of Leona's shared DNA from her mother (about 60% rather than 50%). As a result, her percentage with Leona is almost identical to the theoretical percentage for first cousins twice removed – 3.10% compared to the expected 3.125%.

In contrast, daughter number two inherited hardly any of her mother’s shared DNA with Leona (about 20% rather than 50%). As a result she shares only about one third of the expected percentage – 1.16% compared to the expected 3.125%. 23andMe predicts that they are third to fourth cousins because they share such a small percentage of DNA.
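Here is a small Python sketch of the arithmetic behind these comparisons, using only the percentages quoted in this post. The theoretical percentage is halved for each generation "removed", and each daughter's share is divided by her mother's share to see how much of the shared DNA she inherited. The function and variable names are made up for illustration.

```python
FIRST_COUSIN_PCT = 12.5  # theoretical first-cousin percentage from the post


def expected_pct(times_removed):
    """Theoretical sharing for a first cousin removed the given number of times."""
    return FIRST_COUSIN_PCT / 2 ** times_removed


mother_pct = 5.28  # my sister's observed percentage with Leona
daughter_pcts = {"daughter one": 3.10, "daughter two": 1.16}

for name, pct in daughter_pcts.items():
    # All of a daughter's shared DNA with Leona came through her mother.
    inherited = pct / mother_pct
    print(
        f"{name}: {pct}% with Leona, {inherited:.0%} of her mother's shared DNA "
        f"(expected {expected_pct(2)}%)"
    )
```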

I have really enjoyed looking at our comparisons -- I think it is very interesting the way it can vary so much! Please comment if you have any questions or need any clarification.

Sunday, May 29, 2011

New v3 Results from 23andMe

I recently got my v3 (version 3) results from 23andMe. It is the latest technology -- there are now 1 million SNPs tested versus about 500,000 on the older version. There are a couple of new health results (namely Alzheimer's & the APOE gene) available on the new chip, and my comparisons with my relatives did change slightly, but overall things stayed about the same.

Here's how my v3 results compare with my v2 results.

I show up as an identical twin to myself, tee-hee. This speaks to the accuracy of the test provided by 23andMe -- when the test is repeated, you get the exact same results.

Here is an example of how a health result changed. Here is what my result for "Coronary Heart Disease" looks like on my v2 profile (the old results):

My v3 results show slightly different results:

They have changed because more SNPs are available.

The v2 results are on the left and the v3 results are on the right. It might be hard to see, but the little black circles on the v2 results say "NG" for "not genotyped." This means that not all of the SNPs used in the coronary heart disease report are available on the v2 platform (which makes sense because the v2 chip has fewer SNPs than the v3 chip). But all of the SNPs are available on the v3 chip, and so this refined information is why my results are a little different.

Let's move to the ancestry section now.

For most of my close relatives, my comparison has changed slightly.

Here is my DNA comparison with my sister (who was genotyped on v2) -- here is how she matches with both my "v2 self" and my "v3 self." The biggest change was that what appeared to be two segments on chromosome 10 has changed to one segment on my v3 profile. A couple of other segments have also changed, and overall, my percentage of DNA shared with my sister dropped by about one half of one percent. All segments that have changed are circled in red.

Here are my DNA comparisons with my two half-sisters (both genotyped on v2). In both cases, there have been some small changes. For both of my half-sisters, what appeared to be two segments (with a small break in between) on chromosome 13 has changed into one segment. The percentage of DNA that I share with my half-sisters decreased very slightly.

Here are my DNA results with my half-sister's two daughters (my half-nieces). They were genotyped on v3. In this case, the results barely changed at all. Two segments are ever-so-slightly different (by only a few SNPs), but overall the percentages of DNA that I share with them remain virtually unchanged.

Here are my DNA results with my paternal aunt and my paternal cousin. They are both on v2. I could find no discernible difference in the DNA shared with my paternal aunt, although the percentage of DNA shared did slightly change. The percentage of DNA shared with my cousin also changed very slightly, and one segment is a little bit different.

Here are my DNA results with my maternal uncle and my maternal aunt. My maternal uncle was genotyped on v3 while my aunt is on v2. My results with my maternal uncle did not change at all, while my results with my maternal aunt changed slightly on one segment.

Here are my DNA results for my maternal great-uncle and my maternal great-aunt. They are both on v2. I could find no discernible differences in the segments, although the percentage of DNA I share with my great-uncle did change slightly.

Overall, I saw very few changes.

The most dramatic change was with my sister -- I suspect perhaps this is because I share both fully-identical and half-identical segments with her (I am related to all of my other relatives on only one side of my family, so I share only half-identical segments with them). Not sure, though -- it could just be a coincidence.

My more distant relatives from Relative Finder stayed the same for the most part, but there were some slight changes. I gained two relatives that weren't there on the v2 chip (as far as I can tell, I did not lose any). Of my existing 203 relatives that I am sharing with, the percentages stayed the same with only a few exceptions. A few segments changed only very slightly, like this:

10 of my relatives either gained or lost .01%, and one lost .02%. So no dramatic differences here -- for 192 of my relatives the percentages stayed exactly the same.

I think that about summarizes my v2 profile versus my v3 profile. Overall, there are not a whole lot of differences except for the new Alzheimer's report and the APOE info. New health results are coming in every month though, so perhaps in the future there will be more new features on the v3 chip that are not seen on the v2 chip.

Saturday, May 7, 2011

Seattle Times Historical Archive

As a student at the University of Washington, I get free access to the Seattle Times Historical Archive. I am taking this opportunity to do some more research on my poor, neglected Gemmills. It's kind of ironic that I have spent the least amount of time researching the family that lived closest to me (from my apartment, I can see the cemetery where they were buried, and if I were feeling adventurous I could bike to the home where they lived). For whatever reason, though, I just don't get around to studying this line very often.

But today I have already come across some interesting things, even a picture of my great-grandmother Jessica McCarthy (a very grainy picture, but a picture nonetheless!)

I also found their wedding announcement:

I also found a pretty interesting article about their son (my grandfather) George Gemmill.

This article is pretty funny to me, because although I never met my grandfather, I have always heard about how he was a very austere person who loved rules. For example, he would punish his kids for cutting butter off the wrong end of the cube! I like this article because it shows that he actually DID "live on the wild side" at least once in his life.

So that's what I have been up to lately. And if anyone needs a lookup in the Seattle Times, let me know! :)