It seems like the metrics used to show genetic relatedness are contradictory. We read that siblings are approximately 50% alike; we also read that the average difference between any two humans is anywhere from .1% to .6% (implying 99%+ alike); and we also read that humans and chimpanzees share approximately 99% of their DNA. Can you explain what these various ratios are based upon?
— Curious George from VA
October 24, 2019
Genetic relatedness can be a confusing topic. That is mostly because there are several different ways to compare two people’s DNA.
The short answer is that the percentage depends on how much of the DNA you’re talking about and how specific you’re being about similarities!
Your genetic inheritance
When we hear that siblings are about 50% alike, it means that they share half of their physical DNA. Our DNA is organized into big chunks called chromosomes. Humans have 23 pairs of chromosomes. For simplicity, we’ll treat each pair as one chromosome made of two halves, which we’ll call chromatids.
Image from Wikimedia
Half of each of your chromosomes is from your mom and the other half is from your dad. So, you share 50% of your mom's DNA and 50% of your dad’s.
Your sibling will also have 50% of your mom’s DNA and 50% of your dad’s. What does this mean for how much DNA you and your sibling share?
Imagine your parents each have two different chromatids for one particular chromosome. For this example, we’ll identify them by color. Let’s say your mom has a blue chromatid and a green chromatid, while your dad has a yellow chromatid and a red chromatid.
There are four different combinations of chromatids you could get from your parents: blue and yellow, green and yellow, blue and red, or green and red. In a simple case, any of these would be equally likely.
Let’s imagine you got the blue chromatid and the yellow chromatid. Your sibling could also get any of those four combinations.
If they got green and red, neither of your chromatids would match. For this chromosome, you’d be a 0% match.
If they got green and yellow or blue and red, half of the chromatids would match. For this chromosome, you’d be a 50% match.
If your sibling also had a blue chromatid and a yellow chromatid, you would have identical copies of that chromosome. You’d be a 100% match!
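The three cases above can be listed out with a short Python sketch. The colors come from the example; the name `match_fraction` is made up for illustration, and recombination is ignored here.

```python
from itertools import product

mom = ["blue", "green"]   # mom's two chromatids for this chromosome
dad = ["yellow", "red"]   # dad's two chromatids for this chromosome

you = ("blue", "yellow")  # the combination you inherited in the example


def match_fraction(a, b):
    """Fraction of the two inherited chromatids that are the same in a and b."""
    same_from_mom = a[0] == b[0]
    same_from_dad = a[1] == b[1]
    return (same_from_mom + same_from_dad) / 2


# Every combination a sibling could get, each equally likely in this simple case
sibling_options = list(product(mom, dad))
matches = {sib: match_fraction(you, sib) for sib in sibling_options}

for sib, frac in matches.items():
    print(f"sibling got {sib[0]} + {sib[1]}: {frac:.0%} match")
```

One of the four combinations is a full match, two are half matches, and one is no match at all.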
Of course, it’s actually a bit more complicated than this. You don’t get a whole, unchanged chromatid from each parent. Instead, each parent’s two chromatids get a bit shuffled together before they’re passed on to you. This is called recombination.
So if you take this shuffling into account, four potential children might look like this:
Now imagine this across all 23 chromosomes! If we repeated this comparison over all of your chromosomes, we’d probably find a pretty even mix.
For a quarter of the DNA, you’d be a 100% match. These are places where you and your sibling got the same version of mom’s DNA, and the same version of dad’s DNA.
For about half of the DNA, you’d be a 50% match. You and your sibling got the same piece from mom, but different pieces from dad (or vice versa).
For the last quarter, you’d be a 0% match. You and your sibling got different DNA from mom, and different DNA from dad.
If you average this, you’d find that about 50% of your DNA matches that of your sibling!
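The averaging step can be written out as a weighted sum. This is only a sketch of the arithmetic, using the quarter, half, and quarter split described above; the variable names are made up for illustration.

```python
from fractions import Fraction

# (share of the genome, how well that share matches), taken from the text above
cases = [
    (Fraction(1, 4), Fraction(1)),     # same from mom and same from dad
    (Fraction(1, 2), Fraction(1, 2)),  # same from one parent only
    (Fraction(1, 4), Fraction(0)),     # different from both parents
]

# Weight each match level by how much of the genome it covers
expected_sharing = sum(share * match for share, match in cases)
print(f"average sibling match: {float(expected_sharing):.0%}")
```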
It might not be exactly 50%. You might have more similarities or more differences. But you would still probably be pretty close to 50% similar!
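To see why real siblings land near 50% rather than exactly on it, here is a toy simulation. It assumes the genome is passed down in 100 independent blocks of equal size, which is a made-up simplification (real recombination produces blocks of varying number and size), so treat the spread it prints as illustrative only.

```python
import random


def simulate_sibling_sharing(n_blocks, rng):
    """Average match between two siblings over n_blocks independent DNA blocks."""
    total = 0.0
    for _ in range(n_blocks):
        # For each parent, the sibling got the same copy as you with probability 1/2
        same_from_mom = rng.random() < 0.5
        same_from_dad = rng.random() < 0.5
        total += (same_from_mom + same_from_dad) / 2
    return total / n_blocks


rng = random.Random(2019)  # fixed seed so the run is repeatable
N_BLOCKS = 100             # assumed number of independently inherited blocks
results = [simulate_sibling_sharing(N_BLOCKS, rng) for _ in range(2000)]

mean_sharing = sum(results) / len(results)
print(f"mean: {mean_sharing:.1%}, lowest: {min(results):.1%}, highest: {max(results):.1%}")
```

Across many simulated sibling pairs the mean sits very close to 50%, while individual pairs fall a little above or below it.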
All humans have very similar DNA
When we compared chromosomes, we were looking at the big picture. We were asking whether each segment of DNA exactly matched. We wanted to know if you got the blue version or the green version from mom, and how that compared to what your sibling inherited.
But the green chromatid and the blue chromatid are probably not all that different!
The DNA of different people is very similar. Out of 3.2 billion DNA letters, only a small fraction, roughly one in a thousand, might be different.
If you and I compared our DNA by looking at the sequence of letters, we’d see that only a very small percentage of our DNA is different. This is where we get the number that humans are 99.9% similar.
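A tiny percentage of a very large genome is still a lot of letters. The sketch below just does that arithmetic with the two numbers from the text (3.2 billion letters, 99.9% similar); the variable names are made up for illustration.

```python
GENOME_LETTERS = 3_200_000_000  # 3.2 billion DNA letters, from the text above
DIFFERENT_PER_1000 = 1          # 0.1% different, i.e. 99.9% similar

# Integer arithmetic avoids floating-point rounding on such a large number
differing_letters = GENOME_LETTERS * DIFFERENT_PER_1000 // 1000
print(f"about {differing_letters:,} letters differ between two people")
```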
So even though you got 50% of your mom's DNA and 50% of your dad's, those two sets are almost exactly the same.
Multiple ways to compare sequences
Even when comparing DNA sequences, we might not always get the same percentage. Let’s use humans and chimpanzees as an example.
We could compare all the DNA in humans to all the DNA in chimpanzees. The easiest difference to look at is changes in individual letters of our DNA sequence. These are called single nucleotide polymorphisms (SNPs). When you do this comparison, the two species are about 99% identical [1-3].
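A letter-by-letter comparison like this can be sketched in a few lines of Python. The two 20-letter sequences are made up for illustration (they are not real human or chimpanzee DNA), and `percent_identity` is a hypothetical helper, not a function from any genomics library.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with the same letter in two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    same = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * same / len(seq_a)


# Made-up 20-letter sequences that differ at exactly one position
human_like = "ACGTTGCAAGCTTAGGCTAA"
chimp_like = "ACGTTGCAAGCTTAGACTAA"

print(f"{percent_identity(human_like, chimp_like):.0f}% identical")
```

Note that this only works when both sequences are the same length, which is exactly why single-letter changes are the easiest difference to count.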
But sometimes, whole chunks of DNA might be inserted, deleted, or inverted. These changes aren’t counted as differences if we only look at SNPs.
When scientists compared the human genome to those of other great apes, they found more than 600,000 insertions, deletions, and inversions. Of these, 17,789 insertions and deletions were specific to humans [3]. However, it’s harder to quantify these differences as a percent – how do you sum up “there” vs “not there”?*
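One way to see why “there” vs “not there” is hard to turn into a percent: the answer depends on how you count the missing letters. The sketch below uses two made-up aligned sequences, with "-" marking deleted letters, and two hypothetical counting rules. Neither rule is presented as the method used in the cited papers.

```python
def identity_ignoring_gaps(aln_a, aln_b):
    """Percent identity over columns where neither sequence has a gap."""
    columns = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    same = sum(a == b for a, b in columns)
    return 100 * same / len(columns)


def identity_counting_gaps(aln_a, aln_b):
    """Percent identity over all columns, treating every gap as a difference."""
    same = sum(a == b and a != "-" for a, b in zip(aln_a, aln_b))
    return 100 * same / len(aln_a)


# Made-up aligned sequences: one letter change plus a 4-letter deletion ("-")
aligned_a = "ACGTTGCAAGCTTAGGCTAA"
aligned_b = "ACGTAGCAAG----GGCTAA"

print(f"ignoring the deletion: {identity_ignoring_gaps(aligned_a, aligned_b):.1f}%")
print(f"counting the deletion: {identity_counting_gaps(aligned_a, aligned_b):.1f}%")
```

The same pair of sequences gives a noticeably lower percentage once the deleted letters are counted as differences.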
So far we’ve compared all the DNA in the genome. But sometimes, we only compare genes. Genes are the segments of DNA that act as instructions for making proteins.
Only about 1.2% of your DNA contains genes. And SNPs in genes are a bit less common than they are in the DNA between genes. Since changing the sequence of a gene can have a large effect, any sequence differences might be really important.
If you just compare SNPs within genes, humans and chimpanzees are probably greater than 99% identical. But those few SNPs are important. When humans and chimpanzees read their DNA to make proteins, the proteins are only exactly identical 29% of the time [1].
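It can seem odd that sequences more than 99% identical give exactly identical proteins so rarely. A rough way to see it: a protein only counts as identical if every single position matches, and proteins are long. The sketch below assumes positions change independently and uses made-up numbers for the per-position identity and the protein length; neither value comes from the cited papers.

```python
def chance_protein_identical(per_site_identity, protein_length):
    """Chance that every position matches, if positions vary independently."""
    return per_site_identity ** protein_length


# Assumed values for illustration only; neither number comes from the article
PER_SITE_IDENTITY = 0.997  # chance a single amino acid position is unchanged
PROTEIN_LENGTH = 400       # amino acids in a hypothetical protein

chance = chance_protein_identical(PER_SITE_IDENTITY, PROTEIN_LENGTH)
print(f"chance the whole protein is identical: {chance:.0%}")
```

Even a very high per-position match rate leaves a modest chance that a long protein comes out exactly the same.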
Comparing only genes becomes even more important if we are comparing to other species. For example, you might have heard that people share 50% of their DNA with bananas. This is only counting genes, since our non-gene regions are so different they can’t even be compared!
Scientists still don’t all agree about the best way to measure genetic relatedness. Sometimes we look only at genes. Sometimes we look at all our DNA but we only look at SNPs. Sometimes we look at all of the differences between our DNA sequences!
This is why the percentages don’t always seem to make sense. There are lots of different ways of comparing DNA. It’s always good to ask which one is being used!
By Alexa Wnorowski, Stanford University
* When the chimpanzee genome was first published in 2005, the authors did try to quantify this. They calculated that humans and chimpanzees are 96% identical when considering insertions and deletions [1]. More recent papers have not tried to do the same measurement, instead choosing to just count how many insertions/deletions exist. But this is why you’ll sometimes see “96%” floating around on the internet!
For more information about how evolutionary biologists examine relatedness between species, check out this one:
1. The Chimpanzee Sequencing and Analysis Consortium. "Initial sequence of the chimpanzee genome and comparison with the human genome." Nature (2005).
2. Varki, A., & Altheide, T. K. "Comparing the human and chimpanzee genomes: searching for needles in a haystack." Genome Research (2005).
3. Kronenberg, Z. N. et al. "High-resolution comparative analysis of great ape genomes." Science (2018).
Or for some more recent chimp/human comparisons, which make slightly different comparisons:
- McLean, C et al. "Human-specific loss of regulatory DNA and the evolution of human-specific traits." Nature.
- Kim, DS, & Hahn, Y. "Identification of human-specific transcript variants induced by DNA insertions in the human genome." Bioinformatics.