Golden State Killer suspect tracked down through familial DNA

How police used DNA from a public ancestry database to track down a suspect

April 28, 2018
 
After more than 40 years, the Golden State Killer is finally behind bars. Joseph James DeAngelo is accused of 12 murders, over 50 rapes, and more than 100 burglaries. His capture is being credited to -- or blamed on -- his DNA. But police didn’t originally match the DNA at the crime scene directly to him. Decades after the crimes, they matched it to DNA that his family members had put in a public database.
 
This has led to all sorts of questions. How did police get his DNA? How did they find his family’s DNA? How could they tell they were related?

The killer’s DNA was collected from the crime scene

Everyone is constantly shedding DNA. DNA is in nearly every cell in the body, so leaving behind a hair, some skin cells, or saliva can be enough to match to a person.
 
Criminals often leave behind even more. A struggle may leave behind the attacker’s blood or skin under the victim’s nails. In the case of a rape, the attacker may leave semen. This DNA can be used like a fingerprint to match a suspect to a crime: if the suspect’s DNA was there, he must have been, too.
 
Police don’t normally look at all of the information in the DNA. The DNA has all the information for making an entire person. But the police just need enough to match the crime scene DNA to the suspect’s DNA.
 
Since the 1990s, police have been using the “Combined DNA Index System” (CODIS for short). CODIS looks at 20 different parts of the DNA where a short run of DNA letters is repeated a different number of times in different people. One person might have the same few letters repeated 5 times. Another person might have them repeated 15 times. The number of repeats doesn’t really matter for a person’s life, since these repeats don’t affect any of the person’s genes. But together they make a nearly unique DNA fingerprint. CODIS markers are akin to lowering the resolution on a picture: you won’t be able to tell exactly what the original picture looked like, but you would still be able to identify an exact match.
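To make the idea of "repeats" concrete, here is a minimal Python sketch that counts how many times a short motif is repeated back to back in a stretch of DNA. The motif `GATA`, the flanking letters, and the function name are made up for illustration; this is not how forensic labs actually measure repeats (they measure the length of the DNA fragment), just a way to picture what is being counted.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    best = 0
    for start in range(len(sequence)):
        run = 0
        position = start
        # Keep stepping forward one motif at a time while the motif repeats.
        while sequence.startswith(motif, position):
            run += 1
            position += len(motif)
        best = max(best, run)
    return best


# Two hypothetical people with the same marker but different repeat counts.
person_a = "CC" + "GATA" * 5 + "TT"
person_b = "CC" + "GATA" * 15 + "TT"

print(count_tandem_repeats(person_a, "GATA"))  # 5
print(count_tandem_repeats(person_b, "GATA"))  # 15
```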
 
Everyone gets one copy of their DNA from their dad and one from their mom. So every person ends up with two sets of repeats. For one CODIS marker, a person can have 5 repeats from mom and 15 repeats from dad. With two copies of 20 different markers that can have many different lengths, it’s really unlikely for two unrelated people to have exactly the same CODIS markers. Even siblings only share about 75% of them. So CODIS gives a pretty unique DNA fingerprint.
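A CODIS-style profile can be pictured as a table with one pair of repeat counts per marker. The sketch below compares such profiles in Python. The marker names and repeat counts are invented, and only three markers are shown instead of 20; the point is simply that the order within a pair does not matter, because a lab cannot tell which copy came from mom and which from dad.

```python
from typing import Dict, Tuple

# One profile: marker name -> the two repeat counts (one per parent).
Profile = Dict[str, Tuple[int, int]]


def normalize(profile: Profile) -> Profile:
    """Sort each pair so (5, 15) and (15, 5) are treated as the same."""
    return {marker: tuple(sorted(pair)) for marker, pair in profile.items()}


def profiles_match(a: Profile, b: Profile) -> bool:
    """True only if every marker has the same pair of repeat counts."""
    return normalize(a) == normalize(b)


def markers_in_common(a: Profile, b: Profile) -> int:
    """Count markers where both profiles have the same pair of repeat counts."""
    na, nb = normalize(a), normalize(b)
    return sum(1 for marker in na if marker in nb and na[marker] == nb[marker])


# Hypothetical data for illustration only.
crime_scene = {"marker_1": (5, 15), "marker_2": (9, 9), "marker_3": (12, 7)}
suspect = {"marker_1": (15, 5), "marker_2": (9, 9), "marker_3": (7, 12)}
stranger = {"marker_1": (5, 15), "marker_2": (8, 10), "marker_3": (11, 7)}

print(profiles_match(crime_scene, suspect))      # True
print(profiles_match(crime_scene, stranger))     # False
print(markers_in_common(crime_scene, stranger))  # 1
```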
 
But there was no match for the Golden State Killer’s DNA in the CODIS database. So police looked more closely at the DNA.
 
One way to get a few more pixels in your DNA picture is to look at single nucleotide polymorphisms (commonly called SNPs, pronounced “snips”). Most people’s DNA is pretty similar, which is why, in the grand scheme of things, most people are pretty similar. But everyone has a few letters different in the giant book that is our genome. A SNP is when a single letter (scientifically, a “nucleotide”) is swapped for another. Scientists don’t even need to look at every single SNP to get a better picture of someone’s DNA, although they usually look at hundreds of thousands of them.
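As a toy illustration of a SNP, the Python sketch below lines up two short DNA strings and reports every position where a single letter differs. The sequences and the function name are made up; real SNP testing reads known positions in the genome with a chip rather than comparing raw strings like this.

```python
from typing import List, Tuple


def find_snps(seq_a: str, seq_b: str) -> List[Tuple[int, str, str]]:
    """Return (position, letter_in_a, letter_in_b) for each single-letter difference."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch assumes both sequences are the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Two hypothetical people whose DNA differs by one letter.
person_a = "GATTACAGATTACA"
person_b = "GATTACAGCTTACA"

print(find_snps(person_a, person_b))  # [(8, 'A', 'C')]
```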
 
But this still leaves the same problem: if police didn’t have DeAngelo’s DNA, how could they match it to the Golden State Killer’s?
 
Family members can be identified because they share similar stretches of DNA
 
Everyone has 23 pairs of chromosomes (hence the company name, “23andMe”). A chromosome is just one really long, continuous stretch of DNA. In each pair, one chromosome comes from your mom and one from your dad. Your mom, in turn, got one chromosome from your grandma and one from your grandpa. But it’s not like your mom just gave you your grandpa’s chromosome.
 
Before parents pass on their DNA, it goes through a process called “crossing over”. Inside your mom, part of grandma’s chromosome swaps with part of grandpa’s chromosome. This means that the chromosome mom gave you has some of grandpa’s DNA and some of grandma’s. Which parts get swapped is more or less random. Your sibling will end up with some of the same and some different parts of your grandparents’ DNA.
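Crossing over can be pictured with a very simplified Python simulation. Here each grandparent's chromosome is a string of ten letters (`M` for grandma, `P` for grandpa), and the chromosome mom passes on switches from one to the other at a single random point. A single crossover point and a ten-letter chromosome are simplifying assumptions; real chromosomes can cross over more than once.

```python
import random


def cross_over(grandma_chrom: str, grandpa_chrom: str, rng: random.Random) -> str:
    """Build the chromosome mom passes on, switching grandparents at one random point."""
    if len(grandma_chrom) != len(grandpa_chrom):
        raise ValueError("this sketch assumes both chromosomes are the same length")
    # Pick a switch point strictly inside the chromosome.
    point = rng.randrange(1, len(grandma_chrom))
    # Flip a coin for which grandparent's DNA comes first.
    if rng.random() < 0.5:
        return grandma_chrom[:point] + grandpa_chrom[point:]
    return grandpa_chrom[:point] + grandma_chrom[point:]


rng = random.Random()
grandma = "M" * 10
grandpa = "P" * 10

# Two siblings get different mixes of the same two grandparents.
print(cross_over(grandma, grandpa, rng))
print(cross_over(grandma, grandpa, rng))
```

Running it a few times gives strings such as `MMMPPPPPPP` or `PPPPPPMMMM`: always one block from each grandparent, but with the boundary in a different place each time.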
 
"Crossing over" mixes up chromosomes
 
The important part here is that you end up with stretches of DNA from one grandparent or the other. All the SNPs on that part of the chromosome are linked together -- in a line on the “rope” of the chromosome. It’s pretty easy for a computer to find the stretches of SNPs that are the same between two people. You can even figure out how closely related two people are by measuring how long these stretches of matching SNPs are.
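Here is a minimal sketch of how a computer might look for matching stretches, assuming each person's SNPs are written as one string of letters in chromosome order. The data, the function name, and the `min_length` cutoff are all hypothetical. Real tools such as the ones genealogy sites use work with two letters per position and measure stretches in genetic distance rather than a simple count, so treat this as a picture of the idea and not of any actual service's method.

```python
from typing import List, Tuple


def shared_segments(snps_a: str, snps_b: str, min_length: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) ranges where the two SNP strings agree for at least `min_length` SNPs."""
    segments = []
    start = None  # index where the current matching run began, if any
    length = min(len(snps_a), len(snps_b))
    for i in range(length):
        if snps_a[i] == snps_b[i]:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i))
            start = None
    # Close a run that reaches the end of the data.
    if start is not None and length - start >= min_length:
        segments.append((start, length))
    return segments


# Hypothetical SNP strings for two people.
person_a = "AAGTCCGTAAGC"
person_b = "AAGTCTGTTAGC"

segments = shared_segments(person_a, person_b)
print(segments)                                       # [(0, 5), (9, 12)]
print(sum(end - start for start, end in segments))    # 8
```

Longer and more numerous shared stretches suggest a closer relationship; short matches can happen by chance, which is why the sketch ignores runs below the cutoff.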
 
On top of all that, only the very tips of the Y chromosome can cross over with the X. This means fathers give almost the same Y chromosome to their sons as their fathers gave to them. All men on the same paternal line (think: have the same last name) have nearly identical Y chromosomes.
 
Essentially, if the police could get some DNA from the family of the Golden State Killer, they could get a major lead.
 
People who are related share blocks of DNA
 

Police sketch of the Golden State Killer (from Wikimedia)

Police found family members’ DNA on a public genealogy website

The killer was ultimately done in by his family uploading their SNP data to a website called “GEDmatch”. The idea is pretty simple: if you know what SNPs you have, you can find relatives already in the database. Most people figure out what SNPs they have by doing a spit kit through a company like 23andMe or Ancestry. These services let you submit your own DNA and find family members within the service. But if your long lost cousin did 23andMe and you did Ancestry, you’d never find each other.
 
GEDmatch lets people who got SNP data through any company upload their information, and it will find family members for free. Anyone can upload anyone’s SNPs to look for matches.
 
The police had the Golden State Killer’s DNA. They just needed to upload it to the website, which is completely within the terms of service, and, ping, they found matches. They didn’t find an exact match to the killer, but they did find his family. They were then able to look for relatives that were about the right age and lived in the right places at the right time, leading them to DeAngelo. But they still hadn’t made a perfect match between DeAngelo and the Golden State Killer.
 
Police nailed DeAngelo by taking a new “discarded” sample from him
 
Police have been intentionally vague about how they got DeAngelo’s DNA. They have said that it came from “discarded DNA”. This could be anything from a cup or straw to a piece of pizza left in public. “Public” includes the trash can from his home that he left out on the corner. This DNA ended up being a perfect match to the Golden State Killer’s.
 
Although this solves the cold case, it brings to light even more questions. Is it okay to use these public databases to search for criminals? Is it okay to track these criminals through family members? Do people fully understand how their data might be used? Is it okay to take a suspect’s DNA without a warrant? Given that everyone is constantly shedding DNA, is it really discarded?
 
The short answer is that this is legal and not far from what has been done before. Police caught the “Grim Sleeper” after his son came up in a CODIS marker search. The DNA that confirmed his guilt came from a discarded pizza crust. Public databases have also been used to help solve cold cases: the DNA Doe Project has successfully identified “Jane Does,” notifying families of the deaths of missing loved ones.
 
Not all DNA trails lead to the true culprit. In 2015 police matched DNA from a murder to a public database, and identified potential relatives of the perpetrator. From there, they identified Michael Usry as a suspect. He was a close relative of a partial DNA match, in the right place at the right time, and made films on violent crimes. Police took him in for questioning and took a DNA sample. In this case, when they ran his DNA it did not match the sample from the crime scene, and he was exonerated.
 
Legal doesn’t necessarily mean ethical. Just as people are actively working to advance our ability to understand DNA, others are working on untangling the ethical, legal, and social implications of these new technologies.
 
By Nikki Teran, Stanford University

GEDmatch lets users find DNA relatives and analyze their own data

Congress requires the National Human Genome Research Institute (one of the primary government funding agencies for genomics research) to spend at least 5% of its money on researching the ethical, legal, and social implications of genetics (Image from Wikimedia)