Are there any new sequencing technologies that came out after Illumina? How do they work? How do they compare to Illumina sequencing?
— A curious adult
August 6, 2019
Several other sequencing platforms have come out since next-gen sequencing, and each has its own advantages. Here I’ll mainly focus on methods developed by Oxford Nanopore Technologies and Pacific Biosciences, which are often referred to as third-generation sequencing.
Basics of DNA sequencing
Most sequencing technologies are based on a natural process called DNA replication, where DNA copies itself.
During DNA replication, DNA unzips its double-helix into 2 single strands. Then new nucleotides are added one by one, zipping those single strands back into two complete DNA helixes. This cycle repeats until the entire DNA is copied.
How can this process help DNA sequencing?
If we could see each new nucleotide as it’s added, we would know the sequence of that DNA.
This is a bit tricky. DNA replication normally happens inside a cell. And it’s really, really fast.
All sorts of sequencing techniques have been developed to get around these limitations. If you want to read about two common methods, check out this post on Sanger sequencing and next-gen sequencing.
Sanger sequencing and next-gen sequencing are both great methods, and they’re still commonly used today. But they have some limitations:
- Sanger sequencing can only look at 1 piece of DNA at a time, and can read about 1000 letters. That makes it expensive if you want to look at an entire genome!
- Next-gen sequencing (Illumina) can read millions of DNA pieces at the same time, but only for 200 letters each. That can make it hard to assemble all those short pieces into a complete picture.
Third gen sequencing: Long reads
Third-generation sequencing is all about DNA read length.
In next-gen sequencing, DNA is broken into short pieces, amplified, and then sequenced. Third-generation technologies do not need to chop the DNA into short pieces or amplify it: they directly sequence a single DNA molecule.
Why do we want to sequence long, single molecules of DNA?
Long reads contain more information compared to short reads. They’re useful in a variety of situations, such as genome assembly and detecting rare variants.
For example, the human genome has a lot of complex regions. This includes repetitive regions, where a few bases might be repeated thousands of times! These repetitive regions are hard to interpret. Does the repetitive bit last for 200 letters? 2000? It’s hard to tell if all your reads are only 200 letters long. But if you had a long read, that included the entire repetitive bit, you could tell where the repeats start and end.
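To make the repeat problem concrete, here is a toy Python sketch. Everything in it is made up for illustration: the flanking sequences, the "CAG" repeat, and the read lengths (8 and 70 letters stand in for short and long reads). It builds two pretend genomes that differ only in how many times the repeat occurs, then compares the reads each one would produce.

```python
def make_genome(repeat_count, unit="CAG",
                left="TTGACCGTAATCGGATACCT", right="GGCTTACGATAAGTCCGTTA"):
    """Build a toy genome: unique flank + repeated unit + unique flank."""
    return left + unit * repeat_count + right


def reads_of_length(genome, read_len):
    """Return the set of every distinct read of the given length."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


short_repeat = make_genome(10)  # repeat region is 30 letters long
long_repeat = make_genome(20)   # repeat region is 60 letters long

# Short reads: both genomes give exactly the same set of distinct reads,
# so the read sequences alone cannot tell us how long the repeat is.
# (This sketch ignores how many times each read shows up.)
same_with_short_reads = (
    reads_of_length(short_repeat, 8) == reads_of_length(long_repeat, 8)
)

# Long reads: a 70-letter read can span the whole repeat plus some of
# both flanks, so the two genomes now produce different reads.
same_with_long_reads = (
    reads_of_length(short_repeat, 70) == reads_of_length(long_repeat, 70)
)

print(same_with_short_reads)  # True
print(same_with_long_reads)   # False
```

Real genomes and real assemblers are far more complicated, but the idea is the same: a read has to reach the unique sequence on both sides of a repeat before it can tell you how long that repeat is.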
Long reads can also be better at detecting certain types of variants. For example, it’s hard for short reads to detect large DNA insertions. If you only have 200 letters of a new bit of DNA, you might not know where in the genome it belongs! You’re missing some important context. But if you have a long read, you can see the neighboring sequences too. This tells you where the insertion is located.
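Here is a small Python sketch of that idea. The reference sequence, the inserted DNA, and the `locate_insertion` helper are all invented for this example, and it assumes a clean insertion with error-free reads. It simply checks whether the two ends of a read can be found next to each other in the reference.

```python
reference = "ATGCCGTAAGCTTGACCATGGTCAAGTC"
insertion = "TTTTGGGG"  # new DNA that is not in the reference
site = 14               # where the new DNA lands in our pretend sample
sample = reference[:site] + insertion + reference[site:]

# A short read that falls entirely inside the new DNA.
short_read = sample[site:site + len(insertion)]

# A long read that covers the new DNA plus 8 letters on each side.
long_read = sample[site - 8:site + len(insertion) + 8]


def locate_insertion(read, reference, flank=8):
    """Return the reference position of an insertion, or None if unknown.

    Looks up the first and last `flank` letters of the read in the
    reference. If both are found and sit right next to each other,
    the extra DNA in the middle of the read was inserted between them.
    """
    left_pos = reference.find(read[:flank])
    right_pos = reference.find(read[-flank:])
    if left_pos == -1 or right_pos == -1:
        return None  # the read has no recognizable neighbors
    if right_pos != left_pos + flank:
        return None  # neighbors are not adjacent in the reference
    return right_pos


print(locate_insertion(short_read, reference))  # None
print(locate_insertion(long_read, reference))   # 14
```

The short read contains only unfamiliar DNA, so there is nothing to anchor it. The long read carries its own context on both sides.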
Sanger sequencing is good at sequencing long reads, but it’s not high-throughput. Next-gen sequencing is high-throughput, but cannot sequence long reads. What if we want to sequence long reads with a high-throughput platform?
Boom! Here comes third-generation sequencing.
Pacific Biosciences (PacBio)
The Sequel system was released in 2015 by Pacific Biosciences, a company based in Menlo Park.
The platform uses dye-labeled nucleotides and a flow cell. The flow cell looks like a small locket, and has thousands of tiny wells on it. Because each well holds its own piece of DNA, the flow cell allows thousands of DNA molecules to be sequenced at the same time.
Every well has a polymerase molecule attached inside. The polymerase does its regular job: filling in nucleotides along a single-stranded piece of DNA. Each time it adds a base, a camera takes a picture. And since each base is labeled with a different color, we can tell which letter was added.
However, the polymerase works super fast. It’s hard to see what is added, which creates errors. The errors can be reduced by reading the same sequence several times and combining the results.
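One simple way to picture how repeated reads cancel out errors is a majority vote at each position. The Python sketch below is only an illustration, not how PacBio's software actually works: the reads are invented, and it assumes every read has the same length and that errors only swap one letter for another. Real reads also gain and lose letters, so real tools have to line the reads up first.

```python
from collections import Counter


def consensus(reads):
    """Majority vote at each position across equal-length reads."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("this sketch assumes all reads have the same length")
    # zip(*reads) walks through the reads one column (position) at a time.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Five pretend passes over the same piece of DNA.
passes = [
    "ACGTTGCA",
    "ACGATGCA",  # wrong letter at position 3
    "ACGTTGCA",
    "TCGTTGCA",  # wrong letter at position 0
    "ACGTTGCC",  # wrong letter at position 7
]

print(consensus(passes))  # ACGTTGCA
```

Each individual pass may contain a mistake, but because the mistakes land in different places, the other passes outvote them.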
Oxford Nanopore Technologies
The MinION was made commercially available in 2015 by Oxford Nanopore Technologies. Unlike the fridge-sized PacBio machine, the MinION is only the size of an iPod. This small size makes it very portable: scientists can take it with them to places that might not have sequencing centers.
The principle behind the MinION is different from PacBio’s. It does not use dyed nucleotides. Instead, the MinION is based on the fact that each nucleotide is a different size and has different electrical properties.
Nanopore also uses a flow cell to allow massively parallel sequencing. Instead of glass, their flow cell is made of an electrically resistant membrane. This membrane has thousands of tiny pores, each with a diameter of about one nanometer (hence the name!).
During sequencing, a steady voltage is applied across the membrane, which drives an electrical current through each nanopore. The current depends on how open the pore is. If something is in the pore, the current will change.
Single-stranded DNA goes through the pore, like thread through a needle. Because the size of each nucleotide is different, the size of the pore opening will be different for each nucleotide. And since the electrical current changes based on pore size, each nucleotide will have a unique electrical signature.
Because the DNA passes through the pore so fast, the readouts tend to have a high error rate. Fortunately, errors can be reduced by reading both strands through the pore.
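As a rough picture of how a current trace could be turned back into letters, here is a toy Python sketch. The current levels in `LEVELS` are made-up numbers, not real measurements, and matching each measurement to the closest level is a simplification: real nanopores sense several neighboring bases at once, and the signal is decoded with trained models.

```python
# Made-up current levels (in picoamps) for each base, for illustration only.
LEVELS = {"A": 50.0, "C": 60.0, "G": 70.0, "T": 80.0}


def call_base(current):
    """Pick the base whose expected current is closest to the measurement."""
    return min(LEVELS, key=lambda base: abs(LEVELS[base] - current))


def call_sequence(currents):
    """Turn a list of current measurements into a string of bases."""
    return "".join(call_base(current) for current in currents)


# One pretend measurement per base as the strand moves through the pore.
signal = [51.2, 68.9, 79.5, 61.0, 49.1]

print(call_sequence(signal))  # AGTCA
```

Notice that a noisy measurement halfway between two levels, such as 65 in this toy scale, could belong to either base. That kind of ambiguity is one way a fast, noisy signal turns into sequencing errors.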
Comparison between Illumina, PacBio, and Oxford Nanopore
Each technology has its own distinct characteristics. Depending on the application, one may be a better fit than the others. And it’s not uncommon to sequence DNA with two techniques, to combine the strengths of both!
By Allison Zhang, Stanford University