Nanopore sequencing can produce incredibly long reads well into the kilobase and even megabase range. This is especially useful for sequencing larger DNA molecules. Whole plasmids and large PCR fragments can be sequenced in one “reaction” without the need for primers or “primer-walking”. Sanger sequencing, by comparison, produces much smaller read lengths of ~700 to 1,000 bp.
The DNA molecule to be sequenced is first “tagmented”, i.e. fragmented and barcoded in a single step. The barcoded fragments are pooled and an adapter is ligated to the fragment ends. The prepared library is then loaded onto the flow cell, where the barcoded molecules are threaded through nanopores and disturb the electrical current passing through each pore. Those disturbances produce characteristic patterns that are recorded and converted to base calls. In the final step, the base-called reads are assembled into a consensus sequence.
- Each technology has its strengths and use-cases.
- At Eton, we consider these technologies to be complementary. Nanopore sequencing is useful for sequencing whole plasmids and large linear DNA fragments, but it may not be the best choice for sequencing small fragments, screening clone libraries, or sequencing templates containing homopolymeric regions. Sanger sequencing may be more useful in those cases.
- Sanger sequencing is a tried-and-true method and remains the only way to reliably sequence long homopolymers, e.g. templates containing poly-As.
- Many Eton customers now submit a single sample for both Sanger and Nanopore sequencing: Sanger to read through a poly-A region, and Nanopore to sequence the rest of the plasmid.
- Eton excels at DNA plasmid prep at large scale. Adding whole-plasmid Nanopore sequencing to our lab has further enhanced our ability to deliver quality preps. (Whole-plasmid Nanopore sequencing is now available as an add-on.)
- Many customers who order plasmid preps at the Maxi scale and above now request whole-plasmid Nanopore sequencing to fully validate their plasmid before they commit to months-long downstream work (e.g. transfections into mammalian cells).
- There is research showing that plasmids often have unseen mutations in their backbones or may even be contaminated by other plasmids. With whole-plasmid Nanopore sequencing, you can sequence and validate all the critical regions of your plasmid (e.g. CRISPR, gRNA, promoters, reporters, etc.) so you can proceed with confidence.
- Combine your Nanopore and Sanger sequencing data for the highest confidence.
Eton customers using our Nanopore sequencing service will receive, among other file types, a FASTQ file. FASTQ is a text-based file format commonly used to store sequences and their corresponding quality scores. Most sequence analysis programs (e.g. SnapGene, Geneious, etc.) can open FASTQ files. The FASTQ file contains the consensus sequence generated from the reads collected during the sequencing run for a given sample. Each base position has a corresponding quality or "Q" score representing the confidence of that base call, i.e. the higher the Q score, the higher the confidence.
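For readers who want to inspect the quality scores outside a viewer, the following is a minimal Python sketch of reading a FASTQ file. It assumes the common four-line record layout and Phred+33 quality encoding (Q = ASCII code minus 33); the record name `sample_consensus` and the sequence are made-up illustration data, not output from an actual run.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, q_scores) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:              # end of file
            break
        if not header.strip():      # skip blank lines between records
            continue
        sequence = handle.readline().strip()
        handle.readline()           # '+' separator line, ignored
        quality = handle.readline().strip()
        # Assumption: Phred+33 encoding, so Q = ASCII code - 33
        q_scores = [ord(ch) - 33 for ch in quality]
        yield header.strip()[1:], sequence, q_scores

# Hypothetical one-record file, held in memory for the example
example = io.StringIO("@sample_consensus\nGATTACA\n+\nIIII5I?\n")
name, sequence, q_scores = next(read_fastq(example))

for base, q in zip(sequence, q_scores):
    print(base, q)
```

In practice the same function could be pointed at a file opened with `open("your_file.fastq")`; established libraries such as Biopython offer more robust FASTQ parsing.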
- The Quality Score or Q score measures the probability that a base has been called incorrectly. A higher Q score indicates higher confidence in the base call. A lower Q score indicates lower confidence in a base call. Q scores are on a log scale:
- Q10 = 1 in 10 probability of incorrect base call = 90% base call accuracy.
- Q20 = 1 in 100 probability of incorrect base call = 99% base call accuracy.
- Q30 = 1 in 1000 probability of incorrect base call = 99.9% base call accuracy.
- Customers ordering whole-plasmid sequencing receive a “Clone Validation Report” summarizing quality information about their nanopore sequencing run, including the mean Q score calculated over the length of the assembled sequence. A Q score of 30 is a widely accepted benchmark for NGS, and Eton applies Q30 as the threshold for passing QC in our lab.
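The log-scale relationship above is the standard Phred definition, Q = -10 x log10(P), where P is the probability of an incorrect base call. The sketch below converts in both directions and shows two ways a mean Q score could be computed. How the mean in the Clone Validation Report is calculated is not stated here, so both averaging functions are illustrative assumptions; only the Q30 threshold comes from the text.

```python
import math

def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def q_score(p):
    """Phred score for an error probability p."""
    return -10 * math.log10(p)

def arithmetic_mean_q(scores):
    """Simple average of the Q scores (one possible definition of 'mean Q')."""
    return sum(scores) / len(scores)

def mean_q_from_error(scores):
    """Average the error probabilities, then convert back to a Q score.
    This definition penalizes low-quality positions more heavily."""
    mean_p = sum(error_probability(q) for q in scores) / len(scores)
    return q_score(mean_p)

def passes_qc(mean_q, threshold=30):
    """Apply the Q30 benchmark described above."""
    return mean_q >= threshold

for q in (10, 20, 30):
    p = error_probability(q)
    print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.1f}%")
```

The two averaging functions can disagree noticeably when a few positions have low scores, which is one reason to inspect per-base scores in regions of interest rather than relying on the mean alone.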
Eton provides a "Clone Validation Report" with information about your sequencing run. This report includes a read-length histogram for each sample, which shows read lengths (x-axis) versus the number of reads (y-axis). The distribution of reads for a clonally pure, high-quality plasmid should show a strong peak at approximately the total size of the plasmid. This main peak represents whole plasmids that were "tagmented", or cut an average of one time per molecule, and then passed through a pore as a single, contiguous length of DNA. To the left of the main peak, read-length histograms will often show smaller bars representing sheared plasmid DNA and/or small amounts of E. coli genomic DNA that may be present in the sample. If the histogram shows a strong secondary peak at a size other than the expected size, this may represent a contaminating plasmid, and the sample should be investigated further. Sometimes the histogram may show small-to-medium peaks at sizes corresponding to double or triple the expected plasmid size, e.g. a 5 kb plasmid may show a small-to-medium peak at 10 kb and again at 15 kb. These peaks may represent concatemers of the plasmid, i.e. a dimer and a trimer, respectively. Concatemers may result from plasmids containing repetitive or other unstable sequences. Customers may want to consider preparing their plasmids in recA- E. coli strains, e.g. Stbl3.
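The histogram interpretation described above can be approximated in code if you have a list of read lengths. The Python sketch below bins reads relative to the expected plasmid size; the 10% tolerance, the limit of three plasmid lengths, the category names, and the example read lengths are all assumptions chosen for illustration, not values used in the Clone Validation Report.

```python
from collections import Counter

def classify_reads(read_lengths, expected_bp, tolerance=0.1, max_multiple=3):
    """Count reads by how their length compares with the expected plasmid size.

    tolerance and max_multiple are illustrative assumptions.
    """
    counts = Counter()
    for length in read_lengths:
        ratio = length / expected_bp
        multiple = round(ratio)
        near_multiple = (
            1 <= multiple <= max_multiple
            and abs(ratio - multiple) <= tolerance * multiple
        )
        if near_multiple and multiple == 1:
            label = "monomer"             # main peak at the plasmid size
        elif near_multiple:
            label = f"{multiple}x concatemer"  # possible dimer or trimer
        elif ratio < 1 - tolerance:
            label = "shorter fragment"    # sheared plasmid or genomic DNA
        else:
            label = "unexpected size"     # worth investigating further
        counts[label] += 1
    return counts

# Hypothetical read lengths for a 5 kb plasmid
reads = [5000, 4900, 5100, 10100, 15200, 1200, 7500]
counts = classify_reads(reads, expected_bp=5000)
for label, n in counts.most_common():
    print(label, n)
```

A large share of reads in the "unexpected size" category would correspond to the strong secondary peak discussed above, though a short contaminating plasmid would land in "shorter fragment" here, so the histogram itself remains the better diagnostic.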
- Yes. With nanopore sequencing, it can sometimes be difficult to resolve homopolymeric regions (e.g. a poly-A sequence). Q scores may drop in these regions and indels may be called. For this reason, Eton customers may choose to request Sanger sequencing to fully resolve a poly-A or other homopolymer. By aligning their nanopore and Sanger data, a higher-confidence consensus can be generated. This is why, at Eton, we consider these technologies to be complements to each other, rather than a new technology replacing an older one.
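To decide where Sanger follow-up might be worthwhile, one option is to scan the consensus sequence for homopolymer runs. The Python sketch below reports runs at or above a chosen length; the cutoff of 8 bases is an arbitrary assumption for illustration, not an Eton recommendation, and the example sequence is made up.

```python
import re

def find_homopolymers(sequence, min_length=8):
    """Return (start, end, base, run_length) for each homopolymer run.

    start and end are 0-based, end-exclusive positions.
    min_length is an illustrative assumption.
    """
    sequence = sequence.upper()
    # One base followed by at least (min_length - 1) copies of itself
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (m.start(), m.end(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(sequence)
    ]

# Hypothetical sequence with a 12-base poly-A run and a short poly-T run
example = "GATC" + "A" * 12 + "CGTTTTG"
runs = find_homopolymers(example)
for start, end, base, length in runs:
    print(f"poly-{base} run of {length} bases at positions {start}-{end}")
```

The reported positions can then be compared with the per-base Q scores in the FASTQ file to see whether confidence actually drops in those regions.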