New generation of sequencing technologies can replace quantitative PCR in mainstream R&D
The most powerful advance in genomics of the last five years has taken place in DNA sequencing technology. Capillary and gel-based Sanger sequencing has been superseded by new massively parallel approaches that offer several orders of magnitude greater sequencing capacity, at >500-fold reduction in cost per base. Commercial next-generation sequencing (NGS) providers are heading towards the goal of a $1,000 human genome at a remarkable pace.
For the molecular biologist, this genomics technology has been a great boon. Transcriptome sequencing provides a snapshot of cellular nucleic acids with unparalleled depth and precision, DNA-encoded libraries of chemical targets enhance the power of selection experiments, and resequencing of target loci provides high-resolution maps of genomic variation within large cohorts of patients. The question for researchers has really become, "what else can be done with this powerful new tool?"
Fewer false positives, more reliable quantification and lower cost
The principal advantage offered by NGS technologies is single-base resolution of nucleic acid targets. If a nucleic acid sequence is returned, then the presence of this molecule within the sample is almost beyond doubt.
By comparison, identification of PCR products has previously been primitive, limited to the sizing of amplicons using DNA-binding dyes, and/or assumptions as to the specificity of PCR primer binding being sufficient to identify the correct product (only the 3' 6-8 nucleotides of a primer are absolutely required for polymerase recruitment and initiation).
In the case of TaqMan real-time PCR, the recruitment and displacement of a fluorescently conjugated oligonucleotide is employed as a surrogate measure of correct target amplification, but the existence of pseudogenes (for instance, there are an estimated 52 GAPDH and 15 ACTB processed pseudogenes in the human genome [1]) greatly complicates specific primer design. Ultimately, the only guarantee of amplicon identity is to sequence the PCR product.
False positives are target-dependent and variable between experimental conditions and reagents, in some scenarios as high as 1 percent, even for U.S. Food and Drug Administration-approved diagnostic tests. Most researchers do not know the false positive rates of primers used for everyday assays such as mouse strain genotyping, yet mistyping errors can cost many months of research time. Use of PCR end-points for clinical trials places the outcome of a significant expenditure upon PCR amplicons that may have been sequenced once in validation, but never again. The benefits of sequence-level resolution for absolute certainty are clear.
Specific quantification of nucleic acid species and detection of variants such as differentially spliced isoforms is most readily achieved by sequencing, with comparison to a calibration curve of known spike-in controls. A particular advantage of sequence resolution is the ability to generate highly specific control reagents that mimic the behavior of the endogenous target, yet can be distinguished for quantification. R² values for sequencing experiment replicates are usually superior to those for real-time PCR quantification [2], and the parallel survey of many amplicons generated from a target sample allows normalization of the variance that arises from the individual primer characteristics on which real-time PCR relies.
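As a rough illustration of quantification against a spike-in calibration curve, the sketch below fits a straight line to log-transformed read counts from controls of known copy number, then inverts the fit to estimate the copy number of an endogenous target. The function names and all numbers are invented for illustration; a real assay would use replicate measurements and check the fit quality before relying on it.

```python
import math


def fit_log_log(known_copies, read_counts):
    """Least-squares fit of log10(reads) = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in known_copies]
    ys = [math.log10(r) for r in read_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_copies(read_count, slope, intercept):
    """Invert the calibration line to estimate copies from an observed read count."""
    return 10 ** ((math.log10(read_count) - intercept) / slope)


# Hypothetical spike-in controls: known input copies and the reads they returned.
spike_copies = [100, 1_000, 10_000, 100_000]
spike_reads = [20, 200, 2_000, 20_000]

slope, intercept = fit_log_log(spike_copies, spike_reads)

# Hypothetical endogenous target that returned 500 reads in the same run.
estimate = estimate_copies(500, slope, intercept)
print(f"slope={slope:.3f}, estimated copies={estimate:.0f}")
```

Fitting on a log-log scale is one common choice because spike-in series usually span several orders of magnitude; other models may suit a given assay better.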
Single-nucleotide resolution enables multiplexed identification of thousands of individual nucleic acid molecules within a single sample. An important utility of this property is sample multiplexing: the ability to combine PCR amplicons from many experiments, provided each is labeled with a nucleic acid "barcode" that allows the data to be segregated back into individual samples post-sequencing. Combining many samples into a single run reduces costs. The resulting cost of sequencing (~$1 per sample) compares favorably with the price of SYBR Gold and other fluorescence-based detection systems, yet offers significantly improved data quality.
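The post-sequencing segregation step can be sketched in a few lines. The example below assumes that each read begins with a fixed-length barcode and that barcodes must match exactly; the barcode sequences, sample names and reads are hypothetical, and production demultiplexers typically also tolerate sequencing errors within the barcode.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table; real barcode sets and lengths vary by kit.
BARCODES = {
    "ACGT": "sample_A",
    "TGCA": "sample_B",
}
BARCODE_LEN = 4


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group reads by their leading barcode and strip the barcode from each read."""
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose barcode is not in the table are set aside, not discarded silently.
        sample = barcodes.get(tag, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Invented pooled reads from a single run.
reads = ["ACGTGGATTC", "TGCAGGATTC", "ACGTGGATTA", "NNNNGGATTC"]
result = demultiplex(reads)

for sample, inserts in sorted(result.items()):
    print(sample, len(inserts))
```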
Diversification in sequencing market enables lower-cost benchtop sequencers
To date, the sequencing market has been dominated by core facility-based, second-generation sequencing machines that offer generous volumes of data at a price point that limits regular use, except in the genomics space. This year has signaled the emergence of scaled-down second-generation technologies that offer reduced bandwidth sequencing using benchtop machines with lower per-run costs, and the promise of third-generation technologies from several companies that can reduce run time to under one hour and cost less than $100.
A role for sequencing in pharmaceutical R&D
NGS has raised the interest of both industry bench researchers and board members, as evidenced in part by the strategic agreement between Biomerieux and sequencing provider Knome. A key question, however, is, "what commercial advantage can sequencing experiments offer?"
From a genomics perspective, having sequenced the human genome, the next challenge is to understand the variation that underlies individual susceptibility to disease, which necessitates targeted resequencing of key genes and expression analysis of these genes in affected individuals. An attractive advantage of RNA sequencing, for instance, is the ability to measure the level of SNP expression for many genes without a priori knowledge of the SNPs present in the sample. Sequencing can provide quantitative snapshots of the whole transcriptome or, at lower cost, a focused analysis of selected target genes across many samples.
In the microbiology lab, ribosomal DNA sequencing has provided the opportunity to study the diversity of bacterial communities in unprecedented detail, yet ribosomal sequence divergence captures only a small fraction of the genomic diversity of bacteria, and of their responses to environmental stress and drug treatment. "Selective sequencing" approaches that capture many variable regions of the bacterial genome allow this additional diversity to be analyzed, and offer new insight into how microbes evade antibiotics and develop resistance. An advantage of NGS is the capacity to split a sample in two and measure both DNA variants and variation in RNA expression levels using the same sequencing assay.
The diversity of viral genomes presents a prime case for the use of sequencing approaches to catalogue the natural variation that illuminates viral properties and to identify the emergence of new mutations under selective pressure. Quantitative sequencing that enables researchers to track previously unknown sequence variants during viral infection demonstrates a particular advantage of NGS over primer-anchored real-time PCR approaches.
A concern for microbial sequencing experiments is how to deplete background-contaminating DNA such as human genomic DNA in stool or blood samples. Several approaches now exist to enable selective sequencing that enriches for target nucleic acids, ensuring that the majority of sequencing information corresponds to the target pathogen genome, rather than background nucleic acid. Such approaches will be critical for benchtop second- and third-generation sequencing platforms in which sequencing bandwidth has been reduced to lower the entry cost for small-scale users.
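A back-of-the-envelope calculation shows why enrichment matters most when sequencing bandwidth is limited. The sketch below estimates mean coverage of a pathogen genome as on-target bases divided by genome size; the run size, read length, genome size and on-target fractions are all invented for illustration and do not describe any particular platform or enrichment method.

```python
def mean_target_coverage(total_reads, on_target_fraction, read_length, genome_size):
    """Mean fold-coverage of the target genome: on-target bases / genome size."""
    if not 0.0 <= on_target_fraction <= 1.0:
        raise ValueError("on_target_fraction must be between 0 and 1")
    return total_reads * on_target_fraction * read_length / genome_size


# Hypothetical benchtop run and pathogen genome.
RUN_READS = 1_000_000      # reads returned by one run
READ_LENGTH = 100          # bases per read
GENOME_SIZE = 5_000_000    # bases in the target bacterial genome

# Without enrichment, assume only 1% of reads derive from the pathogen.
unenriched = mean_target_coverage(RUN_READS, 0.01, READ_LENGTH, GENOME_SIZE)

# With enrichment, assume 80% of reads derive from the pathogen.
enriched = mean_target_coverage(RUN_READS, 0.80, READ_LENGTH, GENOME_SIZE)

print(f"unenriched: {unenriched:.1f}x, enriched: {enriched:.1f}x")
```

Under these assumed numbers the same run moves from well under one-fold to many-fold coverage of the target, which is the difference between an uninformative run and a usable one.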
Integrating sequencing into existing experimental workflows
Another important consideration for investors in NGS approaches is, "how will the technology be used in the lab?"
The answer is multi-faceted and still evolving, but a number of options are available. In the case of current high-bandwidth sequencers, individual users may be less keen to buy or even rent machines given the pace of development of new technologies and the cost of training personnel to use them. Service providers and in-house core facilities predominate; in both models, sequencing is fully outsourced from the individual researcher's bench. Existing sample preparation protocols for NGS often require labor-intensive optimization, and outsourcing of sample prep may be preferable.
However, combining an existing highly optimized experiment with a new technology off-site or allowing outside researchers on-site brings additional risks and sources of experimental delay. There is therefore a demand for simple-to-use assay kits that are customized for—and easily integrate with—existing experimental methods, allowing samples to be processed in-house and sent out to the sequencing provider of choice.
Many of the drawbacks of the outsourced sequencing model are overcome by rapid-turnaround benchtop machines. These units are expected to initially demonstrate their worth following up observations from larger-scale second-generation sequencing experiments, in which sequence resolution is still required, but the target landscape has been narrowed. Establishment of these machines in the lab will be aided, however, by the development of assays customized for these platforms and accessible to molecular biologists without specialized training in NGS. In this case, there is a greater onus on assay manufacturers to generate data that is readily interpretable by those not familiar with high-complexity sequencing datasets. Bioinformatics software already offers some NGS-capable functionality, but the flexibility of the experimental applications of these platforms means that there are as yet no convincing one-stop solutions for sequencer-to-notebook data processing. In the short term, it may be that researchers look to assay providers to provide custom software tools that plug in to broader analysis platforms, offering a convenient division of labor as a solution to meet customer data delivery needs.
The critical determinant in uptake of NGS for mainstream pharmaceutical research will be utility: technologies that provide new and meaningful insight with the data they return. Sequencing service providers must therefore work hard to convince industry research groups that sequencing assays provide new insights that advance their understanding of product biology, create opportunities to discover new avenues of exploitable biology or make significant savings in their existing research programs.
If NGS continues to meet expectations as it has in the past five years, benchtop sequencers and real-time PCR machines will be competing for researchers' affections in the coming years.
Graeme Doran is chief scientific officer for Pathogenica Inc. in Cambridge, Mass., which provides sequencing services and diagnostic assays using NGS technology. Doran received his Ph.D. from Oxford University and postdoctoral training at the Massachusetts Institute of Technology.
1. Zhang, et al., 2003. "Millions of Years of Evolution Preserved: A Comprehensive Catalog of the Processed Pseudogenes in the Human Genome." Genome Res., 13: 2541-2558.
2. Git, et al., 2010. "Systematic comparison of microarray profiling, real-time PCR, and NGS technologies for measuring differential microRNA expression." RNA, 16: 991-1006.