


Data management that speeds—rather than impedes—advanced DNA sequencing applications
December 2011
by Bruce Pharr, GenoLogics

In the post-Human Genome Project era, molecular diagnostics and therapeutics, which promise to significantly improve patient outcomes by linking genetic diagnoses to targeted therapies, have been hailed as the next great advancement in human healthcare. Gene sequencing technology is widely acknowledged as the primary driver for these advances, powering both the research needed to understand and leverage individual genome maps and the clinical testing that will enable physicians to diagnose and treat disease based on a patient's genomic profile.
In this brave new world, it's hard to conceive that something as mundane as the management of sequencing data could determine which technologies succeed or fail. Yet today, data management is the primary bottleneck in sequencing workflows. In May 2010, a survey of laboratory directors conducted by J.P. Morgan cited data management as one of the biggest hurdles to expanding next-generation sequencing (NGS). And in January 2011, a survey by William Blair & Co. cited data management software support as a top priority in choosing an NGS platform, trailing only instrument throughput and reagent cost.
The ability to positively identify samples and maintain data integrity throughout the sequencing workflow will only become more vital as research labs aim to maximize throughput and extend research capabilities into clinical applications. If information management systems are to support the continued evolution of sequencing and its application to medical diagnostics and therapeutics, they must support the de-facto standards and best practices associated with using sequencers—the technology without which there would be no evolution.  
But sequencing isn't just about machines. Instruments couldn't run and work wouldn't get done without people. Informatics, in the form of laboratory information management software, can help improve lab efficiency, ensure that labs deliver better quality results to customers faster and enforce pertinent clinical and U.S. Food and Drug Administration (FDA) regulatory requirements such as CLIA, CAP/ISO 15189 and 21 CFR Part 11. But informatics can also inhibit lab staff by forcing them into unnatural or illogical ways of working. 
Informatics for modern sequencing labs must therefore interact effectively with three very different constituents: the instruments that perform the sequencing, the lab technicians who run the instruments and the lab directors who are ultimately responsible for the output and quality of the lab's work.
Let sequencers lead the way  
Sequencers are the core technology driving next-generation genomics research. The three major manufacturers of sequencing instrumentation (Illumina, Life Technologies and Roche) are all developing instruments capable of producing hundreds of gigabases of sequencing data per run. It's therefore imperative that data management software keep pace. Software can effectively integrate with sequencers in four ways.  
1)    Conform to the wet-lab protocols provided by a vendor. The continued raising of the throughput bar has led instrument manufacturers to standardize the wet-lab processes that will work best with their system into various sample-prep kits. Preconfigured informatics workflows that map to these vendor-specified procedures enable labs to thoroughly track sample-preparation activities.
2)    Ensure that samples are properly prepared to run on designated instruments. Instrumentation vendors have developed specific standards for the media onto which libraries are loaded for sequencing. Preconfigured informatics workflows can speed sample preparation by automating routine tasks such as tracking and loading concentrations, calculating dilution of libraries to normalized concentrations and tracking reagents.
3)    Demystify demultiplexing. Multiplexing, or library pooling, can increase sample throughput. But the technique is only useful if scientists can rapidly identify prepared libraries or samples that can be effectively pooled together and can track what happens to each sample in a pool before, during and after multiplexing. Preconfigured informatics workflows can track a range of information associated with pending samples, which technicians can search to build pooled libraries. They can also support the assignment of adapters, indexes or DNA barcodes to individual samples in a pool to keep them distinct and trackable when multiplexed.
4)    Run, monitor and track sequencing runs. Informatics can automate and track a variety of tasks to make sequencing more efficient, such as matching items sent to sequencers with samples in the data management systems; generating necessary files (such as run definition files and sample sheets) to communicate with the sequencer before and after sequencing; monitoring run status directly across multiple instruments; capturing key run parameter files and primary analysis metrics; and automating demultiplexing and conversion of raw data files from the sequencer into FASTQ format for analysis.
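To make points 2 through 4 concrete, here is a minimal Python sketch of three of the routine tasks described above: calculating a dilution to a normalized concentration, checking that every library in a pool carries a distinct index, and writing a simple sample sheet. It is illustrative only. The Library fields, the function names and the three-column CSV layout are assumptions made for this example; they are not any vendor's file format or any particular product's interface.

```python
from __future__ import annotations

import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Library:
    sample_id: str   # hypothetical sample identifier from the data management system
    index_seq: str   # index (barcode) sequence assigned to this library
    conc_nm: float   # measured stock concentration, in nM


def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float) -> tuple[float, float]:
    """Return (stock_ul, diluent_ul) from the relation C1 * V1 = C2 * V2."""
    if stock_nm <= 0 or target_nm <= 0 or final_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_nm > stock_nm:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_nm * final_ul / stock_nm
    return stock_ul, final_ul - stock_ul


def check_pool(libraries: list[Library]) -> None:
    """Raise ValueError if two libraries in the pool share an index."""
    owner_by_index: dict[str, str] = {}
    for lib in libraries:
        if lib.index_seq in owner_by_index:
            raise ValueError(
                f"index {lib.index_seq} is used by both "
                f"{owner_by_index[lib.index_seq]} and {lib.sample_id}"
            )
        owner_by_index[lib.index_seq] = lib.sample_id


def sample_sheet(libraries: list[Library], lane: int) -> str:
    """Render a pool as CSV text (assumed layout: lane, sample_id, index)."""
    check_pool(libraries)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["lane", "sample_id", "index"])
    for lib in libraries:
        writer.writerow([lane, lib.sample_id, lib.index_seq])
    return buf.getvalue()


if __name__ == "__main__":
    pool = [Library("S1", "ACGT", 10.0), Library("S2", "TGCA", 8.0)]
    for lib in pool:
        stock_ul, diluent_ul = dilution_volumes(lib.conc_nm, 2.0, 50.0)
        print(f"{lib.sample_id}: {stock_ul:.1f} uL stock + {diluent_ul:.1f} uL diluent")
    print(sample_sheet(pool, lane=1))
```

Running the index check before the sheet is written means a collision is caught while it can still be fixed at the bench, rather than after the run, when the affected reads could no longer be attributed to a single sample.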
Help lab techs work better and faster  
Laboratory technicians, who interact most closely with instruments on a daily basis, will appreciate tight integration between instrumentation and informatics—yet such integration isn't all that technicians require from modern sequencing data management systems. Sequencing work is fast paced and dynamic—labs can generate hundreds of gigabases a day, and workflows may change monthly to accommodate new protocols and instrumentation. In this environment, labs succeed by pushing the boundaries of innovation—and they cannot afford to be constrained in their vision by the software they implement to manage data and workflows.
Lab technicians are most interested in ways to optimize their personal and team efficiency while minimizing the amount of time they need to spend on routine, repetitive tasks. Most technicians want to spend as little time as possible recording information; instead of telling a system they've done something, they'd prefer the system anticipate the task and supply as much information as possible to complement work they plan to do.  
Technicians also need fast and easy ways to track their work and the work going on around them. Dashboard views offer an ideal way for technicians to review experiments in progress, guide samples effectively through complicated workflows and collect and organize samples into multiplexed experiments to achieve greater efficiency. No matter how the interface is designed, technicians require uncluttered and streamlined access to only the information they need to initiate experiments, find samples on which to work, monitor work in progress and stay informed about other work occurring in their labs.
Empower lab directors to improve lab efficiency   
The overall operation and administration of labs fall to lab directors. Lab technicians need ways to streamline the day-to-day tasks associated with preparing and managing samples and running projects. Lab directors, by contrast, require high-level views that they can use to track lab progress and verify that work is occurring and being recorded promptly, accurately, proficiently and in compliance with applicable regulatory requirements for clinical research and biopharmaceutical applications.
Most lab directors rightly put their primary emphasis on delivering high-quality results to clients and collaborators quickly. Data management software can centralize up-to-date information on runs so that lab directors can compare sequencing performance and trend accumulated data over time. When data from multiple runs are archived and searchable, labs can make better, more informed decisions about which samples to rework, whether to request more samples for further experimentation or how much time to spend on further analysis.  
Regulations will become more of a factor for labs that undertake clinical applications for sequencing. Three regulatory requirements potentially impact clinical genomics labs in the United States:
  • CLIA: Codified in the Code of Federal Regulations (CFR) at Title 42, Public Health, Part 493, Laboratory Requirements, the Clinical Laboratory Improvement Amendments (CLIA) regulate all laboratory testing (except research) performed on humans in the United States to ensure the accuracy, reliability and timeliness of patient test results, regardless of where the test was performed. Practically, data management software can support CLIA compliance by positively identifying and maintaining the integrity of samples from the time of receipt through the completion of testing and reporting of results.
  • CAP/ISO 15189: The College of American Pathologists (CAP) offers a lab accreditation program based on the International Organization for Standardization (ISO) 15189:2007, which uses specific lab accreditation criteria, procedures and processes to determine laboratory technical competence. Both programs focus on the continuum of care directly connected with improved patient safety and risk reduction, and both outline standards for quality and competence that medical laboratories can use in developing their quality management systems and assessing their own competence. To support CAP/ISO 15189, software must control the authorization and authentication of personnel who access sample and test data, and it must protect the integrity of that data (including its creation, modification, maintenance and transmission). It must also maintain audit trails that enable labs to identify individuals who have entered or modified data, files or programs and that document modifications in a time-sequenced, trackable manner.
  • 21 CFR Part 11: Codified in the Code of Federal Regulations (CFR) at Title 21, Food and Drugs, Part 11, Electronic Records; Electronic Signatures (ERES), these FDA regulations provide criteria under which the FDA will, in certain circumstances, accept electronic records, electronic signatures and handwritten signatures executed to electronic records as equivalent to paper records and handwritten signatures executed on paper. While 21 CFR Part 11 applies only to FDA-regulated processes and submissions (and not, consequently, to clinical work), many organizations have adopted the regulations as a de facto standard for managing any electronic records.
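The audit-trail and access-control requirements described above can be pictured with a short Python sketch. It is a toy, in-memory illustration, not a compliant implementation: the AuditedStore class, its method names and the entry fields are all hypothetical, and a real system would also need persistent storage, authentication and protection of the log itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    seq: int                  # position in the time-ordered log
    timestamp: datetime       # when the change was recorded (UTC)
    user: str                 # who made the change
    record_id: str            # which sample or test record was touched
    field: str
    old_value: Optional[str]  # None when the field is first created
    new_value: str


class AuditedStore:
    """Hypothetical record store that logs every change; entries are only appended."""

    def __init__(self, authorized_users):
        self._authorized = set(authorized_users)
        self._data = {}
        self._log = []

    def set_field(self, user: str, record_id: str, field: str, value: str) -> None:
        # Authorization is checked before anything is written.
        if user not in self._authorized:
            raise PermissionError(f"{user} may not modify records")
        record = self._data.setdefault(record_id, {})
        old = record.get(field)
        record[field] = value
        self._log.append(
            AuditEntry(len(self._log) + 1, datetime.now(timezone.utc),
                       user, record_id, field, old, value)
        )

    def history(self, record_id: str):
        """Return the time-sequenced list of changes for one record."""
        return [entry for entry in self._log if entry.record_id == record_id]


if __name__ == "__main__":
    store = AuditedStore({"alice"})
    store.set_field("alice", "S1", "status", "received")
    store.set_field("alice", "S1", "status", "sequenced")
    for entry in store.history("S1"):
        print(entry.seq, entry.user, entry.field, entry.old_value, "->", entry.new_value)
```

Because each entry stores both the old and the new value along with the user and a sequence number, a reviewer can reconstruct who changed what, and in what order, without relying on the current state of the record.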
The data volumes produced by modern sequencing applications require new approaches to data management that center on the workflows prescribed by sequencers and the specific needs of two different types of users: lab technicians and lab directors. From my perspective, how labs choose to manage their data may very well determine which have the most success in applying DNA sequencing to the development of advanced molecular diagnostics and therapeutics.  
Bruce Pharr is vice president of products and marketing at GenoLogics Life Sciences Software. He has more than 25 years of experience in technology product design, management and marketing, including corporate and consulting roles with life science R&D hardware and software, pharmaceutical and medical device companies.  He holds a B.S. degree in economics and business administration, and he has completed executive programs in strategic marketing management and marketing strategy for technology-based companies at the Stanford Graduate School of Business.



Published by Old River Publications LLC
© Copyright 2017 Old River Publications LLC. All rights reserved.