Wednesday, November 17, 2010

Potential Big Advancement In Low Cost Sequencing

I am amazed almost weekly by the pace of advancement in new technology, and particularly in sequencing technology.  Not long ago the thinking was that getting to the magical $1,000 genome would take perhaps a decade or more.  But since then, platforms such as Ion Torrent's array-based sequencer, Illumina's Genome Analyzer, Roche's 454, the ABI family of sequencers, and the new PacBio sequencer have been advancing at such a rate that the cost is already approaching $10,000.

Yesterday this item hit the newswire:

http://www.kurzweilai.net/new-low-cost-rapid-method-for-reading-genomes-uses-recognition-tunneling

New low-cost, rapid method for reading genomes uses ‘recognition tunneling’
Biophysicist Stuart Lindsay, of the Biodesign Institute at Arizona State University, has demonstrated a technique that may lead to rapid, low cost reading of whole genomes, through recognition of the basic chemical units—the nucleotide bases that make up the DNA double helix.
An affordable technique for DNA sequencing would be a tremendous advance for medicine, allowing routine clinical genomic screening for diagnostic purposes; the design of a new generation of custom-fit pharmaceuticals; and even genomic tinkering to enhance cellular resistance to viral or bacterial infection.
Lindsay’s technique for reading the DNA code relies on a fundamental property of matter known as quantum tunneling, in which electrons cross a tiny gap between two electrodes; the resulting flow of electrons is a tunneling current. Tunneling is confined to small distances, so small that a tunnel junction should be able to read one DNA base (there are four of them in the genetic code: A, T, C and G) at a time without interference from flanking bases. But the same sensitivity to distance means that vibrations of the DNA, or intervening water molecules, ruin the tunneling signal. So the Lindsay group has developed “recognition molecules” that “grab hold” of each base in turn, clutching the base against the electrodes that read out the signal. They call this new method “recognition tunneling.”
Their current paper in Nature Nanotechnology shows that single bases inside a DNA chain can indeed be read with tunneling, without interference from neighboring bases. Each base generates a distinct electronic signal: current spikes of a particular size and frequency that serve to identify that base. Surprisingly, the technique even recognizes a small chemical change that nature sometimes uses to fine-tune the expression of genes, the so-called “epigenetic” code. While an individual’s genetic code is the same in every cell, the epigenetic code is tissue- and cell-specific, and unlike the genome itself, the epigenome can respond to environmental changes during an individual’s life.

So this appears to be a promising method to get us to the $1,000 genome. Like many of these announcements, there is still work to be done:

Lindsay stresses much work remains to be done before the application of sequencing by recognition can become a clinical reality. “Right now, we can only read two or three bases as the tunneling probe drifts over them, and some bases are more accurately identified than others,” he says. However, the group expects this to improve as future generations of recognition molecules are synthesized.

With all the new and promising sequencing technologies being announced, the $1,000 genome seems a given within the next 5 years. The bigger problem is making some use of this data. I am reminded of the genotyping array market of 7 or 8 years ago, when Sapio introduced Exemplar Analytics to assist in analyzing those markers for genome-wide association studies as well as other types of analysis such as copy number variation, loss of heterozygosity, quantitative trait analysis, etc. We started with the 10K assay from Affymetrix and now we have assays with 2.5 million markers on them. It is clear that, 7 years later, researchers are in many ways still trying to find the best way to work with this data.

The sequencing data problem may be an order of magnitude harder to solve than the genotyping one was. Clearly the focus needs to be on coding region variations leading to alternate protein functions, but getting there first requires robust alignment and assembly algorithms. The development of these algorithms is a fertile field right now, so it's an interesting area to keep a close eye on. Having all this sequence data will not be of much use unless we can make sense of it.
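To make "alignment" a little more concrete, here is a minimal sketch of the classic Smith-Waterman local alignment score, which finds where a short read best matches a reference sequence. This is a textbook method chosen purely for illustration, not the algorithm used by any particular sequencer or by Sapio; the function name and the match, mismatch and gap scores are assumptions, and production aligners rely on indexing and heuristics to handle millions of reads.

```python
def local_alignment_score(read, reference, match=2, mismatch=-1, gap=-2):
    """Return (best_score, end_position) of the best local alignment.

    end_position is the 1-based position in `reference` where the best
    alignment ends. The scoring values are illustrative assumptions.
    """
    # prev holds the previous row of the dynamic-programming table.
    prev = [0] * (len(reference) + 1)
    best, best_end = 0, 0
    for i in range(1, len(read) + 1):
        curr = [0] * (len(reference) + 1)
        for j in range(1, len(reference) + 1):
            # Score for aligning read[i-1] against reference[j-1].
            step = match if read[i - 1] == reference[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a fresh local alignment here
                prev[j - 1] + step, # extend diagonally (match/mismatch)
                prev[j] + gap,      # gap in the reference
                curr[j - 1] + gap,  # gap in the read
            )
            if curr[j] > best:
                best, best_end = curr[j], j
        prev = curr
    return best, best_end


if __name__ == "__main__":
    # A 4-base read that occurs exactly once inside a short reference.
    print(local_alignment_score("ACGT", "TTACGTTT"))
```

Even this toy version does work proportional to read length times reference length for every read, which hints at why scaling alignment to whole genomes is such an active area.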

At Sapio we are able to implement the workflows around these processes in the LIMS quite quickly. In the near term we are seeing pharma and biotech doing targeted sequencing (amplicon detection) and targeted expression assays for drug and diagnostic development. We are keen to support that process not only from a sample management and sample processing standpoint, but also by extending into assay data management, data mining and data analysis. This is really "beyond LIMS" in terms of what it offers, but it is what the market wants: a single solution for the realization of translational medicine.