Wednesday, November 17, 2010

Potential Big Advancement In Low Cost Sequencing

I am amazed almost weekly by the pace of advancements in new technology, and particularly in sequencing technology.  It wasn't that long ago that the thinking was that reaching the magical $1,000 genome would take perhaps a decade or more.  But since then, platforms such as Ion Torrent's array-based sequencer, Illumina's Genome Analyzer, Roche's 454, the ABI family of sequencers, and the new PacBio sequencer have advanced at such a rate that the cost is already approaching $10,000.

Yesterday this item hit the newswire:

New low-cost, rapid method for reading genomes uses ‘recognition tunneling’
Biophysicist Stuart Lindsay, of the Biodesign Institute at Arizona State University, has demonstrated a technique that may lead to rapid, low cost reading of whole genomes, through recognition of the basic chemical units—the nucleotide bases that make up the DNA double helix.
An affordable technique for DNA sequencing would be a tremendous advance for medicine, allowing routine clinical genomic screening for diagnostic purposes; the design of a new generation of custom-fit pharmaceuticals; and even genomic tinkering to enhance cellular resistance to viral or bacterial infection.
Lindsay’s technique for reading the DNA code relies on a fundamental property of matter known as quantum tunneling: the flow of electrons across a tunnel junction produces a tunneling current. Tunneling is confined to small distances—so small that a tunnel junction should be able to read one DNA base (there are four of them in the genetic code, A, T, C and G) at a time without interference from flanking bases. But the same sensitivity to distance means that vibrations of the DNA, or intervening water molecules, ruin the tunneling signal. So the Lindsay group has developed “recognition molecules” that “grab hold” of each base in turn, clutching the base against the electrodes that read out the signal. They call this new method “recognition tunneling.”
Their current paper in Nature Nanotechnology shows that single bases inside a DNA chain can indeed be read with tunneling, without interference from neighboring bases. Each base generates a distinct electronic signal, current spikes of a particular size and frequency that serve to identify each base. Surprisingly, the technique even recognizes a small chemical change that nature sometimes uses to fine-tune the expression of genes, the so-called “epigenetic” code. While an individual’s genetic code is the same in every cell, the epigenetic code is tissue and cell specific, and unlike the genome itself, the epigenome can respond to environmental changes during an individual’s life.
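To make the base-calling idea concrete, here is a toy sketch of classifying tunneling-current spikes by their size and frequency, as the press release describes. The amplitude and rate values below are invented for illustration only; the paper's actual signal characteristics are not reproduced here.

```python
# Hypothetical per-base signatures: (mean spike amplitude in pA, mean spike rate in Hz).
# These numbers are made up for the sketch, not taken from the paper.
BASE_SIGNATURES = {
    "A": (20.0, 5.0),
    "T": (35.0, 8.0),
    "C": (50.0, 12.0),
    "G": (65.0, 15.0),
}

def classify_spike(amplitude_pa, rate_hz):
    """Assign the base whose (amplitude, rate) signature is nearest."""
    def distance(base):
        amp, rate = BASE_SIGNATURES[base]
        return (amplitude_pa - amp) ** 2 + (rate_hz - rate) ** 2
    return min(BASE_SIGNATURES, key=distance)

print(classify_spike(36.0, 7.5))  # nearest signature is T
```

A real system would, of course, fit these signatures from measured spike distributions rather than nearest-neighbor matching against fixed points, but the principle of mapping a distinct electronic signature to each base is the same.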

So this appears to be a promising method to get us to the $1,000 genome. Like many of these announcements, there is still work to be done:

Lindsay stresses much work remains to be done before the application of sequencing by recognition can become a clinical reality. “Right now, we can only read two or three bases as the tunneling probe drifts over them, and some bases are more accurately identified than others,” he says. However, the group expects this to improve as future generations of recognition molecules are synthesized.

With all the new and promising sequencing technologies being announced, it seems the $1,000 genome is a given within the next 5 years. The bigger problem is making some use of this data. I am reminded of the genotyping array market of 7 or 8 years ago, when Sapio introduced Exemplar Analytics to assist in analyzing those markers for Genome-Wide Association Studies as well as other types of analysis such as copy number variation, loss of heterozygosity, quantitative trait analysis, etc. We started with the 10K assay from Affymetrix, and now we have assays with 2.5 million markers on them. It is clear that, 7 years later, researchers are in many ways still trying to find the best way to work with this data.

The sequencing data problem may be an order of magnitude harder to solve than the genotyping one. Clearly the focus needs to be on coding-region variations leading to alternate protein functions, but getting there first requires robust alignment and assembly algorithms. The development of these algorithms is a fertile field right now, so it's an interesting area to keep a close eye on. Having all this sequence data will not be of much use unless we can make sense of it.
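To give a flavor of what alignment algorithms do at their core, here is a minimal sketch of seed-based read alignment: index the reference's k-mers, then use each read's leading k-mer to find candidate positions. Production aligners use far more sophisticated indexes and tolerate mismatches; this toy handles exact matches only.

```python
def build_kmer_index(reference, k):
    """Map every k-mer in the reference to its start positions."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index

def align_read(read, reference, index, k):
    """Return positions where the read matches the reference exactly,
    using its leading k-mer as a seed to limit the search."""
    hits = []
    for pos in index.get(read[:k], []):
        if reference[pos:pos + len(read)] == read:
            hits.append(pos)
    return hits

reference = "ACGTACGTGGTACCAGT"
index = build_kmer_index(reference, 4)
print(align_read("ACGTGG", reference, index, 4))  # [4]
```

The hard problems in the field (sequencing errors, repeats, short reads, scale) all layer on top of this basic seed-and-verify idea.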

At Sapio we are able to implement the workflows around these processes in the LIMS quite quickly. In the near term we are seeing pharma and biotech doing targeted sequencing (amplicon detection) and targeted expression assays for drug and diagnostic development. We are keen to support that process not only from a sample management and sample processing standpoint, but also extending into assay data management, data mining and data analysis. This is really "beyond LIMS" in terms of what it offers, but it is what the market wants...a single solution for the realization of translational medicine.

Wednesday, July 28, 2010

So what is a LIMS?

First, I want to apologize for the long gap between posts.  This year has turned out to be by far the busiest year in our history, perhaps because of all the pent-up demand from last year's recession.  I do hope to post fairly regularly, as the LIMS topic needs more discussion and we want this blog to serve as a basis for at least some of that dialog.   Now back to the topic!

LIMS, of course, is an acronym for Laboratory Information Management System.  So thinking in very straightforward terms, a LIMS can be any piece of software that manages the information that is produced or digested in a laboratory setting.  Unfortunately, this definition is not very good.  Using this type of simple definition means that Microsoft Excel would be considered a LIMS!  We certainly know that many labs are using it for tracking information in the lab, but I doubt any of them would call it a LIMS.

So we need to improve our definition a bit to exclude things like MS Excel from qualifying as a LIMS.  Perhaps we can say something like: "A LIMS is any piece of software that manages the information that is produced or digested in a laboratory setting, and includes support for many laboratory-specific needs."  I find this definition also lacking, especially in the context of our own product, Exemplar LIMS.  This type of definition potentially excludes products like Exemplar LIMS that are general data management solutions with high affinity for laboratory environments.  Under this new definition, it comes down to what the phrase "support for many laboratory-specific needs" actually means.

When we think of a laboratory, there are a few foundational requirements that seem to apply across all laboratory types, such as the ability to track and manage samples.  But even here I see problems.  When we hear this requested as a feature in a LIMS and then get down to the details of what the customer wants, it often varies quite dramatically between labs.  This is even true of labs of essentially the same type, such as next-gen sequencing labs.  I think this points to the crux of the problem with defining what a LIMS is, and why you get 5 different definitions if you ask 5 different people what a LIMS is.  The problem, in our view, is that the laboratory setting and its needs vary widely between labs.  So what a LIMS is varies significantly depending on who you talk to, as their needs reflect their vision of what a lab is.

There is certainly variance in software needs in other domains as well, but I think this is particularly true in the LIMS space.  If you were to plot the idea of what an accounting application should look like on a standard deviation graph, its curve would be very high in the middle and not very wide, as most customer requirements would fall near the norm for other businesses.  In the case of the LIMS, the center of the graph would be low and the graph as a whole would be wider, reflecting the wide divergence of needs of different lab environments.  So the standard deviation of requirements for a LIMS is essentially greater than for many other products.

Things are complicated by the fact that there are many different types of labs: forensics labs, environmental labs, genetics labs, chemistry-oriented labs, manufacturing labs, etc.  There are also distinctions between regulated labs (operating under HIPAA, FDA, CLIA, GLP, etc.) and non-regulated labs such as research labs.  There is also a diversity of lab equipment across specializations such as genetics-based labs, blood-chemistry-focused labs, proteomic labs, etc.  When you group all these variables together, you get a huge range of needs for a LIMS depending on the type of lab and the people running the lab.

So maybe we need to try another way to define a LIMS.  Perhaps we need to take a step back and look at what might be a common functional definition across all laboratories.  Here at Sapio Sciences it has become clear to us that all labs have one thing absolutely in common, and that is that each and every lab is an advanced workflow processing construct.

Think about what happens in many a laboratory every day.  Samples are received and are processed according to some protocol.  They may be accessioned, bar-coded, and stored in a freezer, or they may follow some other process, but the key is that there *is* a defined process that the sample will follow as laid out (usually) by the lab manager.  The sample will then proceed through some lab test or process that again follows a detailed and rigorous procedure usually established by the equipment vendor.  Then there will be some process for distributing test results to researchers or handing off data to some analysis tools, etc.  The key here is that no matter what kind of lab you are in, you are following several distinct processes as your main function.
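The workflow view above can be sketched as a simple state machine over a sample. The step names below are invented to mirror the process just described (accession, barcode, store, test, report); a real LIMS would make these steps configurable per lab rather than hard-coding them.

```python
# Hypothetical linear protocol; a lab manager would define these steps.
WORKFLOW = ["received", "accessioned", "barcoded", "stored", "tested", "reported"]

class Sample:
    """A sample moving through a defined lab protocol, one step at a time."""

    def __init__(self, sample_id):
        self.sample_id = sample_id
        self.state = WORKFLOW[0]

    def advance(self):
        """Move the sample to the next step in the protocol."""
        i = WORKFLOW.index(self.state)
        if i == len(WORKFLOW) - 1:
            raise ValueError(f"{self.sample_id} already completed the workflow")
        self.state = WORKFLOW[i + 1]
        return self.state

sample = Sample("S-0001")
sample.advance()  # "accessioned"
sample.advance()  # "barcoded"
print(sample.state)
```

Real lab workflows branch, loop, and hand off to instruments and analysis tools, which is exactly why the flexibility discussed below matters; but even the branched cases reduce to samples moving through defined states.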

So now we can perhaps begin a new definition of a LIMS in this new context.  Essentially, the LIMS should be able to support these processes in an intuitive and easy manner.  But this does not cover the fact that data tracking needs can vary widely even for the same processes.  This means there is an implicit need for flexibility in the LIMS, which is reflected in our above discussion of the large standard deviation of lab needs.  So somehow we need to incorporate this into our ideal LIMS definition as well.  There is also the need to address the differing needs of regulated versus R&D labs.  Something else to consider is what happens after the lab processing is done.  There usually is some hand-off of data to some consumer of that data, so we need to address that need as well.

It may not be possible to put this in a sentence, so I will try it in bulleted form.  I believe we should start to define a LIMS as:

A software tool that enables the following key features in support of modern laboratory operations:

  • Support for the rapid implementation of detailed and adaptable workflows
  • Inherent underlying flexibility in its architecture to support diverse data tracking needs
  • Built in features and functions that support its use in regulated environments
  • Easy-to-implement interfaces to support data import and export

So there you have it!  We think this serves as an excellent definition of a LIMS, based on our ample experience in this area.  We look forward to others' feedback on this topic, as it is sure to generate discussion.

Friday, January 15, 2010

Sapio's First Blog

From Kevin Cramer, VP of Sales at Sapio Sciences

We here at Sapio have spent the last 5 years focused on creating the most configurable and functional laboratory information management system (LIMS) product in the world. This has certainly kept all of us busy...busy enough that taking time to Blog, Tweet or FaceBook about Sapio Sciences, Exemplar LIMS (our product), and the LIMS market in general has not been high on the priority list.

Exemplar LIMS has finally become the product we envisioned so long ago, and with the calendar turning to 2010, we have made a resolution to start to provide regular informational posts about all things LIMS. If you have not heard of us before, we have been selling Exemplar LIMS for over two years now and have been quite successful, but our complete vision of the product has only been fulfilled over the last year, and will culminate with our 4.0 release due in Q1 of 2010. 

We will not only be blogging about things related to our own product, but about the LIMS market in general. I am often amazed at how little is known about what a LIMS actually is by people who contact us looking to buy one. Naturally there are those well versed in the trials and tribulations of LIMS software too, but these represent a surprisingly small proportion of those we deal with.  So most of what we post here will be to enlighten and educate those would-be LIMS purchasers and users. We will start with the basics and go deeper from there.

We will also talk about the how/what/why of buying a LIMS...including what you should look for, etc. My own 20+ years' experience in software indicates that people think about the wrong things when buying software.  This is not the fault of the buyer, but more an issue with software vendors who shape their presentations and marketing to elicit an emotional response and thereby obscure important points. This is what software marketing is all about.

I can hear you now: "But aren't you one of those very software vendors yourself?"  Yes we are, but I believe that we take a very different approach to running our business, developing our software and presenting Exemplar LIMS than any other vendor on the market.  We aim to be as agnostic as possible in our posts when talking about LIMS in general and what's important.  Also, these blogs are open to comment, so any reader can freely let us know if we are off track on our message here.

So this ends our first post. Stay tuned for regular updates from us...and please feel free to provide feedback comments.