Developing a DNA detector for life detection on Mars
Searching for life beyond Earth is a major element of NASA’s missions and activities, many of which have focused on Mars as a world where life could have once existed and may still survive. Life on Mars, if it exists, may share a common ancestry with life on Earth derived from meteoritic transfer of microbes between the planets. We are building an instrument to test this hypothesis in situ on Mars by isolating, detecting, and sequencing nucleic acids (RNA or DNA), the building blocks of all known life. We aim to develop a fully automated, compact, portable version of this instrument, the Search for Extra-Terrestrial Genomes (SETG). It is also possible that RNA- or DNA-based life may have arisen independently beyond Earth, an idea supported by the potential universality of biochemistry and the identification of amino acids, metabolic precursors, and nucleobases within meteorites, and ribose precursors in interstellar space. By isolating, detecting, and sequencing nucleic acids, SETG can analyze the genome or gene expression of any DNA- or RNA-based organism on Earth, Mars, or beyond.
The potential for common ancestry of life on Earth and Mars
An intense period of impact events called the Late Heavy Bombardment (LHB) occurred around 4.1-3.8 Gya, likely as a result of the inward migration of the giant planets. These impacts generated significant meteoritic exchange between Earth and Mars [25], with around 100x higher flux from Mars to Earth than vice versa [7]. In the late 1990s, a series of theoretical studies demonstrated that Martian meteorites were transferred to the Earth at shorter time scales and with higher fluxes than previously believed, delivering around one billion tons of meteoritic debris, representing 7.5% of all Martian meteorites. Within this collection, numerous meteorites would have been delivered on time scales of decades to thousands of years. Several dozen SNC meteorites of Martian origin have been discovered here on Earth, and magnetic and thermochronological analyses indicate that 20% of Martian meteorites have experienced only mild heating (<100 °C, below sterilization temperatures) during ejection and impact. Recent studies confirm the ability of bacterial spores to survive hypervelocity impacts. Once life had evolved on one of the planets, the rate of material transfer makes it plausible that the adjacent planet could “catch” life rather than independently evolving it.
The search for RNA or DNA beyond Mars
The presence of nucleic acids or their precursors within meteorites and in interstellar space, as recent discoveries indicate [22], could steer the development of life towards these biomolecules. Thus, it makes sense to search for RNA- or DNA-based life within potential habitable zones even outside the context of meteoritic exchange, such as the probable liquid water oceans within Europa and Enceladus, and possibly Titan. Given the possibility of shared ancestry between life on Earth and any life on Mars, and the potential for RNA- or DNA-based life elsewhere, searching for life as we know it is a critical part of any comprehensive life detection approach. Strategies such as detection of organic molecules (amino acids, individual nucleobases), molecular chirality, putative metabolic activity, and specific protein modules, while valuable, lack either specificity or sensitivity. In contrast, SETG is extremely sensitive (down to a single molecule) but does not sacrifice specificity: there are no known natural abiological routes to RNA or DNA sequences of nontrivial length.
Sequence data has redefined what we know about the nature and extent of life [39]. In addition to revealing that all known life (with the possible exception of viruses) is descended from a common ancestor, such sequences have revealed entirely new high-level taxa. All known (non-viral) life forms share about 500 “universal genes” including the ribosomal RNAs, regions of which have changed very little over the past 3-4 billion years [45]. For example, within the ~1500 nucleotides of the 16S rRNA gene (18S in eukaryotes), there are multiple 15-20 nucleotide segments that are nearly identical in all known organisms [46] because these regions are involved in regulating the genetic code [47], the degenerate mapping of nucleobase triplets to amino acids. The ribosomal sequences are the gold standard for identifying and classifying diverse microbes. The centrality of these RNAs led to the idea that an RNA world may have preceded the DNA world, which provides an incentive for SETG to target both RNA and DNA.
SETG can either target a specific gene or gene region between two known primer sequences using the polymerase chain reaction (PCR), or target all DNA or RNA (RNA is first reverse transcribed to DNA). The ability to sequence any DNA molecule is important for targeting putative nucleic acid-based Martian life, where the extent of shared ancestry is unknown. Sequence data can then be used to place an organism on the tree of life, identify its closest known relatives, or distinguish between closely related species, through an analysis of similarity to other sequences. For example, the Greengenes database contains >700,000 near full-length 16S ribosomal sequences.
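As a rough illustration of how similarity to other sequences can point to a closest known relative, the sketch below ranks reference sequences by the overlap of their k-mer sets (Jaccard similarity). This is a deliberately simplified stand-in for alignment-based or phylogenetic placement, not the SETG analysis pipeline. The function names, the k-mer length, and the toy sequences are all hypothetical.

```python
def kmers(seq, k=4):
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=4):
    """Jaccard similarity of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def closest_reference(query, references, k=4):
    """Score the query against every reference and return (best_name, scores)."""
    scores = {name: jaccard(query, seq, k) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy, made-up sequences for illustration only (not real rRNA genes).
REFERENCES = {
    "ref_A": "ACGTACGTTAGCCGATAGCTAGGCTA",
    "ref_B": "TTGACCGGTTAACCGGTTGGAACCTT",
}
# ref_A with a single substitution, standing in for a close relative.
QUERY = "ACGTACGTTAGCTGATAGCTAGGCTA"

if __name__ == "__main__":
    best, scores = closest_reference(QUERY, REFERENCES)
    print(best, scores)
```

In practice, a real comparison against a database of full-length 16S sequences would use alignment and tree-building tools; the k-mer overlap here only conveys the idea of ranking by similarity.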
Library Generation for Sequencing
Aside from single-molecule sequencing approaches, which are impractical for our application due to their complexity, sequencing requires generating thousands to millions of identical copies of the same DNA molecule. This requires having known sequences at the ends of the molecule that can be targeted, which is entirely straightforward when targeting a particular gene with primers. When targeting any DNA molecule, one must fragment the DNA to a desired size, add known ends, amplify to get a clonal product, and sequence.
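The four steps for targeting any DNA molecule (fragment, add known ends, amplify, sequence) can be sketched as a toy in-silico model. Everything here is an illustrative assumption: the adapter sequences are invented, fragmentation is a simple fixed-size cut rather than a physical or enzymatic process, and amplification is idealized as perfect doubling per cycle.

```python
# Hypothetical adapter sequences, invented for this sketch.
ADAPTER_5 = "AAGCTTGG"
ADAPTER_3 = "CCTTAGGA"


def fragment(dna, size):
    """Cut DNA into consecutive pieces of exactly `size` bases; drop a short tail."""
    pieces = [dna[i:i + size] for i in range(0, len(dna), size)]
    return [p for p in pieces if len(p) == size]


def add_known_ends(piece):
    """Attach the known adapter sequences to both ends of a fragment."""
    return ADAPTER_5 + piece + ADAPTER_3


def amplify(molecule, cycles):
    """Idealized amplification: one molecule doubles every cycle."""
    return 2 ** cycles


def build_library(dna, size, cycles):
    """Return a list of (adapter-flanked molecule, copy count) pairs."""
    library = []
    for piece in fragment(dna, size):
        molecule = add_known_ends(piece)
        library.append((molecule, amplify(molecule, cycles)))
    return library


if __name__ == "__main__":
    # Made-up 20-base input; the last 4 bases are too short and are discarded.
    for molecule, copies in build_library("ACGTTGCAAGGCTTAACCGG", size=8, cycles=10):
        print(molecule, copies)
```

Because every molecule in the library now begins and ends with a known sequence, a single primer pair matching the adapters can amplify all fragments regardless of their internal sequence.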
Nucleic Acid Amplification and Detection
We originally focused on targeting the most highly conserved regions within genes shared by all known life: we analyzed whole genomes and identified regions in the ribosomal 16S and 23S genes as the most conserved, followed by transfer RNAs [14]. Notably, these regions are all part of the RNA system that regulates the genetic code [47]. Our prototype amplification/detection module allowed us to amplify up to 8 samples on a microfluidic chip, with each sample amplified in 384 wells, each 1 nanoliter in volume. Average power during thermal cycling was ~25 W. This design would facilitate selection of a particular clonal product to use for sequencing. We also developed short primers to precisely target highly conserved gene regions. In-silico and in-vitro assessment of these primers suggests they can amplify the vast majority of known organisms and can detect more biodiversity than longer primers [55]. However, no primer set was found to be truly universal, and the short primers suffer from poor PCR efficiency. Thus, the capability to amplify any RNA or DNA molecule is critical in order to detect potentially divergent Mars organisms that have been isolated from Earth organisms for >3.5 billion years. Fortunately, sequencing of RNA in microbes reveals that >90% of microbial RNA is ribosomal; thus we can directly target these highly conserved genes by sequencing RNA, and use these sequences to evaluate potential ancestral relationships.
Massively parallel sequencing
Until recently, sequencing instruments have been large, heavy, and complex, and have required specialized reagents and sensitive optics. For this reason, we had originally proposed to develop a single-channel sequencer, which would have allowed us to keep the size small and use non-imaging optics to do pyrosequencing, at the expense of a very limited sequencing capability. However, massively parallel sequencing is now feasible for SETG, based on the technology commercialized by Ion Torrent (Fig.): a small standard semiconductor chip that enables concurrent sequencing in millions of wells, requires no imaging or optics, and is extremely small, fast, and robust. In addition, massively parallel sequencing enables us to dramatically simplify our earlier design, eliminating imaging optics and reducing fluidic complexity. Consider one well on the sequencing chip, occupied by a bead covered in clonal copies of a single DNA molecule. When a matching nucleotide flows by, a polymerase enzyme sitting on the DNA incorporates the nucleotide into the 3’ end of a growing double-stranded DNA molecule, releasing a hydrogen ion. When this happens concurrently on ~10^6 identical molecules, the resulting transient change in pH is detected as a change in the source voltage V. By flowing different nucleotides one by one, and looking for transients, the target sequence can be determined in each of the occupied wells. By fitting these transients to a model of nucleotide incorporation, each base can be called and scored for quality. We are collaborating with Ion Torrent founder Jonathan Rothberg, who has an interest in space applications of sequencing. Rothberg is well known for founding 454, where he pioneered massively parallel sequencing. Specific contributions by Ion Torrent will include providing custom sequencing chips and miniaturizing the supporting electronics. Ion Torrent technology is by far the most practical technology for space applications.
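The flow-by-flow logic for a single well can be sketched in a few lines. In this toy model, each nucleotide flow produces a signal equal to the number of consecutive matching bases incorporated (zero if the flowed nucleotide does not match), and bases are called by rounding each signal to the nearest integer. The flow order, the function names, and the rounding step are assumptions made for illustration; rounding stands in for the model-fitting and quality scoring described above and is not Ion Torrent's actual base-calling algorithm.

```python
# Assumed cyclic order in which nucleotides are flowed over the chip.
FLOW_ORDER = "TACG"


def simulate_flows(sequence, n_flows, flow_order=FLOW_ORDER):
    """Simulate ideal per-flow signals for one well.

    `sequence` is the strand being synthesized, 5' to 3'. Each flow yields a
    signal equal to the length of the homopolymer run incorporated.
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        # Incorporate as many consecutive matching bases as are available.
        while pos < len(sequence) and sequence[pos] == base:
            run += 1
            pos += 1
        signals.append(float(run))
    return signals


def call_bases(signals, flow_order=FLOW_ORDER):
    """Call bases by rounding each flow signal to an integer run length."""
    called = []
    for i, signal in enumerate(signals):
        count = max(int(round(signal)), 0)
        called.append(flow_order[i % len(flow_order)] * count)
    return "".join(called)


if __name__ == "__main__":
    ideal = simulate_flows("TTACGGA", n_flows=8)
    print(ideal)
    # Add small, made-up measurement offsets to mimic noisy transients.
    noisy = [s + d for s, d in zip(ideal, [0.2, -0.1, 0.1, -0.2, 0.1, 0.2, -0.1, 0.0])]
    print(call_bases(noisy))
```

The sketch also shows why homopolymer runs are the hard case for this kind of sequencing: a run of n identical bases appears as a single transient of roughly n times the unit amplitude, so the caller must distinguish, for example, a signal of 4 from a signal of 5.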
Our concept for SETG is an instrument 3 kg in mass with a volume slightly larger than that of a typical laptop, peak power < 30 W, and average power of ~10 W during a run. We envision SETG and other life detection instruments utilizing a common sample collection system such as the one on the Mars Science Laboratory. For our integrated instrument concept (Fig. 6), a person would load a sample into a single-use cartridge, which would include all fluidics, freeze-dried reagents, buffer, other small non-reusable components, and the sequencing chip. This cartridge would have electrical and pressure interfaces to a hardware module, which would be capable of operating autonomously, but could be controlled or monitored via a computer or smartphone. The hardware module itself will have little or no user interface other than power, data, and controls or interlocks for safety or convenience. The process will be completely automated once a user loads a sample, seals the sample inlet, and initiates a run.
Fig. Integrated instrument concept. All fluidics are contained within a single cartridge to prevent cross-contamination between runs. To enable processing of multiple samples on Mars, this cartridge could be replicated or split to enable common use of some components: for example, library-ready DNA produced by multiple cartridges could be sequenced in a common sequencing module.