The Speed They Need

Thursday, January 31, 2013


Editor’s note: Supported by a $15.4 million award from the National Institutes of Health, the New Hampshire IDeA Network of Biomedical Research Excellence (NH INBRE) aims to increase the state’s research capacity and the scientific knowledge of its workforce. Leading the coalition is Dartmouth, which oversees the awarding of NH INBRE grants and provides technical support and mentoring to partners in the coalition, including UNH. This is part two of a three-part series that profiles UNH faculty and students who are involved in NH INBRE.

These days, DNA sequencing isn’t just for geneticists. UNH’s Win Watson is a zoologist who wants to better understand the internal clocks that control the rhythmic activities of animals such as the horseshoe crab, or Limulus polyphemus. To find the molecules that constitute this animal’s clock, it’s helpful to know its genome, or complete genetic content.

In the past, it took months, maybe years, of lab work before scientists could get even a fraction of the data used in projects such as Watson’s. With next-generation sequencing technologies, scientists can have the data now, without all the lab work.

An advanced laboratory instrument can sequence DNA from the horseshoe crab, reading it as a string of four basic units represented by the letters A, T, G and C. Watson and his collaborators then use analytical techniques – bioinformatics – to glean meaningful genetic information from the strings of letters that the sequencer identifies.

Bioinformatics has become increasingly important as biologists in fields ranging from evolution to ecology use DNA sequencing to study the world around them. “It doesn’t matter what area of biology you’re in, bioinformatics is a core skill,” says W. Kelley Thomas, who leads NH INBRE’s bioinformatics core facility at UNH.

It’s a skill that’s relatively new, however. “DNA sequencing technology represents a real opportunity to move very quickly and cheaply, but it’s happened so fast that it’s been faster than the span of a scientist’s career,” Thomas says. “None of us my age were really trained to do this in school.”

The UNH core facility provides bioinformatics training and consultation to students and faculty at schools across the state that are working on NH INBRE projects involving DNA sequencing. (The other bioinformatics core is at Dartmouth’s Geisel School of Medicine under the direction of genetics professor Jason Moore, who holds a joint appointment at UNH.) The effort got a boost last year when UNH received a National Science Foundation grant that enabled it to buy a new state-of-the-art DNA sequencer for its Hubbard Center for Genome Studies. The sequencer supports a range of NH INBRE research, from investigations of invertebrate biological rhythms with applications to human health to a study of how microbial communities respond to contaminated sediment in Great Bay Estuary.

“The projects are based on the sequencing data that we then analyze,” says Kazu Okamoto, a graduate student in biochemistry who teaches bioinformatics workshops. “There would be no project without the sequencing data. Of course, there is a drawback to this because the data is vast. That’s what the workshops are for. The students come in baffled by these huge sequences of data; they’re almost scared. And then they leave confident. I love that.”

Less than a decade ago, it could take two or three years to identify a set of proteins by cloning fragments of DNA into bacteria. Now, using UNH’s Illumina HiSeq 2000, it takes about two weeks – and less money.

Thomas compares the methods to using a library’s card catalogue to look up information versus doing a Google search. With the latter, “you take a lot of the steps out. You make things very direct. Both involve a lot of information; the difference is in how you sort through it.”

Today, the sorting process uses a computer program to make sense of massive amounts of data that can run to hundreds of gigabytes. CLC Genomics Workbench assembles short, overlapping pieces of DNA into much longer sequences that encode genes, Okamoto says. From there, it’s a matter of looking for proteins shared with other organisms whose genomes have already been sequenced. To find a horseshoe crab gene coding for a clock protein, for instance, researchers look for DNA regions that are similar to the known sequence for a fruit fly clock protein. It’s comparable to searching for a word in an electronic document, though the fruit fly gene isn’t going to match the horseshoe crab’s gene exactly.
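For readers who want to see the two ideas in miniature, here is a toy Python sketch. It is an illustration only, not how CLC Genomics Workbench works internally: the function names, the short made-up DNA strings and the simple greedy strategy are all assumptions chosen for readability. The first part stitches overlapping pieces into a longer sequence; the second searches that sequence for the closest match to a query, tolerating mismatches in the way a fruit fly gene would differ from its horseshoe crab counterpart.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap.

    A toy greedy strategy (an assumption for illustration); real
    assemblers use far more sophisticated methods.
    """
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


def best_approximate_match(query, target):
    """Slide `query` along `target` and return (position, mismatches)
    for the window that differs from the query in the fewest letters."""
    best_pos, best_mm = -1, len(query) + 1
    for pos in range(len(target) - len(query) + 1):
        window = target[pos:pos + len(query)]
        mm = sum(1 for q, t in zip(query, window) if q != t)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


# Made-up short reads standing in for sequencer output.
reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGGCGTACGTTA']

# A made-up "known" sequence that differs by one letter from the assembly.
print(best_approximate_match("GCGAAC", contigs[0]))  # (3, 1)
```

The search reports the best position and the number of differing letters, rather than demanding an exact hit as a word search in a document would.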

“This is the stepping stone for figuring out the physical aspect of the animal,” Okamoto says.