Thinking Outside the Box: Using Computer Science Skills to Make Sense of the Biology of Life
After finishing my freshman year in the University of New Hampshire's (UNH) computer science program in May 2005, I felt on the fence about where to take my career. My introductory computer science classes had been challenging, rewarding, and fun. But at the same time, I couldn't quite imagine myself writing data structures for a living, or developing algorithms to speed up data transfer along thicker and stronger networking cables.
As I looked around the classroom at my peers, I realized that most of them spent their free time thinking about applying their newly-acquired computer science skills to their passions. One of my roommates, for example, was developing a music streaming server for our apartment; this was before commercial music streaming services became popular, and the only way to enjoy such a setup was to develop a custom one. While I liked coding and the logical aspects of computer programming, my gut was telling me I hadn't quite found my niche yet. In what field could my computer science skills make a difference while also providing me with challenging problems to solve?
As these questions percolated through my subconscious, I came across a brief article about a field called “bioinformatics.” The article mentioned that the computer science department at UNH would start offering a “bioinformatics track option” for undergraduates, and that interested students should contact Professor Phil Hatcher directly for more information. Other than that, the article did not go into much detail about what “bioinformatics” might entail.
When trying to define bioinformatics, one stumbles across many opinions, but based on my experiences, I’d describe it as an interdisciplinary field that applies quantitative data analysis skills to problems found in the life sciences. It draws talented people from many different fields: biology, mathematics, computer science, physics, genetics, and chemistry. Together, these teams derive novel solutions to biological problems. My nineteen-year-old self, however, had a much vaguer idea about the field. Since “bio” is in the name, it probably is something related to biology, right?
I reached out to Professor Hatcher and arranged to meet with him in person. Within thirty minutes of my entering his office, he had convinced me to “just try it,” become my new student advisor, and provided a modified list of courses for me to take. As the fall semester had already started, I would keep my current class schedule, but starting in the spring, I would take “Introduction to Genetics,” a class usually taken by biology and pre-med majors. Unknowingly, I had just taken the first steps on a path that would guide me through graduate school and into a career as a professional researcher.
Moving Beyond the Textbook to Look at Real Data
At UNH, my personal journey continued with a few more bioinformatics-track specific classes (advanced statistics, chemistry, an independent research project) as well as two undergraduate research projects. During the summer between my sophomore and junior year, I worked with Professors Bergeron (computer science) and Kelley Thomas (genetics) on a Summer Undergraduate Research Fellowship (SURF) funded project exploring small, repeating genetic patterns called “microsatellites” in a variety of different bacteria.
Microsatellites, sometimes also called “Short Tandem Repeats” (STRs), are short DNA motifs that are repeated several times in a row. In humans, these patterns have been associated with Huntington’s and other neurological diseases, but they are also often used in forensic applications because they can serve as markers in DNA strands. In bacteria, microsatellites are less common, but they are believed to play a role in microbial evolution. They are also frequently used as markers to help trace relationships between organisms; closely related organisms have similar markers, whereas organisms that are not closely related show different patterns. Searching for repeating patterns in a stretch of DNA with the naked eye is challenging; a computer makes the task much easier. Thus, I spent my summer embedded in a genetics laboratory, but rather than using microscopes and other traditional laboratory instruments, I wrote software that allowed me to compare the patterns found in several bacteria in an effective and systematic manner.
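To give a flavor of what such a search can look like, here is a minimal Python sketch. It is not the software written for the project; the function name `find_microsatellites`, its parameters, and the thresholds (repeat units of 2 to 6 bases, at least 3 copies) are illustrative assumptions, and the example sequence is made up.

```python
import re


def find_microsatellites(sequence, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for each tandem repeat in a DNA string.

    Hypothetical helper for illustration: a "hit" is a unit of
    min_unit..max_unit bases that occurs at least min_repeats times
    back to back.
    """
    # ([ACGT]{a,b}?) lazily captures the shortest candidate unit;
    # \1{n,} then requires at least n further copies right after it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing the unit "AC" four times in a row.
    print(find_microsatellites("TTGACACACACAGG"))  # [(3, 'AC', 4)]
```

A regular expression keeps the sketch short, but it scans one sequence at a time and reports only exact, uninterrupted repeats; comparing patterns across several genomes would need more bookkeeping on top of this.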
As it turned out, one of the biggest challenges that summer wasn’t learning a new programming language, but rather, communicating with biologists. Tasks that are second nature to a computer scientist might seem baffling to a biologist, but the reverse is true as well. I needed to learn enough about microbial genetics to be able to translate the biological questions into computer code and be able to then interpret the findings. So I spent my days reading textbooks and journal articles about microbial genetics, querying publicly available databases for information on microbial genomes and features, and asking many questions of the geneticists working in the lab. For the first time in my life, I felt that I was going beyond a standard textbook answer. I developed questions to which my advisors did not have the answers. Research, as I found out, was pushing the boundaries of human knowledge on a daily basis; I was hooked! The only logical conclusion to this summer research project was applying for another grant to be able to dive deeper into the field.
I continued exploring my interest in bioinformatics the year after when I traveled to Auckland, New Zealand on an International Research Opportunities Program (IROP) grant to study palindromic patterns in soil-associated microbes. A palindromic pattern is also a short stretch of DNA (the “motif”) that tends to be repeated in a given microbial genome, but it has an additional identifying feature: the DNA motif reads the same backwards and forwards, just like a palindromic word or sentence would. A common example is the English phrase “A man, a plan, a canal, Panama.” Palindromic DNA patterns have been found in virtually all organisms, and they help to determine protein structures and to shape DNA by guiding “restriction enzymes” to places where the molecule needs to be cut, rearranged, or glued back together. In bacteria, these mechanisms enable the evolution of new types of bacteria by allowing pieces of DNA from different origins to be combined. They are also useful in situations where bacteria have to defend themselves against invading viruses. Recently, CRISPR sequences, a bacterial defense system and a type of palindromic motif, have gained attention in the news for enabling scientists to modify genes in a precise and efficient manner. Early experiments showed that by using CRISPR, we can modify the genomes of animals in the lab. The hope is to some day apply these techniques to cure genetic diseases by modifying the genetic material at its source.
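As an illustration of how a computer can spot such motifs, here is a minimal Python sketch. It rests on one assumption worth stating loudly: in double-stranded DNA, a palindrome is conventionally a sequence that equals its own reverse complement (it reads the same on both strands), which differs slightly from a palindromic word. The function names and the fixed window length are hypothetical, and this is not the code used in the project.

```python
# Translation table mapping each base to its complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read in the same direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if a non-empty sequence equals its own reverse complement."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def find_palindromes(sequence, length=6):
    """Return (start, motif) for every palindromic window of the given length.

    Hypothetical helper: slides a fixed-size window over the sequence
    and keeps the windows that pass is_palindromic.
    """
    sequence = sequence.upper()
    return [
        (i, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
        if is_palindromic(sequence[i:i + length])
    ]


if __name__ == "__main__":
    # GAATTC is a well-known restriction enzyme recognition site (EcoRI).
    print(find_palindromes("TTGAATTCAA", length=6))  # [(2, 'GAATTC')]
```

Counting how often each motif returned by `find_palindromes` occurs across a genome would be one simple way to start cataloguing repeated palindromic patterns.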
Apparently, hugging Kauri trees is for good luck; this one is over 800 years old and one of the sons of the great tree god of the Maori (indigenous people of New Zealand). Today, these trees are protected by laws; you can't chop them down anymore.
While in New Zealand, I spent my days in the bioinformatics department at the University of Auckland focusing on palindromic motifs of Pseudomonas bacteria. At the time, nothing was known about the nature of these patterns in this type of bacteria, and I compiled a catalogue of different kinds of patterns across a set of eleven bacterial species. During my free time, I got to know the city and my fellow researchers, both undergraduate and graduate students.
Back at UNH in the fall of 2007, the project I had worked on for the past few months formed the basis for my honors thesis in the computer science department, and firmly convinced me that I wanted to continue working in a research setting. In 2008, after I finished my bachelor's in computer science and became the first graduate of the bioinformatics track option, I moved to Boston University (BU) and joined the graduate program in bioinformatics to begin working on my Ph.D.
At BU, I built on the skills I had learned during my two summers of undergraduate research experience. Fascinated by the single microbes I had studied during my SURF and IROP projects, I wondered how these microbes lived together in communities, and if I could use my computational skills to shed light on their interactions. Fortunately, during my time at BU, the Human Microbiome Project was completing its first analysis phase, which made a wealth of data describing the microbes living in and on the human body available to anyone with an internet connection. For my doctoral thesis, I explored the metabolic capabilities of microbial communities living on different parts of the human body. The microbes that live in and on our human body help us digest food and fight invading pathogens, usually in exchange for a safe living environment. However, we are just now uncovering how many different microbes live on our bodies and how they relate to our daily well-being. Learning about their abilities will help paint a complete picture of human-microbial interactions.
After receiving my doctorate in early 2014, I spent a year working at the New York Genome Center in Manhattan, developing parts of their core microbiome analysis system designed to efficiently process large numbers of patient samples. I have since moved on to the microbiology department at the Forsyth Institute in Cambridge, Massachusetts, researching the microbes related to oral health. On any given day, you can find me at my desk analyzing data from the in-house genetic sequencing center and generating reports describing the findings, consulting with both internal and external biologists on how to best answer their research questions using computational tools, or implementing new tools and analysis methods. Since my computational expertise is available to the whole Institute, I get broad exposure to different kinds of projects; recently, I’ve been involved in research efforts focusing on early childhood tooth decay (“caries”) development, the oral health of teenagers born to HIV-positive mothers, and the relationship between bacteria living in the human mouth and the inflammatory response in patients with head and neck cancers.
When I remember the young freshman computer science student looking for her niche in the world, I realize that my SURF and IROP experiences couldn’t have come at a better time. They put me on a route leading me to a job I can get excited about every day. Doing research ensures that you will never get bored: there are always more questions that need to be answered, or more angles that can be explored.
Science thrives on collaborations and teamwork, and similarly, I wouldn’t be where I am now if it weren’t for the support I received from many different people over the years. I am grateful to my parents, who nudged me to take that first computer science class in high school. I couldn’t have done it without my undergraduate advisor, Professor Hatcher, who initially suggested I look to biology for interesting research questions. Georgeann Murphy and the staff at the Hamel Center for Undergraduate Research were incredibly supportive and patient when it came to writing and applying for my undergraduate research grants. I am also grateful to my mentors at UNH and abroad who donated their time and wisdom to help me define my research questions and chase after the answers. Finally, my thanks go to the grant donors, who, in their generosity, enabled my first forays into research.
You may read Lina Faller’s research article in Inquiry ’07.
Lina L. Faller is a pragmatist who loves seeing her skills applied to real-life problems. A computer science major from Wakefield, MA, Lina was surprised to discover, as she worked on her honors thesis, that even the biology of bacteria could hold her curiosity if it mattered to the real world. Her work began with a desire to apply her computer science skills to real-life biology questions, which forced Lina to adapt her skills beyond the classroom. She found the process very satisfying, but realized that "with research, you are never done! There is always another question that can be asked." Her dedication and passion for research have paid off; since graduating with her bachelor's degree in computer science with a bioinformatics option in 2008, she has been building her career as a full-time researcher and is currently working at The Forsyth Institute as a Bioinformatics Analyst. "I wouldn’t have chosen this career path if I hadn’t been exposed to research early on," she explains.
Copyright 2016, Lina L. Faller