Scientists depend on advanced computing to better understand evolution, drug discovery and genetics

May 31, 2010

Today's medical technology can recognize tumors smaller than a fingernail, decode your DNA to predict future illness and even read a person's mind by identifying electronic patterns in the brain.

"Medical advances seem like wizardry," said Harold Varmus, former director of the National Institutes of Health (NIH). "But pull back the curtain, and sitting at the lever is a high-energy physicist, a combinatorial chemist or an engineer."

Add to that list computational biologists, who use mathematical algorithms and supercomputers bigger than basketball courts to improve human health.

Supercomputers have traditionally been used for calculation-intensive tasks such as climate modeling, airplane design and simulations of nuclear explosions. But increasingly, life scientists — from physicians and plant biologists to medical engineers and biochemists — are using advanced computing to drive the scientific advances of the future.

The field of computational biology has grown significantly in the last few years at The University of Texas at Austin, in large part because of developments at the Texas Advanced Computing Center (TACC), one of a handful of academic centers in the country capable of providing the large-scale systems and expertise life scientists require to make breakthrough discoveries.

TACC enabled the first model of the H1N1 active protein and helped pioneer computer-driven operating practices in a remote laser cancer surgery performed on a dog.

More recently, advanced computing has helped researchers at the university develop strategies to protect our bodies, ecosystems and food sources from disease and global catastrophe.

Coral Reefs — Evolve or Perish

Like the canary in the coal mine, corals are a bellwether for the health of the oceans, and what scientists see scares them.

As the oceans become more acidic, corals are rapidly dying. Studies indicate that large-scale coral die-offs are occurring more frequently than at any time in the last 11,000 years, a fact that Tom Goreau, president of the Global Coral Reef Alliance, calls "an underwater holocaust."

Patch reef in Truk lagoon, Micronesia. This reef is one of the most beautiful and the most threatened, both by climate and human activities.

This may alarm preservationists, but over their quarter-billion-year history, corals have dealt with dozens of climate shifts, although none nearly as fast as the shift happening now. Mikhail Matz, professor of integrative biology at The University of Texas at Austin, is betting coral will adapt again. This time, Matz intends to catch evolution in the act by identifying changes in coral DNA as they happen. In doing so, he hopes to add crucial information to our understanding of how mutations and adaptations occur.

"Corals have shown the potential to evolve, and this is high time for them to do it," said Matz. "Let's see how evolution works."

Only a few years ago, studying an organism like coral through its DNA would have been prohibitively expensive and time-consuming. With the emergence of next-generation gene sequencing devices, the cost of sequencing has dropped, allowing scientists to move beyond mice, worms and fruit flies — the traditional, or "model" organisms for genetic research — to study a variety of non-model creatures based on their genomes.

In 2009, Matz and his team sequenced the entire transcriptome of the common Pacific coral in less time than ever before, and at a fraction of the cost of previous efforts. The transcriptome is the set of RNA molecules that reflects the genes that are working at any given time, providing concise information relevant to the study of evolution. Matz's effort was among the first successful full-transcriptome sequences for a non-model organism.

"People are just now realizing what next-generation sequencing can do," said Matz.

Paradoxically, one of the problems with next-generation sequencers is that they produce too much data. Advanced computing systems such as TACC's Ranger supercomputer are becoming essential for making full sense of the information gene sequencers generate.

As one of the first researchers to combine next-generation sequencers and supercomputers to study evolution, Matz is establishing the methodology by which researchers will analyze novel species in the future. Meanwhile, he continues to analyze coral RNA results, waiting for the telltale signs of evolution and the emergence of a climate-change-tolerant coral.

"The old corals are dying, but it may be a natural part of evolving," said Matz. "Once we know how corals evolve, we can help them, or at least avoid standing in evolution's way."

Computing Better Drugs

Threats to human health are a moving target. Viruses evolve. New syndromes emerge. Unfortunately, the timeframe for creating targeted drugs is measured in decades, not days.

Computer prediction of a novel inhibitor binding to the docking site of a protein target (JNK). The protein is involved in many diseases, including cancer.

Pengyu Ren, assistant professor of biomedical engineering at the university, is among a growing chorus of experts who believe the methods used by the pharmaceutical industry to find new drugs are a failure.

The problem: unrealistic models necessitated by limited computing power.

"They're taking shortcuts, making approximations of physical models," said Ren.

"The promise of rapid, inexpensive computational drug discovery has thus far eluded scientists," Michael Gonzales, life sciences program director at TACC, said. "Pengyu's work is an excellent example of how current advances in computing power are enabling scientists to take a fundamentally different approach to virtual drug discovery."

Using the Ranger supercomputer, Ren and colleagues at the Texas Institute for Drug and Diagnostic Development embarked on an ambitious search for new ways to find useful molecules for medicine, a process called drug discovery. The work has focused on evaluating best practices and applying new methods to 200 proteins that have known drug compounds. Ren believes this methodological shoot-out will lead to a more effective approach to drug discovery that will be adopted by the pharmaceutical industry.

In the meantime, he is using Ranger to understand the relationship between the rigidity of a drug compound and its ability to bind to a target protein, and to search for inhibitors relevant to cancer and heart disease in collaboration with experimentalists at the university.

"The ultimate goal is to develop tools that guide drug discovery," said Ren. "If that works, it will significantly improve our ability to design drug candidates that are more potent with fewer side effects."

Sowing Seeds for a Fertile Future

As the source of our food supply and breathable air, plants are crucial for human survival. Unfortunately, most experts agree the world's agricultural systems aren't productive enough to feed our projected population or mitigate the impact of global warming.

A virtual Arabidopsis, the first plant to have its genome sequenced, colored according to the expression level of one of its 28,000 genes. The image was created using ePlant, a tool developed by Dr. Nicholas Provart of the University of Toronto. The tool will be incorporated into iPlant's cyberinfrastructure to improve plant bioinformatics research.

For that reason, plant and crop scientists are searching for ways to improve plant performance to address these imperatives. This requires a deeper understanding of how plant genes work under stress conditions, which genes determine specific characteristics and which plants carry favorable genes.

According to Dan Stanzione, co-principal investigator of the iPlant Collaborative and deputy director of TACC, cyberinfrastructure — the advanced computing systems, software tools, networks and storage architecture that make big science possible — will play a vital role in this mission.

In 2008, the National Science Foundation initiated a $50 million, five-year project, called the "iPlant Collaborative," to build the cyberinfrastructure that will allow plant scientists to tackle "grand challenge" questions in plant biology.

"iPlant is the first attempt at this scale to create a cyberinfrastructure that fills the gap between building a supercomputer and what scientists do in their labs," Stanzione said.

The dream of better living through improved plant genes is at least as old as scientist Gregor Mendel's 19th century experiments with pea plants. iPlant's tool kits, however, are fully rooted in the 21st century.

iPlant seeks to revolutionize the way plant scientists solve critical questions in much the same way Web-based tools like Flickr and Wikipedia have transformed how people share and store resources and collaborate on knowledge-building and research.

iPlant announced in April the beta release of computational environments and software tools designed to help plant scientists make discoveries faster. These tools provide the first glimpse of the types of technologies that iPlant will integrate and distribute.

"Our hope is that the cyberinfrastructure will speed up scientific progress," said Stanzione. "It will be a long-term effort and a community effort, but we're hoping to make an easier path for researchers."

Through efforts to understand and protect coral ecosystems, to discover new drugs, and to improve crop species, researchers at the university are applying advanced computing to the life sciences in ways unimaginable just a few years ago. In the process, they are sowing the seeds for a safer, healthier and more sustainable future.

By Aaron Dubrow
Texas Advanced Computing Center

On the banner: Natural fluorescence of Acropora millepora, viewed under a dissecting microscope. Photo: M. Matz - J. Wiedenmann.