Biology is drowning in data and complexity

Post author: Erich Vieth
Post published: May 4, 2010
Post category: Science / scientific method / Technology
Post comments: 2 Comments

In the April 2010 edition of Nature (available online only to subscribers), you can read a counter-intuitive story illustrating that more information sometimes adds confusion rather than making things simpler. Maybe another way of putting it is that the path to understanding can often take one through phases of disorientation resulting from a new influx of accurate data. This particular story, by Erika Check Hayden, titled “Life Is Complicated,” considers what has happened in the field of biology subsequent to the Human Genome Project. Prior to the Project, many biologists guessed that the human genome contained about 100,000 genes that coded for proteins. At the conclusion of the project, however, we found out that only about 21,000 human genes code for proteins.

One might think that this would simplify the field of biology, especially since biologists now know what many of these genes are. Many people thought that we were going to have a clearly understandable “blueprint” of the human species. The opposite is happening, however: “It opened the door to a vast labyrinth of new questions.” What kinds of questions? This article really surprised me with the vast scope of new territory opened up by the Human Genome Project. It can be summed up by Hayden’s quote from biochemist Jennifer Doudna: “The more we know, the more we realize there is to know.”

Hayden explains that sequencing the genome undermined “the primacy of genes by unveiling whole new classes of elements–sequences that make RNA or have a regulatory role without coding for proteins.” It turns out that “much non-coding DNA has a regulatory role that we are just beginning to understand.” To illustrate how complex things have gotten, Hayden discusses what we’ve now learned about a single protein, “p53,” which for many years was simply known as a tumor suppressor protein. Consider what we know now:

In 1990, several labs found that p53 binds directly to DNA to control transcription, supporting the traditional Jacob-Monod model of gene regulation. But as researchers broadened their understanding of gene regulation, they found more facets to p53 . . . [R]esearchers now know that p53 binds to thousands of sites in DNA, and some of the sites are thousands of base pairs away from any genes. It influences cell growth, death and structure and DNA repair. It also binds to numerous other proteins, which can modify its activity, and these protein-protein interactions can be tuned by the addition of chemical modifiers such as phosphates and methyl groups. Through a process known as alternative splicing, p53 can take nine different forms, each of which has its own activities and chemical modifiers. Biologists are now realizing that p53 is also involved in processes beyond cancer, such as fertility and very early embryonic development. In fact, it seems willfully ignorant to try to understand p53 on its own. Instead, biologists have shifted to studying the p53 network as depicted in cartoons containing boxes, circles and arrows meant to symbolize its maze of interactions.

Hayden reminds us that the p53 story is one of many similar stories in post-genomic-era biology. She explains that we now know that many of the signaling pathways that we thought we were close to understanding are not simple and linear but organized in vast, complex networks that sometimes appear fractal. She quotes James Collins, a bio-engineer: “We made the mistake of equating the gathering of information with a corresponding increase in insight and understanding.”

Image by Flaps at Dreamstime.com (with permission)

Here’s another counter-intuitive result of this new deluge of information: many of our models have gotten too complex to be useful.

In many cases the models themselves quickly become so complex that they are unlikely to reveal insights about the system, degenerating instead into mazes of interactions that are simply exercises in cataloging.

The genome project has made biologists into kids in a big candy store: a candy store with unending aisles and endlessly deep bins of dazzling, disorienting candy, much of which is currently out of our reach. Such is the horizon of new knowledge, equal parts frustrating and tantalizing.

Tags: biology, complexity

Erich Vieth

Erich Vieth is an attorney focusing on civil rights (including First Amendment), consumer law litigation and appellate practice. At this website he often writes about censorship, corporate news media corruption and cognitive science. He is also a working musician, artist and writer, having founded Dangerous Intersection in 2006. Erich lives in St. Louis, Missouri with his two daughters.

This Post Has 2 Comments

KennyCelican May 4, 2010

As a biologist and teacher, this isn't really surprising to me. When I was in school, my professors and classmates fell into two broad categories.

One group were the 'taxonomists', who were enthused with each new discovery, no matter how trivial. They were and are always invaluable for cataloguing and confirming the variety of life as it exists. However, they were often so enthralled with the differences between two instances that they could not see the commonalities.

The other group were the 'reductionists', who would synthesize generalizations about the data gathered. They were less enthused about data gathering, but would come up with hypotheses that predicted future data with varying degrees of accuracy. Unfortunately, they would occasionally overlook details that didn't fit with their theories.

At best, it was a race, where the taxonomists tried to find something not covered by the generalizations, and the reductionists tried to find new explanation(s) that covered everything discovered so far. At worst, the taxonomists bogged down in the details, and the reductionists ignored the details entirely.

Honestly, this is very exciting stuff, but to me it's not surprising stuff. It's just a sign that the taxonomists have taken a big lead for the moment, giving the reductionists a big chunk of work to play with.

Niklaus Pfirsig May 5, 2010

Mapping the human genome is simply laying the foundation. The next step is to understand the biochemical mechanism by which the genetic information is interpreted into a person.

Those 21,000 proteins do not stand alone, but interact in myriad ways. This means that a gene associated with hazel eyes may also affect the strength of the immune system.

In a recent study, dogs with black hair were found to be more aggressive than lighter colored dogs.
