Biology is drowning in data and complexity
In the April 2010 edition of Nature (available online only to subscribers), you can read a counter-intuitive story illustrating that more information sometimes adds confusion rather than making things simpler. Another way of putting it is that the path to understanding often takes one through phases of disorientation resulting from an influx of new, accurate data. This particular story, by Erika Check Hayden, titled “Life Is Complicated,” considers what has happened in the field of biology since the Human Genome Project. Before the Project, many biologists guessed that the human genome contained about 100,000 genes that coded for proteins. At the conclusion of the Project, however, we found out that only about 21,000 human genes code for proteins.
One might think that this would simplify the field of biology, especially since biologists now know what many of these genes are. Many people thought that we were going to have a clearly understandable “blueprint” of the human species. The opposite is happening, however: “It opened the door to a vast labyrinth of new questions.” What kinds of questions? This article really surprised me with the vast scope of new territory opened up by the Human Genome Project. It can be summed up by Hayden’s quote from biochemist Jennifer Doudna: “The more we know, the more we realize there is to know.”
Hayden explains that sequencing the genome undermined “the primacy of genes by unveiling whole new classes of elements–sequences that make RNA or have a regulatory role without coding for proteins.” It turns out that much non-coding DNA has a regulatory role “that we are just beginning to understand.” To illustrate how complex things have gotten, Hayden discusses what we’ve now learned about a single protein, “p53,” which for many years was known simply as a tumor suppressor protein. Consider what we know now:
In 1990, several labs found that p53 binds directly to DNA to control transcription, supporting the traditional Jacob-Monod model of gene regulation. But as researchers broadened their understanding of gene regulation, they found more facets to p53 . . . [R]esearchers now know that p53 binds to thousands of sites in DNA, and some of the sites are thousands of base pairs away from any genes. It influences cell growth, death and structure and DNA repair. It also binds to numerous other proteins, which can modify its activity, and these protein-protein interactions can be tuned by the addition of chemical modifiers such as phosphates and methyl groups. Through a process known as alternative splicing, p53 can take nine different forms, each of which has its own activities and chemical modifiers. Biologists are now realizing that p53 is also involved in processes beyond cancer, such as fertility and very early embryonic development. In fact, it seems willfully ignorant to try to understand p53 on its own. Instead, biologists have shifted to studying the p53 network as depicted in cartoons containing boxes, circles and arrows meant to symbolize its maze of interactions.
Hayden reminds us that the p53 story is one of many similar stories in post-genomic-era biology. She explains that many of the signaling pathways we thought we were close to understanding are not simple and linear but organized in vast, complex networks that sometimes appear fractal. She quotes James Collins, a bioengineer: “We made the mistake of equating the gathering of information with a corresponding increase in insight and understanding.”
Here’s another counter-intuitive result of this new deluge of information: many of our models have gotten too complex to be useful.
In many cases the models themselves quickly become so complex that they are unlikely to reveal insights about the system, degenerating instead into mazes of interactions that are simply exercises in cataloging.
The genome project has made biologists into kids in a big candy store: a candy store with unending aisles and endlessly deep bins of dazzling, disorienting candy, much of which is currently out of our reach. Such is the horizon of new knowledge, equal parts frustrating and tantalizing.