The Genom of Homo sapiens.pdf
Systems Approaches Applied to the Study of Saccharomyces cerevisiae and Halobacterium sp.

A.D. WESTON,* N.S. BALIGA,* R. BONNEAU,* AND L. HOOD
Institute for Systems Biology, Seattle, Washington 98103-8904

Integrative systems approaches to studying biological systems have begun to yield striking results. Systems biology has emerged as a powerful new approach over the past 5 years because of (1) the completion of the human and many other genome sequences, which led to the identification and prediction of comprehensive gene lists; (2) the development of high-throughput techniques for genomics, proteomics, metabolomics, and phenomics, leading to the acquisition of global data sets; and (3) the creation of powerful computational methods for storing and assessing different types of global data sets as well as analyzing and integrating them. The integration of different data types is essential because it helps to deal with noise that is inherent in large data sets and it reveals new biological phenomena that are not obvious from the analysis of single data types. Here we summarize the initial applications of systems biology to two systems: one, which has been studied intensively for years (the galactose utilization system in yeast), and a second (the oxygen and light responses in Halobacterium sp.) for which little data are available. Systems approaches have allowed us to gain new and fundamental insights into both systems (Ideker et al. 2001; Baliga et al. 2002).

A key aspect of systems biology is viewing biology as an informational science.
First, there are two general types of biological information: the digital information of the genome, and environmental information that interacts directly or indirectly with the digital genomic information. Second, the genome has two major types of digital information: the genes that encode protein and RNA molecular machines, and the transcription factor-binding sites in cis control regions of genes that create the linkage relationships for gene regulatory networks which control the temporal and spatial parameters of gene expression as well as their amplitude. Proteins may act alone, in loose functional relationships or biomodules (e.g., the enzymes of sugar metabolism), in complex protein machines (e.g., the ribosome), or in larger protein networks. Transcription factors, co-transcription factors, and the factors that mediate changes in chromatin structure all operate via DNA-binding sites in cis control regions of genes to regulate gene expression. The protein and gene regulatory networks, although conceptually distinct, obviously are functionally integrated with one another. Third, it should be stressed that biological information is of many different types, starting with the core DNA genomic information and moving out to ecologies (DNA → RNA → protein → protein structures and biomodules → networks of biomodules → cells → organs [tissues] → individual organisms → populations of individual organisms → ecologies), and that environmental signals increasingly modify the basic digital information as one moves outward in this information hierarchy. Therefore, to understand systems, one must have the tools for global measurements of as many information types as possible and the ability to integrate these different types of information.
Finally, biological information operates across three distinct time dimensions: evolution, development, and physiology. Accordingly, the patterns of genomic content and organization change across evolution as the patterns of gene expression change across development or throughout a physiological response. Indeed, the informational content of cells or organisms may be viewed as a series of snapshots of changing patterns of information expression.

The systems approach may generally be described as follows:

• A biological system is chosen and all preexisting relevant information is integrated into a model that may be descriptive, graphical, or mathematical.
• A global analysis of the systems elements is carried out. Generally this begins with a genome sequence. Genes (and their corresponding proteins) and transcription factor-binding sites may be cataloged, predicted computationally, or experimentally identified. These data sets are the initial building blocks for constructing protein and gene regulatory networks.
• The system is then perturbed genetically or environmentally, and global data sets are collected from as many different data types as possible. Genetic perturbations include overexpression, underexpression, or knockouts. These data sets are generally collected under steady-state conditions. Environmental perturbations may include, for example, the introduction of substrates that activate metabolic pathways, and hormones that trigger signal transduction pathways. Here, kinetic experiments are required so that the patterns of information change across development or physiological triggering can be characterized.
• The different data types must be integrated (see examples below) and then compared against the initial model.
Discrepancies between data and model will emerge. Hypothesis-driven formulations must explain

*These authors contributed equally to this paper.

Cold Spring Harbor Symposia on Quantitative Biology, Volume LXVIII. © 2003 Cold Spring Harbor Laboratory Press 0-87969-709-1/04. 345