09.07.2015 Views

Ontology engineering


news and views

© 2010 Nature America, Inc. All rights reserved.

Figure 1 Pipeline for discovery of cis-regulatory modules involved in mesoderm specification. ChIP-chip assays provide genome-wide occupancy information for each of five relevant transcription factors at five different temporal stages of embryonic development. Clusters of ChIP peaks are designated as candidate cis-regulatory modules. Transcription factor occupancy profiles are generated for each candidate module (left). The same ChIP-chip data are used to generate occupancy profiles of previously identified cis-regulatory modules (right). These profiles, together with experimentally determined expression patterns driven by each module, which are curated from the literature, are used to train a support vector machine classifier. The classifier is used to predict the expression pattern (visceral muscle in this example) driven by the candidate cis-regulatory module. The prediction is verified in vivo by a transgenic reporter assay. Reporter results reprinted from ref. 2, with permission of the authors.

[Figure 1 labels: transcription factors Twi, Tin, Mef2, Bin, Bap; temporal stages 5–7, 8–9, 10–11, 12–13, 14–15; expression patterns mesoderm (M), somatic muscle (SM), visceral muscle (VM), mesoderm and somatic muscle (MSM), mesoderm and visceral muscle (MVM), others.]

... to involve whole-genome assays of chromatin state, such as nucleosome occupancy or various histone modifications [4].

There are some practical considerations in applying the proposed strategy more broadly. First, the method relies on prior knowledge of all relevant transcription factors, which in the case of mesoderm specification was available from extensive prior work. For studies of other regulatory networks, this requirement might be mitigated using existing statistical techniques [7] that identify binding sites overrepresented in known cis-regulatory modules of the network, thus inferring the relevant transcription factors. Second, the model has a 'training phase' that requires expression measurements on a large number of cis-regulatory modules; the authors used 139 modules with previously characterized expression in mesoderm and/or muscle. Such data are not available for most regulatory systems and are difficult to generate. On the other hand, gene expression patterns are easier to come by [8]; thus, adapting the authors' approach to work with gene, rather than module, expression patterns as training data would go a long way toward ensuring broader application.

The new method may also be useful in synthetic biology. Whether for ab initio design of a sequence that drives a desired tissue-specific pattern [9] or for the refinement of an existing sequence to be used in a synthetic circuit [10], the utility of quantitative models of expression is well recognized. The working model proposed here could help to identify several endogenous sequences with the same regulatory function and could even suggest the variants (by specifying targets of mutation) that are best suited for the specific engineering goal.

As genome-wide assays of transcription factor–DNA binding become more common, tools that interpret the resulting data to elucidate combinatorial gene regulation will be needed. The study by Zinzen et al. [2] offers an innovative approach to building such tools and sets the stage for more in-depth explorations of regulatory networks.

COMPETING INTERESTS STATEMENT
The authors declare no competing financial interests.

1. Davidson, E.H. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution (Academic Press, 2006).
2. Zinzen, R.P., Girardot, C., Gagneur, J., Braun, M. & Furlong, E.E. Nature 462, 65–70 (2009).
3. Janssens, H. et al. Nat. Genet. 38, 1159–1165 (2006).
4. Segal, E. & Widom, J. Nat. Rev. Genet. 10, 443–456 (2009).
5. Beer, M.A. & Tavazoie, S. Cell 117, 185–198 (2004).
6. Arnosti, D.N. & Kulkarni, M.M. J. Cell. Biochem. 94, 890–898 (2005).
7. Warner, J.B. et al. Nat. Methods 5, 347–353 (2008).
8. Tomancak, P. et al. Genome Biol. 8, R145 (2007).
9. Venter, M. Trends Plant Sci. 12, 118–124 (2007).
10. Haseltine, E.L. & Arnold, F.H. Annu. Rev. Biophys. Biomol. Struct. 36, 1–19 (2007).

Nature Biotechnology volume 28 number 2 February 2010 143
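
The classification step described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation. The occupancy values, the labels, and the helper names (make_profile, train_linear_svm, predict) are all invented for illustration. The problem is reduced to two classes (visceral muscle versus other), and a linear support vector machine is fit by plain subgradient descent on the regularised hinge loss, without a bias term. The kernel, features, and training procedure used in the study itself are not given in this text.

```python
# Toy sketch of the Figure 1 idea: occupancy profile -> SVM -> expression class.
# Every number below is made up for illustration; none comes from the study.

TFS = ["Twi", "Tin", "Mef2", "Bin", "Bap"]             # five transcription factors
STAGES = ["5-7", "8-9", "10-11", "12-13", "14-15"]     # five temporal stages


def make_profile(bound):
    """Flatten a {(tf, stage): signal} mapping into a 25-element vector."""
    return [bound.get((tf, stage), 0.0) for tf in TFS for stage in STAGES]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100):
    """Subgradient descent on lam/2 * |w|^2 + hinge loss; labels are +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * dot(w, xi) < 1.0           # inside the margin?
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularisation shrink
            if violated:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(w, profile):
    """Return the toy class label for one occupancy profile."""
    return "VM" if dot(w, profile) > 0.0 else "other"


# Hypothetical "known modules": +1 = visceral muscle (VM), -1 = anything else.
known = [
    ({("Bin", "10-11"): 1.0, ("Bap", "10-11"): 1.0, ("Bin", "12-13"): 1.0}, +1),
    ({("Twi", "5-7"): 1.0, ("Mef2", "8-9"): 1.0}, -1),
    ({("Bin", "10-11"): 1.0, ("Bap", "12-13"): 1.0}, +1),
    ({("Twi", "5-7"): 1.0, ("Twi", "8-9"): 1.0, ("Mef2", "10-11"): 1.0}, -1),
    ({("Bap", "10-11"): 1.0, ("Bap", "12-13"): 1.0, ("Bin", "14-15"): 1.0}, +1),
    ({("Mef2", "8-9"): 1.0, ("Mef2", "12-13"): 1.0}, -1),
]
X = [make_profile(bound) for bound, _ in known]
y = [label for _, label in known]
w = train_linear_svm(X, y)

# Hypothetical candidate module, classified from its occupancy profile alone.
candidate = make_profile({("Bin", "10-11"): 1.0, ("Bin", "12-13"): 1.0,
                          ("Bap", "12-13"): 1.0})
print(predict(w, candidate))
```

In this toy data the two classes occupy disjoint features, so the classifier separates them trivially. Real occupancy profiles are continuous and overlapping, and the figure distinguishes several expression classes rather than two.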
