13.07.2015 Views

The Genom of Homo sapiens.pdf


Annotation of Novel Proteins Utilizing A Functional Genome Shotgun Coupled with High-Throughput Protein Interaction Mapping

J.A. MALEK,* J.M. WIERZBOWSKI,* G.A. DASCH,† M.E. EREMEVA,‡ P.J. MCEWAN,* AND K.J. MCKERNAN*

*Agencourt Bioscience Corporation, Beverly, Massachusetts 01915; †Centers for Disease Control and Prevention, Atlanta, Georgia 30332; and ‡University of Maryland, Baltimore, Maryland 21201

It is quoted frequently that the amount of genomic sequence data is increasing at a tremendous rate while functional methods for studying proteins have not kept pace. Many groups have attempted to address this issue by use of microarrays, yeast two-hybrid screens, and protein complex purification with subsequent identification. The underlying theme of these approaches is their use of “guilt-by-association” (Oliver 2000) methods for annotation of proteins of unknown function. The association of a protein of unknown function with proteins of known function is used to derive a potential function for the protein of unknown function. Although large-scale microarray experiments have increased dramatically and are carried out in numerous large and small laboratories, proteome-wide two-hybrid experiments have only been carried out on yeast (Uetz et al. 2000; Ito et al. 2001) with some large studies in Caenorhabditis elegans (Walhout et al. 2000), and Helicobacter pylori (Rain et al. 2001), among others. Numerous review papers have been written on these large-scale two-hybrid studies, analyzing the data, testing the data’s validity, and using the data to train in silico protein interaction prediction software. The need for further, validated, protein interaction information is clear.
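The guilt-by-association idea described above can be sketched in a few lines of code. The example below is a minimal illustration, not the authors' method: the function name `annotate_by_association`, the protein identifiers, and the function labels are all hypothetical, and the scoring (a simple count of partners carrying each annotation) is an assumption made for clarity.

```python
from collections import Counter


def annotate_by_association(interactions, known_functions, protein):
    """Rank candidate functions for `protein` by the number of its
    interaction partners that carry each known function."""
    votes = Counter()
    for a, b in interactions:
        if protein not in (a, b):
            continue  # this pair does not involve the query protein
        partner = b if a == protein else a
        if partner == protein:
            continue  # a self-interaction says nothing about partners
        for function in known_functions.get(partner, ()):
            votes[function] += 1
    return votes.most_common()


# Toy data (hypothetical identifiers and annotations, not from the paper).
interactions = [
    ("unknown_1", "known_A"),
    ("unknown_1", "known_B"),
    ("known_C", "unknown_1"),
    ("unknown_1", "known_D"),
    ("known_A", "known_C"),
]
known_functions = {
    "known_A": ["DNA repair"],
    "known_B": ["DNA repair"],
    "known_C": ["DNA repair", "transcription regulation"],
    "known_D": ["cell division"],
}

ranking = annotate_by_association(interactions, known_functions, "unknown_1")
print(ranking)
```

In this toy network the most frequent annotation among the partners of `unknown_1` is "DNA repair", which would be proposed as its potential function. A real analysis would also need to weigh how reliable each reported interaction is.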
Among the challenges in generating proteome-wide interaction data are the lack of fully automated processes, the sheer amount of screening necessary to complete one map for one organism, and an incomplete grasp of what constitutes a true, physiologically important protein interaction. It is our belief that using comparative interaction data will allow deciphering of what interactions are physiologically valid. Validation of interactions has centered around comparisons to databases of individually obtained and presumably more verified interactions, the presence of interactions among proteins with similar expression profiles, and the frequency of interactions among proteins sharing similar biological processes and/or cellular compartments (Deane et al. 2002). Although these methods of verification may add a level of significance to any interaction, their absence should not per se be used to subtract from an interaction’s validity. Observing similar expression profiles between two proteins may suggest they are functionally related but does not mean that they physically interact. Observing interactions among proteins of different biological processes may reveal a gap in our knowledge more than an incorrect interaction. Observation of interactions among proteins from different cellular compartments is less meaningful in organelle-free microbes. Physiologically significant interactions with a wide range of strengths have been observed; therefore, a significance cutoff based on interaction strength cannot be set at present.

Automated DNA sequencing technology was heavily developed during the Human Genome Project into a robust and cheap process.
It would be of benefit to use these developments in advancing proteome-wide interaction data. We have attempted to improve the ease with which such studies can be carried out by adopting a strategy that relies on whole-genome shotgun sequencing and a bacterial two-hybrid system (Dove et al. 1997; Shaywitz et al. 2000). The whole-genome shotgun method generates cloned overlapping fragments of genomic DNA which, if cloned in the proper orientation and frame, can be expressed as a protein. The use of peptide fragments rather than full-length proteins has been shown to reduce false negatives (Ward et al. 2002) while offering the opportunity to localize the domain of a protein responsible for an interaction. Use of the bacterial two-hybrid system allows integration into standard sequencing pipelines. The two vectors used in the system are standard sequencing vectors that are transformed together into an essentially standard cloning strain of Escherichia coli. The system, similar to various yeast two-hybrid systems, relies on recruitment of transcriptional machinery to promoters upstream of reporter genes. Briefly, a protein of interest is fused to the λcI protein which binds a λ operator on the reporter construct (Fig. 1a). A second protein of interest is fused to the RNA polymerase α-subunit. An interaction between the proteins of interest stabilizes the transcriptional machinery at a weak promoter upstream of the reporter construct (Fig. 1b). Interactions are observed as a colony able to grow in the presence of an antibiotic and the absence of any carbon source other than lactose. Colonies can enter a standard sequencing pipeline at this point through the automated colony pickers.
Sequencing of the bait or prey fragment is conducted with primers specific for either vector.

Cold Spring Harbor Symposia on Quantitative Biology, Volume LXVIII. © 2003 Cold Spring Harbor Laboratory Press 0-87969-709-1/04. 331
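The requirement that a shotgun fragment be cloned in the proper orientation and frame before it can be expressed as a protein can be illustrated with a short sketch. The function `fragment_in_frame`, the coordinate convention, and the toy gene below are hypothetical, and the sketch assumes that translation of the fusion begins at the first base of the insert; the actual vector junctions used in the study are not described here.

```python
def fragment_in_frame(gene_start, gene_end, gene_strand,
                      frag_start, frag_end, frag_strand):
    """Return True if a cloned fragment would be read in the gene's frame.

    Coordinates are 0-based, half-open, and given on the forward strand.
    Assumption: translation starts at the first base of the insert.
    """
    if frag_strand != gene_strand:
        return False  # wrong orientation: the antisense strand would be read
    if gene_strand == "+":
        first_base = frag_start
        offset = first_base - gene_start
    else:
        # On the reverse strand, reading starts at the highest coordinate.
        first_base = frag_end - 1
        offset = (gene_end - 1) - first_base
    inside = gene_start <= first_base < gene_end
    return inside and offset % 3 == 0


# Toy gene of 300 bp on the forward strand (hypothetical coordinates).
GENE = (0, 300, "+")
FRAG_LEN = 90

# Every possible fragment start inside the gene, in both orientations.
candidates = [
    (start, start + FRAG_LEN, strand)
    for start in range(0, 300)
    for strand in ("+", "-")
]
usable = [f for f in candidates if fragment_in_frame(*GENE, *f)]
print(len(usable), "of", len(candidates), "fragments are in frame")
```

Under these assumptions roughly one fragment in six is expressible (two orientations times three frames), which is one reason a shotgun library needs many overlapping fragments to cover each gene.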
