13.07.2015 Views

The Genom of Homo sapiens.pdf

GENETIC CONTROL OF TRANSCRIPTION 113

nism; a variant TF, for example, might influence the expression of all genes within the regulon. These can be detected in our experiments if we observe an SDP in the trans analysis which appears to be associated with variation in the levels of mRNA in multiple genes that do not map to this SDP.

We identified two SDPs that appeared to be influencing 12 different genes. The target genes of one such case include Slc25a1 (solute carrier family 25 member 1), Xrn1 (5´-3´ exoribonuclease 1), Xpnpep1 (X-prolyl aminopeptidase P1, soluble), Ctps2 (cytidine 5´-triphosphate synthase 2), BG065446 (homology with RNA polymerase 1–3 16-kD subunit), C77518 (homology with PTPL-1-associated Rho-GAP), D7Ertd59e (an expressed marker), as well as five proteins of unknown function. Our working hypothesis is that the shared SDP defines a region that contains a variant transcription effector that is modulating the behavior of sets of genes that comprise a regulon (a set of coregulated genes). The expectation, given that our data set contains 412 matches, is that any one SDP will appear 412/800 times: Thus, a single SDP appearing 12 times is unusual. The SDPs define a region of chromosome 14 that spans 7 Mb. Using the UCSC genome browser, we have identified two potential transcription factors within the region, Dnase1l3 and 2610511E03Rik (Rnase P-related). 2610511E03Rik contains one replacement variation (Met132Val) in C57BL/6 compared to DBA/2J in the Celera database: The other protein contains only silent/noncoding variants. This leads us to the working hypothesis that variation in 2610511E03Rik is the cause of variation in the target genes.
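To make the "412/800" expectation concrete, the following is a minimal sketch of how unusual 12 matches on one SDP would be under a simple null model. It assumes, as our own reading rather than anything stated in the text, that 800 is the number of distinct SDPs and that the 412 matches fall independently and uniformly across them, so that the count per SDP is approximately Poisson. The function name `poisson_upper_tail` and the constants are illustrative and are not taken from the authors' analysis.

```python
import math


def poisson_upper_tail(k: int, mean: float, terms: int = 200) -> float:
    """Return P(X >= k) for X ~ Poisson(mean), summing the tail directly."""
    # Work in log space so that large factorials do not overflow.
    return sum(
        math.exp(-mean + i * math.log(mean) - math.lgamma(i + 1))
        for i in range(k, k + terms)
    )


N_MATCHES = 412  # trans matches reported in the text
N_SDPS = 800     # assumption: number of distinct SDPs (denominator of 412/800)

# Expected number of matches falling on any single SDP under the null model.
expected_per_sdp = N_MATCHES / N_SDPS

# Probability that one given SDP collects 12 or more matches by chance.
p_twelve_or_more = poisson_upper_tail(12, expected_per_sdp)

# Expected number of SDPs (out of all of them) reaching 12 or more matches.
expected_sdps_with_twelve = N_SDPS * p_twelve_or_more

print(f"expected matches per SDP: {expected_per_sdp:.3f}")
print(f"P(one SDP has >= 12 matches): {p_twelve_or_more:.2e}")
print(f"expected SDPs with >= 12 matches: {expected_sdps_with_twelve:.2e}")
```

Summing the upper tail directly, rather than computing one minus the cumulative probability, avoids losing precision when the tail probability is very small. Linked SDPs and correlated expression would violate the independence assumption, so this is only a rough guide to the scale of the effect.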
In addition, we have identified one case of an SDP influencing 9 genes, an SDP influencing 7 genes, two cases of an SDP influencing 6 genes, three cases of an SDP influencing 5 genes, fourteen cases of an SDP influencing 3 genes, and one hundred and twelve cases of an SDP influencing 2 genes, and these are being analyzed intensively. The existence of identifiable regulons partially explains the frequency of trans influence discussed above. We believe these preliminary data are graphic illustrations of the potential of our analytic and experimental analyses.

CONCLUSION

We have developed a powerful system for analyzing genetic variation and its influence on mRNA levels. Our approach is readily comparable to that of Schadt et al. (2003), who used a backcross between C57BL/6 and DBA/2J to analyze mRNA level variation. Such animals are either homozygous or heterozygous for variations, in contrast to RI strains which are homozygous, and this may account for the greater variability seen in mRNA levels in their analyses, where 33% of genes appeared to be differentially expressed within the progeny. Further analysis will resolve this issue.

An important distinction between the use of RI or backcross mice is that the RI lines are genetically stable and can be bred at will. We believe that the ability to tailor genotypes by selection of RI lines and other strains reinforces, once again, the great power of the mouse as a genetic model for human variation, because establishing the relationship of quantitative mRNA variation to ultimate phenotype is not simple.
The parental C57BL/6 and DBA/2J mice differ significantly in many physical, biochemical, and behavioral respects (Festig 1998), and these data in principle can be related to underlying genetic variations. In practice, until we have a better idea of the specific effectors and genes, it is difficult to define testable hypotheses.

It is possible to extend the approach we have developed here to humans. Unlike RI lines, humans are frequently heterozygous for variations, outbred, susceptible to environmental influence, and not a ready source of tissue. Despite these reservations, we believe it will be possible to carry out preliminary experiments on tissue mRNAs isolated from extended human families; specifically, the three-generation “reference” CEPH families that have been very extensively typed using microsatellite markers as part of the Human Genome Project. These data, publicly available at http://lpg.nci.nih.gov/CHLC/, enable us to calculate the parental origin of any genomic region, which in turn enables us to construct the equivalent of an SDP. This will be the “parent of origin distribution pattern” or PODP of the gene in the family.
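As an illustration of how a PODP might be assembled from marker data, here is a minimal sketch for a single microsatellite marker in one three-generation family. It assumes the phase of the transmitting parent is already known (which allele came from which grandparent) and records, for each child, the grandparental origin of the allele received from that parent. The function names, the `"GF"`/`"GM"` labels, and the allele sizes are hypothetical and chosen only for this example; they are not taken from the CEPH data or from the authors' procedure.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Genotype = Tuple[str, str]  # two alleles at one marker, e.g. repeat sizes


def inherited_origin(
    child: Genotype,
    phase: Dict[str, str],
    other_parent: Genotype,
) -> Optional[str]:
    """Return the grandparental origin of the allele the child received
    from the phased parent, or None if the marker is uninformative."""
    a, b = child
    origins = set()
    # Try both ways of splitting the child's alleles between the parents.
    for from_parent, from_other in ((a, b), (b, a)):
        if from_parent in phase and from_other in other_parent:
            origins.add(phase[from_parent])
    # Informative only if every consistent split gives the same origin.
    return origins.pop() if len(origins) == 1 else None


def build_podp(
    children: Sequence[Genotype],
    phase: Dict[str, str],
    other_parent: Genotype,
) -> List[Optional[str]]:
    """PODP for one marker: one origin label (or None) per child."""
    return [inherited_origin(c, phase, other_parent) for c in children]


# Hypothetical family: the phased parent carries allele "152" from the
# grandfather (GF) and "158" from the grandmother (GM).
phase = {"152": "GF", "158": "GM"}
other_parent = ("160", "160")
children = [("152", "160"), ("160", "158"), ("152", "160")]

podp = build_podp(children, phase, other_parent)
print(podp)
```

A vector of this kind, with one entry per child, plays the role that the SDP plays across RI lines. In practice several flanking markers would be combined so that uninformative markers (returned here as `None`) can be filled in from their neighbors.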
Analogous to our mouse experiments, concordance or discordance of expression levels with the PODP will indicate cis or trans influence on expression levels.

Confounding this experiment are numerous nongenetic factors associated with the intrinsic variability of transformed lymphoblastoid cell lines, but these influences are not expected to follow Mendelian patterns of inheritance, and frank genetic signal should in principle be isolable by the analysis we have proposed.

Our observation of a significant amount of variation within the machinery that controls transcription, allied with our preliminary data and that of Schadt et al. (2003), leads us to propose a new class of project. We suggest that identifying sequence variation in this machinery, in any organism, will provide more significant insights into the molecular basis of phenotypic variation than conventional candidate gene approaches based on more limited physiological function.

ACKNOWLEDGMENTS

This work was carried out with the support of a start-up grant from the University of New South Wales and with the aid of Australian Postgraduate Award scholarships to E.C. and M.K. We are indebted to the staff of the Clive and Vera Ramaciotti Centre for Gene Function Analysis for provision of mouse microarrays and to Matt Wand, William Dunsmuir, and David Nott (School of Mathematics at UNSW) for a continuing collaboration on statistical analysis of our data.

REFERENCES

Bennett S.T., Lucassen A.M., Gough S.C., Powell E.E., Undlien D.E., Pritchard L.E., Merriman M.E., Kawaguchi Y., Dronsfield M.J., and Pociot F. 1995. Susceptibility to human type 1 diabetes at IDDM2 is determined by tandem re-
