13.07.2015 Views

The Genom of Homo sapiens.pdf


MOUSE GENOME ENCYCLOPEDIA PROJECT

One remarkable finding of the FANTOM2 research is that the genome of higher organisms, including humans and mice, can encode much more information than suggested by the number of TUs (genes). To control such enormously complex biological systems, dynamic variation is employed at the transcriptional level, since genomic information itself is very static. In our database, we found an unexpectedly large number of 5´- and 3´-end variations and alternatively spliced sequences. Variation at the 5´ end is mainly due to differences in gene promoters, which are regulated in a tissue- and stage-specific manner. 3´-end variation is so common that it is also very likely to be functional. When classified by 3´-end sequence, our original 1,916,592 clones fell into 188,000 clusters. When analyzed in terms of complexity, these 188,000 3´-end variations are equivalent to ~60,000 TUs. Each TU therefore has more than three 3´-end variations on average. In the 3´-end untranslated region (UTR), many consensus sequences with clear functions are known. One good example is the small stretch “UUAUUUAUU.” When included in a 3´-end variation, this motif appears to be the target attacked by endonucleases. mRNAs with the “UUAUUUAUU” motif are degraded very rapidly, and the half-life of such transcripts is very short, whereas mRNAs without this motif have a long half-life. 3´-end variation, therefore, is clearly functional.

Living cells control the selection and ordering of exons using tissue- and stage-specific splicing machinery. Our clones were collected using normalization and subtraction methods to cover a greater variety of TUs. If our subtraction system had worked perfectly, we would not have collected any alternatively spliced transcripts.
However, the system is leaky; given the large-scale collection of FL cDNA, plenty of alternatively spliced transcripts were isolated in FANTOM2 analysis. Surprisingly, more than 41% of all TUs encode alternatively spliced forms, and 79% of alternatively spliced TUs alter the amino acid sequences of CDSs. Thus, the number of transcripts and proteins is much larger than the number of TUs (genes). This mechanism allows living cells to expand their genomic information to finely control their life processes.

THE DISTRIBUTION SYSTEM FOR THE POST-GENOME RESOURCE (DNABOOK)

The Riken Mouse Genome Encyclopedia Project produced two major resources: the transcriptome database and the mouse FL cDNA clone bank. These two resources are key platforms for use in post-genome and post-transcriptome life science. The distribution of these resources is therefore very important in facilitating research in the 21st century. The FANTOM2 database was published on the Internet on December 5, 2002. However, at that point, the clone bank of 60,770 FL cDNA clones still had to be shipped in a box with 100 kg of dry ice, a very inconvenient and tedious process.

Figure 7. The cover design represents full-length cDNA analyzed by RIKEN Integrated Sequence Analyzer (RISA) and FANTOM computer system. Designed by Dr. Chie Owa.

We have solved this problem by developing a new technology for the distribution of FL cDNA clones (Hayashizaki and Kawai 2003; Hayashizaki et al. 2003; Kawai and Hayashizaki 2003). The DNA is printed onto paper sheets and may be shipped as a bound “DNABook.” This idea was first conceived at the beginning of this project.
Figure 7 shows the cover of the first edition of the Riken Mouse Encyclopedia DNABook; this design has been used from the start of this project as the title slide in our presentations. The design illustrates the following: All of the FL cDNAs were fed into the RISA system, many sequences have come out of RISA, and the database was subsequently established on computer hard disk. An image of the hard-copy encyclopedia “book” is also included in this figure.

We tested the performance of the DNA printed sheet under various conditions of temperature, humidity, pressure, and other environmental factors, as well as its resistance to scratching or touching. Under normal conditions, we believe it possible for cDNA clones to be maintained on the paper sheet for at least several years.

The DNABook has the potential to significantly improve the processes of storing and shipping genome resources. Traditionally, a large dry-ice box for shipping and a large freezer at –80°C for storage are required. Shipping and handling times are typically one or two weeks. The DNABook, however, can be shipped as easily as printed material. Once a user has the encyclopedia DNABook, clones may be obtained by simply punching out the relevant spots and subjecting them to PCR. Clones are available for experiments within two hours. One book can replace a freezer full of clones.

In the pre-genome era (before about 1995), we man-

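The counts quoted in the FANTOM2 discussion above (188,000 3´-end clusters, ~60,000 TUs, more than 41% of TUs alternatively spliced, 79% of those altering the CDS) and the “UUAUUUAUU” motif can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are hypothetical, the combined splicing fraction is derived here rather than stated in the text, and the motif scan is a plain string search, not the authors' analysis pipeline.

```python
# Figures quoted in the text; the TU count is approximate ("~60,000")
# and the spliced fraction is a lower bound ("more than 41%").
THREE_PRIME_CLUSTERS = 188_000
TRANSCRIPTIONAL_UNITS = 60_000
FRACTION_TUS_SPLICED = 0.41
FRACTION_SPLICED_ALTERING_CDS = 0.79

# Motif named in the text as a target for rapid mRNA degradation.
DESTABILIZING_MOTIF = "UUAUUUAUU"


def mean_three_prime_variants(clusters: int, tus: int) -> float:
    """Average number of 3'-end variations per TU."""
    return clusters / tus


def fraction_tus_with_altered_cds(spliced: float, altering: float) -> float:
    """Share of all TUs whose alternative splicing changes the CDS.

    Derived by multiplying the two quoted percentages; this product
    is an illustration and does not appear in the source text.
    """
    return spliced * altering


def find_motif(utr: str, motif: str = DESTABILIZING_MOTIF) -> list[int]:
    """Return 0-based start positions of every occurrence of `motif`.

    Overlapping occurrences are reported. cDNA input (with T) is
    converted to RNA notation (with U) before searching.
    """
    seq = utr.upper().replace("T", "U")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    ratio = mean_three_prime_variants(THREE_PRIME_CLUSTERS, TRANSCRIPTIONAL_UNITS)
    print(f"3'-end variations per TU: {ratio:.2f}")

    altered = fraction_tus_with_altered_cds(
        FRACTION_TUS_SPLICED, FRACTION_SPLICED_ALTERING_CDS
    )
    print(f"TUs with splice-altered CDS: at least {altered:.0%}")

    # Made-up 3' UTR fragment, used only to exercise the search.
    print(find_motif("GGUUAUUUAUUCC"))
```

The ratio comes out slightly above three, matching the statement that each TU has more than three 3´-end variations on average. Multiplying the two splicing percentages suggests that roughly a third of all TUs have a coding sequence changed by alternative splicing, under the assumption that the 79% applies uniformly to the spliced TUs.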