02.05.2015 Views

Genome-Enabled Insights into Legume Biology - University of ...


Annu. Rev. Plant Biol. 2012.63:283-305. Downloaded from www.annualreviews.org by University of Minnesota - Twin Cities - Wilson Library on 05/07/12. For personal use only.

accessions. This had been suggested in previous genetic experiments that found biased segregation ratios involving crosses with A17 (43), but the sequencing project was able to pinpoint two breakpoints on chromosomes 4 and 8 to regions roughly the size of BAC clones.

The Lj genome was published in 2008 (79) and was actually the first legume genome to appear, though it is still the most incomplete. As in Mt, the strategy was to focus on gene-rich portions of the genome through the sequencing of large insert clones (in this case, so-called transformation-competent artificial chromosomes). The published Lj genome sequence is 315 Mb in length, corresponding to 67% of the Lj genome (472 Mb), but only 130 Mb is high quality and anchored to chromosomes. A more recent version of the Lj genome sequence is now available through the Web site of the lead sequencing group in Kazusa, Japan (ftp://ftp.kazusa.or.jp/pub/lotus/lotus_r2.5/pseudomolecule), and it provides a much more robust platform for Lj genomics. This updated version (Lj 2.5) contains anchored pseudomolecules 268 Mb in length throughout the euchromatic portion of Lj plus 33 Mb of sequence as yet unanchored.
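The coverage figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and it assumes that the 472 Mb genome-size estimate applies equally to the published assembly and to Lj 2.5, which the text does not state explicitly.

```python
# Figures taken from the text (all in Mb); names are illustrative only.
LJ_GENOME_MB = 472          # estimated Lj genome size
PUBLISHED_MB = 315          # published (2008) assembly length
PUBLISHED_ANCHORED_MB = 130 # high-quality, chromosome-anchored portion
LJ25_ANCHORED_MB = 268      # Lj 2.5 anchored pseudomolecules
LJ25_UNANCHORED_MB = 33     # Lj 2.5 sequence not yet anchored


def percent_of_genome(length_mb, genome_mb=LJ_GENOME_MB):
    """Return length_mb as a percentage of the assumed genome size."""
    return 100.0 * length_mb / genome_mb


published_pct = percent_of_genome(PUBLISHED_MB)
published_anchored_pct = percent_of_genome(PUBLISHED_ANCHORED_MB)
lj25_total_mb = LJ25_ANCHORED_MB + LJ25_UNANCHORED_MB
lj25_anchored_pct = percent_of_genome(LJ25_ANCHORED_MB)

print(f"Published assembly: {published_pct:.0f}% of genome")
print(f"Published, anchored: {published_anchored_pct:.1f}% of genome")
print(f"Lj 2.5 total: {lj25_total_mb} Mb")
print(f"Lj 2.5, anchored: {lj25_anchored_pct:.1f}% of genome")
```

Under that assumption, the anchored fraction roughly doubles between the published assembly and Lj 2.5, which is one way to read the claim that the newer release is a more robust platform.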

What Can We Learn from Sequenced Legume Genomes?

What have we learned about legume genomes from this first generation of sequencing projects? In the broadest sense, sequenced legume genomes look very much like those of other dicots, though comparisons with Arabidopsis can be complicated by its unusually small genome size and complex duplication history (3). A closer look at the Gm genome finds that ∼57% of the overall sequence is found in repeat-rich, low-recombination heterochromatin, while most genes (78%) are found in euchromatic chromosome arms (81). Of course, this also implies that substantial numbers of Gm genes (22%) lie within the pericentromeric heterochromatin, a somewhat surprising and potentially important result. As expected, crossovers are profoundly reduced near centromeres, with the ratio of genetic to physical distance dropping by 27-fold between the euchromatic and pericentromeric portions of the genome. Genome organization in Mt seems largely comparable, though the evidence for this is based on a combination of the BAC-based euchromatin sequence, FISH microscopy, and optical mapping (100). Notably, the estimated proportion of the genome located in pericentromeres is much lower in Mt compared with Gm (∼22% versus ∼57%), something that presumably plays a role in the difference in genome size. In both Gm and Mt, gene density is generally high throughout euchromatic arms, with only limited indications of a gene density gradient rising from centromere to telomere. In Mt, for example, the gene density is estimated at 16.9 per 100 kb (1 gene every 5.9 kb) throughout the euchromatin, with the average gene being 2,211 bp in length and containing four introns. By way of comparison, Mt values are similar to those in Arabidopsis (2,174 bp) and Oryza (3,403 bp).
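The two ways of stating Mt gene density above (16.9 genes per 100 kb, or 1 gene every 5.9 kb) are reciprocal, as the short sketch below confirms. It also derives an approximate genic fraction of the euchromatin, a figure that does not appear in the text; that derivation assumes genes do not overlap and that the 2,211 bp average applies uniformly, so it should be read as a rough illustration only. All names are hypothetical.

```python
# Figures taken from the text; names are illustrative only.
MT_GENES_PER_100KB = 16.9  # Mt euchromatic gene density
MT_MEAN_GENE_BP = 2211     # Mt average gene length


def kb_per_gene(genes_per_100kb):
    """Convert a density in genes per 100 kb to average spacing in kb."""
    return 100.0 / genes_per_100kb


def genic_fraction(genes_per_100kb, mean_gene_bp):
    """Approximate fraction of sequence occupied by genes.

    Assumes non-overlapping genes of uniform average length, which is
    an illustrative simplification rather than a claim from the text.
    """
    genic_bp_per_100kb = genes_per_100kb * mean_gene_bp
    return genic_bp_per_100kb / 100_000


spacing_kb = kb_per_gene(MT_GENES_PER_100KB)
mt_genic_fraction = genic_fraction(MT_GENES_PER_100KB, MT_MEAN_GENE_BP)

print(f"Average spacing: {spacing_kb:.1f} kb per gene")
print(f"Approximate genic fraction: {100 * mt_genic_fraction:.1f}%")
```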

Altogether, the Gm genome is reported to have 46,430 “high-confidence” protein-coding loci, which represents a culled set of gene models from an original set that included ∼20,000 predicted with lower confidence (81). In Mt, a total of 62,152 genes were annotated, a value that drops to 47,845 when retaining only those genes with experimental or database support. The similarity in gene counts between the two systems is surprising and significant, because the lineage leading to present-day soybean is known to have undergone a whole-genome duplication (WGD) at 13 Mya or later, a duplication that is absent in the Mt lineage (there is much more about this important evolutionary event below). Thus, one might have expected higher gene numbers in Gm compared with Mt. The Gm genome is also reported to have 313,125 retrotransposons and 294,937 DNA transposons (spanning 403 Mb and 157 Mb, respectively), whereas the Mt genome has 253,048 retrotransposons and 34,529 DNA transposons (spanning 88 Mb

286 Young·Bharti
