09.12.2012 Views

Principles of Plant Genetics and Breeding

Principles of Plant Genetics and Breeding

Principles of Plant Genetics and Breeding

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Analysis<br />

BIOTECHNOLOGY IN PLANT BREEDING 241<br />

Two evolutionary concepts underlie the inferences from analyses <strong>of</strong> multiple sequence alignments. The first is that the mutation <strong>of</strong><br />

one sequence residue to another is a r<strong>and</strong>om event in nature <strong>and</strong> that all residues are more or less equally subject to mutation. The<br />

second is that the conservation <strong>of</strong> specific residues is maintained through evolutionary selection; that is, mutations that adversely<br />

affect the ability <strong>of</strong> a molecule to carry out its biochemical activity or physiological role will be eliminated by reducing the ability<br />

<strong>of</strong> the organism to live. Details <strong>of</strong> the evolutionary history between different pairs <strong>of</strong> sequences can lead to different inferences<br />

about the properties <strong>of</strong> the sequences.<br />

In the evolutionary history <strong>of</strong> some pairs <strong>of</strong> sequences – orthologues or orthologous sequences – the sequences have only speciation<br />

events in their evolutionary history. Other pairs <strong>of</strong> sequences – paralogues or paralogous sequences – have one or more<br />

gene duplication events in their common evolutionary history as well as having speciation events. In general, paralogues will<br />

carry out the same biochemistry on different substrates or with different c<strong>of</strong>actors; while orthologues will carry out the same biochemistry<br />

on the same substrates <strong>and</strong> will <strong>of</strong>ten serve an identical physiological role in the same pathways in different organisms.<br />

Homologues include both orthologues <strong>and</strong> paralogues. Since paralogues carry out the same basic biochemistry, for example<br />

reduce an aldehyde, the residues responsible for this activity are conserved. But, the residues responsible for the physiological<br />

role (e.g., substrate recognition, which specific aldehyde to reduce or lipid to bind) will be under different evolutionary pressures<br />

<strong>and</strong> will <strong>of</strong>ten diverge. A complete analysis <strong>of</strong> the multiple sequence alignment includes identifying residues responsible for the<br />

common, shared properties <strong>of</strong> the entire family <strong>and</strong> the residues that discriminate between paralogous groups (Figure 1).<br />

The results from the analysis <strong>of</strong> multiple sequence alignments is illustrated in the results obtained in the decades’ long quest to<br />

underst<strong>and</strong> how each <strong>of</strong> the 20 aminoacyl tRNA synthetase (aaRS) enzymes map each <strong>of</strong> the 60 tRNAs to its correct cognate<br />

amino acid. The problem, as outlined by Sir Francis Crick in 1957, is where is the information that allows an aaRS to recognize<br />

only the tRNAs encoding the correct amino acid? Experimental groups have employed a number <strong>of</strong> creative <strong>and</strong> innovative<br />

approaches towards answering this question (Söll & Schimmel 1974). McClain, Nicholas, <strong>and</strong> coworkers applied computational<br />

techniques, beginning with creating an accurate multiple sequence alignment <strong>of</strong> the tRNAs.<br />

The tRNA multiple sequence alignment was divided into 20 subsets, each based on their amino acid acceptor activity (McClain<br />

& Nicholas 1987). First these authors identified the sequence positions in the 66 tRNA sequences from Escherichia coli, its bacteriophages,<br />

<strong>and</strong> near relatives that were conserved. They then examined the remaining positions <strong>and</strong> asked the question: what<br />

Bpi_Bovin : VTCSSCSSHINSVHVHISKSK-VGWLIQLFSKKIESALRNKMNSQVCEKVTNSVSSKLQPYFQTLP<br />

Bpi_Human : ITCSSCSSDIADVEVDMSGD--SGWLLNLFHNQIESKFQKVLESRICEMIQKSVSSDLQPYLQTLP<br />

Lbp_Human : GYCLSCSSDIQNVELDIEGD--LEELLNLLQSQIDARLREVLESKICRQIEEAVTAHLQPYLQTLP<br />

Lbp_Rabit : VTTSSCSSRIRDLELHVSGN--VGWLLNLFHNQIESKLQKVLESKICEMIQKSVTSDLQPYLQTLP<br />

Lbp_Rat : VTASGCSNSFHKLLLHLQGEREPGWIKQLFTNFISFTLKLVLKGQICKEI-NVISNIMADFVQTRA<br />

Cetp_Human: TDAPDCYLSFHKLLLHLQGEREPGWIKQLFTNFISFTLKLVLKGQICKEI-NIISNIMADFVQTRA<br />

Cetp_Macfa: TDAPDCYLAFHKLLLHLQGEREPGWLKQLFTNFISFTLKLILKRQVCNEI-NTISNIMADFVQTRA<br />

Cetp_Rabit: TNAPDCYLAFHKLLLHLQGEREPGWLKQLFTNFISFTLKLILKRQVCNEI-NTISNIMADFVQTRA<br />

Pltp_Human: VSNVSCQASVSRMHAAFGGT--FKKVYDFLSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDYVP<br />

Pltp_Mouse: VSNVSCEASVSKMNMAFGGT--FRRMYNFFSTFITSGMRFLLNQQICPVLYHAGTVLLNSLLDTVP<br />

Lplc3_Rat : LILKRCNT----LLGHISLT--SGLLPTPIFGLVEQTLCKVLPGLLCPVV-DSVLSVVNELLGATL<br />

Lplc4_Rat : LVIERCDT----LLGGIKVKLLRGLLPNLVDNLVNRVLANVLPDLLCPIV-DVVLGLVNDQLGLVD<br />

Consensus C i l C t<br />

Figure 1 A section <strong>of</strong> a multiple sequence alignment <strong>of</strong> 12 sequences from the Bpi/Lbp/Lp1 superfamily <strong>of</strong><br />

lipid-binding protein sequences. The 12 sequences belong to six different paralogous families, each <strong>of</strong> which binds a<br />

different lipid <strong>and</strong> performs a different physiological role, generally a transport function. Each sequence is identified<br />

by a label that identifies the paralogous family (they have a gene duplication event separating the families) before the<br />

underscore character in the protein name <strong>and</strong> the species after the underscore. The protein names are taken from the<br />

Swiss-Prot database. The annotation associated with each sequence in the Swiss-Prot database describes the lipid<br />

bound <strong>and</strong> the physiological role <strong>of</strong> the protein. The sequences within each paralogous family are orthologous<br />

(related to each other only by speciation events). The dash sequence character indicates an alignment column where<br />

either the amino acids replaced by dashes have been lost through a deletion event or amino acids shown by the single<br />

letter codes have been inserted into the sequence during the course <strong>of</strong> evolution. The section marked with a black<br />

background <strong>and</strong> white letters is a moderately well-conserved motif. Such conservation <strong>of</strong>ten marks regions<br />

important to the structure or function <strong>of</strong> the protein. Amino acids highlighted with black letters on a light gray<br />

background are identical within the highlighted family <strong>of</strong> orthologous sequences <strong>and</strong> have quite different chemical<br />

or physical properties to those in the same column in other families. They may mark positions that are responsible<br />

for the differences among the paralogous families.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!