12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf

6 Function Diversity Within Folds and Superfamilies

By definition, a duplication event gives rise to homologous genes. But further distinctions can be made. Genes that descend from a common ancestor gene via duplication within a given genome and in the absence of an accompanying speciation process are known as paralogues. Genes that descend from a common ancestor gene via duplication of the genome itself during speciation are known as orthologues. It is generally assumed that orthologous genes tend to preserve the function of the ancestor gene, due to a strong selective pressure to ensure that the ancestral function is still performed in both descendant species (Tatusov et al. 1997). Based on this assumption, some authors even define orthologues as homologues that have the same function in different species. Several databases have been set up to define orthologous genes from different sets of organisms (Dolinski and Botstein 2007).

In contrast, the presence of multiple copies of a given gene within a genome, i.e. paralogues, could arguably often result in one of the copies being under strong selective pressure to maintain the original function, thus allowing more freedom for divergence for the other copies. The process by which one copy of a duplicated gene conserves the function of the ancestor gene, whereas the other copies evolve alternative functions, is known as neofunctionalisation. In the absence of selective pressure on these additional copies, a frequent outcome of evolutionary divergence is the degeneration of some of them into pseudogenes, which are gene relics no longer expressed (Harrison and Gerstein 2002). This evolutionary process is called nonfunctionalisation. Subfunctionalisation is a third evolutionary process, which refers to cases where multiple functions of an ancestral gene are divided between the paralogues. In any case, paralogues are often considered to be more functionally diverse than orthologues because of their greater freedom to diverge.

In whatever order they occurred, the successive events of duplication into orthologues or paralogues that took place during biological history have resulted in the current protein superfamilies. Not all superfamilies seem to have been equally successful in this process, as some of them are known to account for disproportionately large numbers of genes in fully sequenced genomes (Marsden et al. 2006). To date, the reasons for this disparity in evolutionary success among superfamilies are not clear, and arguments relating to structural and functional properties, or evolutionary dynamics, have been proposed (Goldstein 2008). It can be expected that older superfamilies, having had more time to diverge and explore different functions, should generally be larger today. For example, the HUP superfamily (CATH code 3.40.50.620), which on account of phylogenetic considerations is believed to trace back to the RNA world, displays a very wide array of seemingly unrelated functions (Aravind et al. 2002), whereas several recent superfamilies that are observed exclusively in eukaryotic species are often restricted to very specific sets of functions. However, age does not seem to be the main factor explaining the varying sizes of superfamilies. In a recent analysis of the evolutionary dynamics of gene families that contain genes with essential functions (termed E-families) and gene families that do not contain such genes (termed N-families), it was proposed that paralogues in E-families are more likely to evolve new functions than those in N-families, thus suggesting that the function of ancestral genes in a family is a key determinant of its evolutionary success (Shakhnovich and Koonin 2006). As will be shown in the next section, other arguments to explain the variable success of protein superfamilies may derive from the mechanisms that have been proposed to explain function evolution.
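
The orthologue/paralogue distinction drawn at the start of this section can be illustrated with a small sketch. It assumes a gene tree whose internal nodes have already been labelled as speciation or duplication events; two genes are then called orthologues if their last common ancestor is a speciation node, and paralogues if it is a duplication node. The `Node` class, the `classify_pairs` function and the toy gene names are all hypothetical and are not taken from any of the databases or studies cited above.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    """A node in a toy, already-labelled gene tree (hypothetical structure)."""
    name: str
    event: str = "leaf"  # "speciation", "duplication", or "leaf"
    children: list = field(default_factory=list)


def leaves(node):
    """Return the names of all leaf genes at or below a node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def classify_pairs(root):
    """Label each pair of leaf genes by the event at their last common ancestor."""
    relations = {}
    stack = [root]
    while stack:
        node = stack.pop()
        kind = "orthologues" if node.event == "speciation" else "paralogues"
        # Genes drawn from two different child subtrees diverged at this node.
        for left, right in combinations(node.children, 2):
            for a in leaves(left):
                for b in leaves(right):
                    relations[frozenset((a, b))] = kind
        stack.extend(node.children)
    return relations


# Toy scenario (assumed): one duplication, followed by a human/mouse speciation
# in each of the two resulting gene lineages.
tree = Node("root", "duplication", [
    Node("copyA", "speciation", [Node("human_A"), Node("mouse_A")]),
    Node("copyB", "speciation", [Node("human_B"), Node("mouse_B")]),
])
relations = classify_pairs(tree)
```

In this toy tree, `human_A` and `mouse_A` come out as orthologues, whereas `human_A` and `human_B` (and also `human_A` and `mouse_B`) come out as paralogues, because their lineages separated at the duplication. Inferring the event labels from real sequence data is a separate problem that this sketch does not address.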
