04.11.2014 Views


PARASITE GENOMICS

clustering by identity, and some BLAST-based analysis. More complete analyses require an underlying genomic database, often customized to deal with the ‘fragmented’ genome data resulting from ESTs.

Analysis of genome sequence

Genome annotation is an art that is evolving into a science. The basis of genome annotation is a reference sequence housed in a database. This database can then be filled with annotation ‘objects’ that refer to coordinates of the sequence (and to each other). Thus each parasite genome project, as it has matured, has developed a genome database. These range from simple hypertext markup language (HTML) websites to full relational databases with interactive graphical viewers.
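As an illustration of annotation ‘objects’ that refer to sequence coordinates and to each other, the following is a minimal Python sketch. The class name `Annotation`, its fields, the helper `overlapping`, and the example records are all hypothetical and are not taken from any particular genome database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """One annotation object anchored to coordinates on a reference sequence."""
    ann_id: str
    seq_id: str                   # identifier of the reference sequence
    start: int                    # 1-based, inclusive
    end: int                      # 1-based, inclusive
    kind: str                     # e.g. "gene", "exon", "repeat"
    parent: Optional[str] = None  # id of another annotation this one refers to


def overlapping(annotations, seq_id, start, end):
    """Return the annotations on seq_id that overlap the interval [start, end]."""
    return [a for a in annotations
            if a.seq_id == seq_id and a.start <= end and a.end >= start]


# Hypothetical example records: one gene with two exons, plus a repeat.
annotations = [
    Annotation("g1", "chr1", 100, 900, "gene"),
    Annotation("e1", "chr1", 100, 300, "exon", parent="g1"),
    Annotation("e2", "chr1", 600, 900, "exon", parent="g1"),
    Annotation("r1", "chr1", 1200, 1500, "repeat"),
]

hits = overlapping(annotations, "chr1", 250, 650)
print([a.ann_id for a in hits])
```

A relational database would typically hold the same information as rows keyed by sequence identifier and coordinates, with the `parent` field playing the role of a foreign key.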

Annotations can be thought of as ‘hard’ and ‘soft’. Hard annotation, given a completed sequence, refers to objects that rely only on the primary DNA sequence, such as possible open reading frames, splice sites, and local repeats. Soft annotation refers to comments on the sequence and its encoded proteins based on their relative similarity to other objects annotated in other genomes. These soft objects include such things as functional identification of peptides based on BLAST similarity to a protein of known function, or decoration of DNA sequence with promoter motifs derived from probabilistic searches.
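Possible open reading frames are a simple case of ‘hard’ annotation, since they can be derived from the primary sequence alone. The sketch below scans the three forward reading frames for an ATG followed in frame by a stop codon. It is a simplification under stated assumptions: it uses the standard stop codons, ignores the reverse strand and introns, and the function name `find_orfs` and its `min_codons` parameter are illustrative only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def find_orfs(seq, min_codons=3):
    """Find forward-strand ORFs (ATG through stop codon, inclusive).

    Returns a list of (start, end, frame) tuples with 0-based,
    end-exclusive coordinates, so seq[start:end] is the ORF.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF being read
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons including the start and stop codons.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


example = "CCATGAAATTTTAGGG"
for start, end, frame in find_orfs(example):
    print(start, end, frame, example[start:end])
```

Real gene finders add splice-site models and codon-usage statistics on top of this kind of scan, which is why the text describes such objects only as ‘possible’ open reading frames.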

With complete genomes to hand, whole genome analyses can be carried out to infer the presence of repetitive DNA segments of various types, and to perform within-genome classification of encoded peptides into protein families. These sorts of analyses are important in defining the overall structure of the genome, and in identifying novel features of its encoded genes.
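One simple way to sketch within-genome classification of peptides into families is single-linkage clustering over pairwise similarity hits. This is an assumed technique chosen for illustration, not necessarily what any given genome project used. The function `cluster_families`, the 40% identity threshold, and the example hit list are hypothetical.

```python
def cluster_families(proteins, hits, min_identity=40.0):
    """Group proteins into families by single-linkage clustering.

    hits is an iterable of (protein_a, protein_b, percent_identity).
    Two proteins join the same family if a chain of hits at or above
    min_identity connects them.
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Follow parent links to the family representative,
        # shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity in hits:
        if identity >= min_identity:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    families = {}
    for p in proteins:
        families.setdefault(find(p), []).append(p)
    return sorted(sorted(members) for members in families.values())


# Hypothetical proteins and pairwise identities.
proteins = ["p1", "p2", "p3", "p4", "p5"]
hits = [("p1", "p2", 62.0), ("p2", "p3", 45.5), ("p4", "p5", 30.0)]
print(cluster_families(proteins, hits))
```

Single linkage is easy to implement but can chain unrelated proteins together through shared domains, so the threshold and the choice of linkage are design decisions rather than fixed rules.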

The annotation snowball

The annotation of a sequence as having a particular property should be qualified by the relative goodness-of-fit of that sequence to the features of the property. For assigning function, sequence similarity to a protein of known function is often used. There is a hidden, but known, danger in this process: what if the functional assignment of the protein of ‘known’ function was also based on similarity to another protein? And what if the chain of functional assignment is long (many steps before the original biologically verified functional assignment is accessed), or one of the assignments is wrong?

For example, many proteins have been functionally characterized through their roles in model organisms defined by genetic studies. If the parasite lacks the structures or biochemistry displayed by the model organism, the ‘function’ may not exist in the parasite, even though its proteins are annotated with the function. One classic example, of relevance to parasites, is that of the DoxA2 ‘phenoloxidase’ of Drosophila melanogaster. This gene was identified by mutation as controlling tanning of the insect cuticle, and knockouts lacked active phenoloxidase. Such phenoloxidase activity could underlie the crosslinking and tanning of nematode cuticles and eggshells, and of platyhelminth eggshells. The DoxA2 phenoloxidase has homologues in many other genomes, including nematodes. However, there are also homologues in protozoan parasites (such as P. falciparum and T. brucei) not expected to encode such activity. In addition, plant homologues appear to localize to a perinuclear site. DoxA2 is in fact a regulatory component of the 26S proteasome, which in D. melanogaster is involved in targeting the pre-pro-phenoloxidase protein for processing: mutants in DoxA2 are indeed negative for phenoloxidase, but not because the phenoloxidase gene is disrupted.
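The chain of functional assignment described in this section can be made explicit by recording, for each annotation, whether it rests on experiment or on similarity to another protein. The sketch below counts how many similarity-based steps separate a protein from a biologically verified assignment. The evidence labels, the function name `inference_depth`, and the example records are hypothetical, not drawn from a real database.

```python
def inference_depth(protein, evidence):
    """Count similarity-based steps back to an experimentally verified annotation.

    evidence maps protein id -> (kind, source), where kind is
    "experimental" or "similarity" and source is the protein the
    annotation was copied from (None for experimental records).
    Returns None if the chain never reaches experimental support.
    """
    steps = 0
    seen = set()
    current = protein
    while current not in seen:  # stop if the chain loops back on itself
        seen.add(current)
        record = evidence.get(current)
        if record is None:
            return None  # chain ends at a protein with no recorded evidence
        kind, source = record
        if kind == "experimental":
            return steps
        steps += 1
        current = source
    return None


# Hypothetical evidence records.
evidence = {
    "modelA": ("experimental", None),
    "protB": ("similarity", "modelA"),
    "parasiteC": ("similarity", "protB"),
    "orphanD": ("similarity", "missingE"),
    "loopF": ("similarity", "loopG"),
    "loopG": ("similarity", "loopF"),
}

for name in ("modelA", "parasiteC", "orphanD", "loopF"):
    print(name, inference_depth(name, evidence))
```

A large depth, or a result of `None`, flags an annotation that deserves closer scrutiny. A depth of zero does not guarantee that the function transfers to another organism, as the DoxA2 case shows.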

MOLECULAR BIOLOGY
