01.04.2015 Views

Gene Cloning


well as the statistical test. The example used in the alignment in Figure 8.11 is explained in the next section and you will be able to see that the alignment fits very well with what is known about the two proteins.

8.8 What Can Alignments Tell Us About the Biology of the Sequences Being Compared?

Pair-wise comparison of two related sequences can be a very powerful tool. The following example illustrates the sort of information that can be deduced from this type of comparison. It involves the comparison of a protein from different strains of a human adenovirus. Some types of this virus are oncogenic, capable of causing tumors to develop, while others are not. The protein E1A is the product of the virus “early region” and has been shown to be involved in the induction of tumors. The dot-plot in Figure 8.8 and the alignment in Figure 8.11 show a comparison of the amino acid sequences of the E1A protein from an oncogenic strain (12) with that of a non-oncogenic strain (05). The strong diagonal line on the dot-plot (Figure 8.8) indicates that these sequences are very similar along large parts of their length, which is to be expected since they are variations of the same protein; you can see these regions of similarity on the alignment (Figure 8.11).

Two large gaps, marked with arrows in Figure 8.8, have to be introduced into the alignment. It is the extra sequence starting at position 123 in the E1A protein of strain 12, which is not present in the non-oncogenic strain, that is responsible for this virus causing tumors. This observation from sequence alignments has been tested in the laboratory by cloning this extra sequence into the non-oncogenic strain and observing that it is then capable of causing tumors in mice.
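The way a dot-plot reveals both shared regions and extra sequence can be illustrated with a minimal sketch. The two short sequences below are invented toy strings, not the real E1A sequences, and the function names and the window size of 3 are assumptions chosen for illustration only. A cell is marked when the window of residues starting at that position is identical in both sequences, so a shared region appears as a diagonal run and an insertion in one sequence shifts the diagonal sideways.

```python
def dot_plot(seq_a, seq_b, window=3):
    """Return a boolean matrix: cell [i][j] is True when the `window`
    residues starting at seq_a[i] are identical to those at seq_b[j]."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a matching window, '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


# Hypothetical toy sequences: `longer` carries four extra residues (PEDG)
# that are absent from `shorter`, mimicking an insertion in one strain.
longer = "MKTLLPEDGSWQA"
shorter = "MKTLLSWQA"

matrix = dot_plot(longer, shorter)
print(render(matrix))
```

In the printed grid the diagonal is interrupted where the extra residues sit and resumes further down, which is the same kind of break that the arrows mark on a real dot-plot.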

8.9 Similarity Searches

Imagine that you could do a pair-wise alignment with your query sequence and every known sequence and then look at, say, the best 50 alignments. This would tell you which of the proteins, for which sequence data was available, were most similar to your query. If the structures and functions of these proteins are known then they will give clues as to the structure and function of your query sequence. DNA and protein sequence data are stored in large publicly available databases (Box 8.2); in addition to the sequence data these databases contain varying amounts of annotation. For example, a typical entry in the protein database, SwissProt, gives a brief description of the protein and its origin, references to papers describing the derivation and analysis of the sequence, a comments section which will identify any key features of the protein such as its activity and whether it belongs to a characterized family of proteins, hyperlinks to relevant entries in other databases and an outline of the key features of the sequence. Hence searching this type of database provides access to an immense amount of information.
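The thought experiment of aligning a query against every known sequence and keeping the best 50 can be sketched in a few lines. This is only an illustration under stated assumptions: the database is a small invented dictionary rather than a real sequence database, the function names are hypothetical, and the scoring (match +1, mismatch -1, gap -2, global alignment) is a simple scheme chosen for readability, not the scoring described in this chapter. Real search tools use substitution matrices and much faster methods than an exhaustive comparison.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global (Needleman-Wunsch) alignment of a and b.
    Only the score is computed, using two rolling rows of the table."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_hits(query, database, top_n=50):
    """Align the query against every entry and return the top_n
    (name, score) pairs, highest score first."""
    scored = [(name, alignment_score(query, seq)) for name, seq in database.items()]
    scored.sort(key=lambda hit: hit[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy database; the names and sequences are invented.
toy_database = {
    "entry_identical": "MKTLLSWQA",
    "entry_with_insert": "MKTLLPEDGSWQA",
    "entry_unrelated": "GGGGGGG",
}

hits = best_hits("MKTLLSWQA", toy_database)
for name, score in hits:
    print(name, score)
```

The ranked list plays the role of the "best 50 alignments": the entries at the top are the ones whose annotation is most likely to give clues about the query.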
