01.04.2015 Views

Gene Cloning


[Figure: sequence logo from weblogo.berkeley.edu. y-axis: bits, 0 to 4; x-axis: alignment positions 1 to 24, N terminus to C terminus; the motif GDSGGP is legible among the letters.]

Figure 8.15 Sequence logo. This shows the region around the conserved active site serine of the alignment from Figure 8.14. A sequence logo is a graphical representation of a multiple alignment. The overall height of the stack of letters at each position indicates the degree of conservation at that position, measured in bits, and the height of each letter within the stack reflects the frequency of that amino acid at the position. This means that a column where a single residue is always present is taller than the columns at less well-conserved positions.
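The relationship between letter height, frequency and conservation described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the WebLogo implementation: the function name `column_heights` is hypothetical, the columns are made-up examples rather than data from Figure 8.14, and the small-sample correction that logo software normally applies is deliberately left out.

```python
import math
from collections import Counter


def column_heights(column, alphabet_size=20):
    """Return (stack_height_bits, {residue: letter_height_bits}) for one column.

    Assumption: no gaps and no small-sample correction, so the values only
    approximate what a logo program would draw.
    """
    counts = Counter(column)
    n = len(column)
    freqs = {aa: c / n for aa, c in counts.items()}
    # Shannon entropy of the column, in bits
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    # Information content: maximum possible entropy minus observed entropy
    stack = math.log2(alphabet_size) - entropy
    # Each letter takes a share of the stack proportional to its frequency
    return stack, {aa: f * stack for aa, f in freqs.items()}


# Made-up columns: one fully conserved, one split evenly between D and E
conserved_stack, conserved_letters = column_heights("SSSSSSSS")
mixed_stack, mixed_letters = column_heights("DDDDEEEE")
print(round(conserved_stack, 2), round(mixed_stack, 2))
```

A fully conserved protein column reaches log2(20), about 4.32 bits, which is why the vertical axis of a protein logo runs to just over 4. The evenly split column loses exactly one bit, and each of its two letters is drawn at half of the remaining stack height.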

One such database, PROSITE, gives a consensus pattern describing the conserved region around the active site serine of the trypsin-like serine proteases in the form of a regular expression:

1         2            3,4  5 6    7 8 9    10
[DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-

11         12
[LIVMFYWH]-[LIVMFYSTANQH]

In this representation a single letter indicates positions where only one amino acid is ever found in multiple alignments, such as positions 5, 7 and 8. In this example the serine at position 7, shown here in blue, is the active site serine. Positions 6 and 9 allow one of two possible residues, namely aspartate or glutamate at position 6 and glycine or serine at position 9. Positions 1, 2, 10, 11 and 12 allow any one of the amino acids from the group in square brackets. Positions 3 and 4 can be occupied by any residues, as indicated by x(2). If you compare this regular expression to the consensus pattern derived from our multiple alignment you should see that it describes the conserved region from our multiple alignment but also allows for greater variation. This is because it was generated from a much larger data set that included more distantly related members of the trypsin-like serine protease family.
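To see how such a pattern is applied in practice, the notation can be translated into an ordinary regular expression and searched against a sequence. The sketch below is only an illustration: `prosite_to_regex` is a hypothetical helper that handles just the elements used in this pattern (single letters, bracketed groups and x(n) repeats), and `toy_sequence` is a made-up fragment, not a sequence taken from Figure 8.14.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Assumption: only single letters, [..] groups and x(n) or x(n,m) repeats
    are supported; excluded-residue groups and anchors are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # x stands for any residue, which is "." in a regular expression
        element = element.replace("x", ".")
        # Repeat counts are written (n) in the pattern and {n} in a regex
        element = element.replace("(", "{").replace(")", "}")
        parts.append(element)
    return "".join(parts)


PATTERN = ("[DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-"
           "[LIVMFYWH]-[LIVMFYSTANQH]")
regex = prosite_to_regex(PATTERN)

# Made-up fragment used purely to exercise the pattern
toy_sequence = "AAKDSCQGDSGGPVVCSGK"
match = re.search(regex, toy_sequence)

if match:
    # Pattern position 7 is the active site serine (offset 6 from the start)
    serine_index = match.start() + 6
    print(match.group(), serine_index, toy_sequence[serine_index])
```

With this toy fragment the search reports the twelve-residue stretch DSCQGDSGGPVV, and the residue at pattern position 7 is the serine. The same helper could be pointed at any other pattern written in this notation, provided it uses only the supported elements.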

Q8.18. Can you locate the sequence described by the following regular expression on the multiple alignment in Figure 8.14?

[LIVM]-[ST]-A-[STAG]-H-C
