02.05.2015 Views

Midterm Exam


Look at this BLAST report below and answer the following questions. (10 points)

    A                                                                      B    C
 1. Zea mays leucine-rich repeat transmembrane protein kinase 2 (ltk2)...  499  e-140
 2. Gossypium arboreum 7-10 dpa fiber library Gossypium arboreum cDNA...   384  e-105
 3. Gossypium arboreum 7-10 dpa fiber library Gossypium arboreum cDNA...   373  e-102
 4. Six-day Cotton fiber Gossypium hirsutum 5' similar to LRR kinase1...   319  1e-85
 5. Arabidopsis thaliana DNA chromosome 4, contig fragment No. 56...       284  1e-85
 6. Arabidopsis thaliana DNA chromosome 4, BAC clone F1N20...              284  1e-85
 7. Six-day Cotton fiber Gossypium hirsutum 5' similar to LRR kinase 1..   312  1e-83
 8. Gm-c1036-1924 5' similar to LRR TRANSMEMBRANE KINASE I..               307  3e-82
 9. Nodulated root Medicago truncatula cDNA clone NF030H07NR               305  1e-81
10. Arabidopsis thaliana chromosome 1 YAC YUP8H12R sequence,...            141  1e-80
11. tomato fruit mature green, cDNA clone cLEF46B14 5', mRNA sequence ...  284  4e-75

1. What is the meaning of the numbers in column C, and how are these numbers calculated, at least in a conceptual sense?

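Column C shows the expectation value (E-value): the number of alignments scoring at least as high as the BLAST score (shown in column B) that would be expected to arise by chance, given the sequence lengths involved and the size of the target database. Conceptually, it falls exponentially as the score rises and grows in proportion to the size of the search space.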
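The relationship between score and expectation value can be sketched with the standard Karlin-Altschul formulas, E = K·m·n·e^(−λS) for a raw score S and E = m·n·2^(−S') for a bit score S'. This is a minimal illustration, not the code BLAST itself runs: the function names are hypothetical, and λ, K, the query length m, and the database length n must be supplied by the caller because the report above does not give them.

```python
import math


def evalue_from_raw_score(raw_score, query_len, db_len, lam, k):
    """E = K * m * n * exp(-lambda * S) for a raw alignment score S.

    lam and k are the scoring-system parameters (lambda and K); they are
    inputs here because the exam report does not state them.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score_from_raw(raw_score, lam, k):
    """Normalize a raw score to a bit score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def evalue_from_bit_score(bit_score, query_len, db_len):
    """E = m * n * 2**(-S') for a bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)
```

Both routes give the same E-value for the same alignment, and in either form the E-value doubles when the database doubles and halves for every additional bit of score.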

2. Look at hits 10 and 11. How is it possible that #10 (Arabidopsis…) has a lower value in column B but a more significant value in column C, compared with #11 (tomato fruit…), which has a higher value in column B but a less significant value in column C?

The most likely reason that the two BLAST hits have similar expectation values (column C) but very different BLAST scores (column B) is that the lengths of the underlying alignments differ. The alignment in line 10 is probably much shorter than the alignment in line 11.
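A quick way to check that the column B score alone cannot account for both E-values is to invert the bit-score relation E = m·n·2^(−S') and ask what search space m·n each hit would imply. The sketch below assumes that column B holds bit scores, which the report does not state, and the function name is hypothetical.

```python
import math


def implied_log2_search_space(bit_score, evalue):
    """Solve E = (m * n) * 2**(-S') for log2(m * n)."""
    return math.log2(evalue) + bit_score


# Scores and E-values copied from lines 10 and 11 of the report.
hit10 = implied_log2_search_space(141, 1e-80)
hit11 = implied_log2_search_space(284, 4e-75)

print(f"hit 10 implies log2(m*n) = {hit10:.1f}")
print(f"hit 11 implies log2(m*n) = {hit11:.1f}")
```

Under that assumption the two hits imply very different values of m·n even though they come from the same search, which is consistent with the E-value depending on more than the single score shown in column B.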

3. The query used in this search came from flax and the BLAST results seem to indicate it is some sort of leucine-rich repeat transmembrane kinase. How confident would you be of this functional assignment and why? What is one additional type of information that you could look for “informatically” that would increase your confidence? (There is no one right answer to the second part of this question).

The prediction that the flax query sequence is a member of the leucine-rich repeat transmembrane kinase (LRR-TM-kinase) family rests on several BLAST hits annotated as such. Only one hit, however, is an original annotation of an LRR-TM-kinase (the top hit from maize, ltk2); all of the other annotated hits are derived annotations (similar to…). Therefore, the best evidence, and really the only evidence shown, is the top hit to ltk2, plus the fact that the expectation value for this hit is so very small (e-140). The other BLAST hits do help to reinforce the conclusion that the query belongs to a large, well-defined protein family, and the top hit indicates it is probably related to LRR-TM-kinases.
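The distinction between original and derived annotations can be tallied directly from the hit descriptions. The sketch below is only an illustration: the descriptions are copied from the report as shown (truncations included), and the classification rule, which treats any description containing "similar to" as derived, is an assumption made for this example rather than an established convention.

```python
import re
from collections import Counter

# Descriptions copied from column A of the report, truncations left as-is.
HITS = [
    "Zea mays leucine-rich repeat transmembrane protein kinase 2 (ltk2)...",
    "Gossypium arboreum 7-10 dpa fiber library Gossypium arboreum cDNA...",
    "Gossypium arboreum 7-10 dpa fiber library Gossypium arboreum cDNA...",
    "Six-day Cotton fiber Gossypium hirsutum 5' similar to LRR kinase1...",
    "Arabidopsis thaliana DNA chromosome 4, contig fragment No. 56...",
    "Arabidopsis thaliana DNA chromosome 4, BAC clone F1N20...",
    "Six-day Cotton fiber Gossypium hirsutum 5' similar to LRR kinase 1..",
    "Gm-c1036-1924 5' similar to LRR TRANSMEMBRANE KINASE I..",
    "Nodulated root Medicago truncatula cDNA clone NF030H07NR",
    "Arabidopsis thaliana chromosome 1 YAC YUP8H12R sequence,...",
    "tomato fruit mature green, cDNA clone cLEF46B14 5', mRNA sequence ...",
]

# Matches either the spelled-out name or the abbreviation.
LRR_PATTERN = re.compile(r"leucine-rich repeat|\bLRR\b", re.IGNORECASE)


def classify(description):
    """Return 'original', 'derived', or 'none' for one hit description."""
    if not LRR_PATTERN.search(description):
        return "none"
    if "similar to" in description.lower():
        return "derived"  # annotation inferred from similarity to another entry
    return "original"


counts = Counter(classify(d) for d in HITS)
print(dict(counts))
```

Only the maize hit counts as an original annotation under this rule; the remaining hits are either derived annotations or carry no kinase annotation at all.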
