07.06.2015 Views

CPGR Workshop – Tutorials Tutorial 1 – Evaluating loci to be ...

CPGR Workshop – Tutorials

Tutorial 1 – Evaluating loci to be candidate diagnostic markers

Background: A paper from several years ago compared the fully sequenced Xanthomonas genomes that were available at the time and found unique loci from each genome. These unique loci would make excellent candidates for diagnostic markers to detect individual pathovars using PCR. Since the paper was published, several more Xanthomonas genomes have been sequenced. Before developing the diagnostic markers, we want to make sure the loci are still good candidates.

The GenBank Accessions are: AAW77501, BAE69434, AAM38926.<br />

Summary of workflow:

• Search the locus accession in GenBank.
• Examine the GenBank record to confirm the species/pathovar, and examine fields of interest: locus name, function, etc.
• Display the nucleotide sequence of the gene in FASTA format.
• Use the NCBI BLAST server to search all the Xanthomonas sequences in the non-redundant (nr) nucleotide database at a low expect value (1e-5).
• Interpret the BLAST report to determine if the locus sequence is unique and therefore still a valid candidate.
• Run the search again on the NCBI BLAST server using the same settings as before, but change the expect value to 10.
• Interpret the BLAST report. Note any differences from the first BLAST report.

Detailed Instructions:<br />

1. First, the nucleotide sequence of the locus must be obtained from GenBank at NCBI. Open the NCBI website in your browser and type the accession in the search box at the top of the web page.

2. The results page is from Entrez, a tool from NCBI that searches across all the NCBI databases. You should see the results as shown in the screenshot below: 1 match in the Protein database and 1 match in the Gene database. Both databases will take you to the protein sequence. In this case, we’ll look at the Protein database entry (but feel free to look at the Gene database record for the locus). Click the Protein: sequence database on the Entrez results page.


3. This displays a summary of the locus record in the protein database. Click the accession number to display the GenBank record for the locus.

4. Take a look at the general features of the GenBank record. Can you identify the organism, function, and locus name?

5. Currently we have the protein GenBank record for the gene. However, we need the nucleotide coding sequence for the gene to carry out the analysis. To obtain this, scroll down to the FEATURES section of the GenBank record. You should see a link called ‘CDS’ (coding sequence or ORF). Look at the features of the CDS. Can you identify the gene name, the genomic location, and the method by which the CDS was identified? Now, click on the CDS link to obtain the nucleotide sequence of the locus.
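The genomic location in step 5 can also be read programmatically. The sketch below assumes the CDS feature of the protein record carries a coded_by value of the simple form accession:start..end, optionally wrapped in complement(...). The function name and the example value are hypothetical and are not taken from any of the tutorial's records.

```python
import re

# Matches simple locations such as "XX000000.1:100..400", optionally
# wrapped in "complement(...)". Joined (multi-segment) locations are
# deliberately not handled in this sketch.
_LOCATION = re.compile(
    r"^(?P<comp>complement\()?(?P<acc>[A-Za-z0-9_.]+):"
    r"<?(?P<start>\d+)\.\.>?(?P<end>\d+)\)?$"
)


def parse_coded_by(value):
    """Return (accession, start, end, strand) for a simple coded_by value."""
    match = _LOCATION.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported location: {value!r}")
    # A location wrapped in complement(...) lies on the reverse strand.
    strand = "-" if match.group("comp") else "+"
    return (
        match.group("acc"),
        int(match.group("start")),
        int(match.group("end")),
        strand,
    )


# Hypothetical value for illustration only.
print(parse_coded_by("complement(XX000000.1:100..400)"))
```

A location wrapped in complement(...) is reported with strand "-", which is the same cue to look for by eye in the record.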


6. Now you have a GenBank record for the nucleotide sequence of the gene. Take a moment to look at the record. Can you tell what strand the locus is on in the genomic sequence?

7. Now the GenBank record has to be converted to the FASTA format in order to be accepted by the web tools we will be using, such as the BLAST server. At the top left-hand corner of the page, there is a drop-down box labeled Display:. Click the drop-down box and select FASTA.
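The same FASTA records can in principle be requested without the browser through NCBI's E-utilities efetch service. The sketch below only builds the request URLs and does not contact the server. The helper name is hypothetical, and the use of rettype=fasta_cds_na with the protein database to obtain the coding nucleotide sequence is an assumption that should be checked against the current E-utilities documentation.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="protein", rettype="fasta"):
    """Build an efetch URL that asks for one record as plain-text FASTA."""
    params = {
        "db": db,
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


# The three accessions listed at the start of the tutorial.
for acc in ("AAW77501", "BAE69434", "AAM38926"):
    # "fasta_cds_na" is assumed to return the CDS nucleotide sequence.
    print(efetch_fasta_url(acc, rettype="fasta_cds_na"))
```

Opening one of the printed URLs in a browser should show the same kind of FASTA text as the Display: drop-down box, if the assumption about the return type holds.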


8. You should now have the nucleotide sequence of the locus in FASTA format displayed in the browser.

9. When doing a bioinformatics analysis, it is useful to keep a log of downloaded sequences, results, and output in a text editor. This prevents losing work if the browser crashes and gives you a saved record of what you did. To do this, open Notepad in Windows (Start -> Run, then type notepad in the “Open:” box). The FASTA sequence can be copied and pasted straight from the browser into Notepad.
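For readers who prefer a script to Notepad, the log-keeping habit from step 9 might be sketched as follows. This is a minimal illustration: the function names, the log layout, and the example sequence are all made up, and the parser assumes well-formed FASTA text in which each record starts with a ">" header line.

```python
from datetime import datetime


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def append_to_log(path, label, text):
    """Append a labeled, time-stamped entry to a plain-text log file."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    with open(path, "a", encoding="utf-8") as log:
        log.write(f"### {label} ({stamp})\n{text.strip()}\n\n")


# Made-up sequence, used only to show the parser's behaviour.
example = ">example_locus made-up sequence\nATGACC\nGGTTAA\n"
for header, sequence in parse_fasta(example):
    print(header, len(sequence))
```

Calling append_to_log("tutorial1_log.txt", "locus FASTA", example) would add the sequence to a log file of that (hypothetical) name, which can later hold the BLAST reports as well.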


10. Now that we have the sequence of the locus in FASTA format, we can use the NCBI BLAST server to search the locus against all the Xanthomonas sequences in GenBank. Go back to the CPGR Workshop Links page and open the NCBI BLAST page in the browser. To begin, paste the sequence in the text box. The locus sequence is the query sequence and will be used to search against the database of non-redundant GenBank sequences. The sequences in the database are also known as the subject.

11. Searching against the entire non-redundant GenBank set of sequences is too wide in this case, as we want to limit the search to the Xanthomonas sequences in GenBank. To do this, scroll down to the Choose Search Set panel and choose the ‘Others’ Database option. Next, select the ‘Custom’ option for the Organism. Start typing Xanthomonas in the text box. As you type, a drop-down box will appear showing the available options as your typing narrows them down. When you see Xanthomonas (taxid:338) appear, select it.


12. Now we have to choose the correct BLAST program. We will discuss this in more detail tomorrow. For now choose blastn, which searches a nucleotide sequence against a nucleotide database of sequences.

13. Set the expect threshold to 1e-5. In this case, I also set the maximum number of target sequences to display to 10 to prevent a large amount of data coming back from the BLAST server.
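The choices made in steps 10 to 13 can be written down as a set of request parameters for NCBI's BLAST URL API. The sketch below only assembles and encodes the parameters and submits nothing. The helper name and the placeholder query are hypothetical, and "nt" as the database name for the nucleotide collection is an assumption to verify against the current BLAST documentation.

```python
from urllib.parse import urlencode

# Address of the NCBI BLAST web service.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def blast_put_params(query_fasta, expect=1e-5, max_hits=10):
    """Collect the search settings from steps 10-13 as URL API parameters."""
    return {
        "CMD": "Put",                     # submit a new search
        "PROGRAM": "blastn",              # nucleotide query vs nucleotide database
        "DATABASE": "nt",                 # assumed name of the nucleotide collection
        "QUERY": query_fasta,             # the locus sequence in FASTA format
        "EXPECT": str(expect),            # expect threshold
        "HITLIST_SIZE": str(max_hits),    # maximum number of target sequences
        "ENTREZ_QUERY": "txid338[ORGN]",  # restrict to Xanthomonas (taxid:338)
    }


# Placeholder query, not a real locus sequence.
params = blast_put_params(">query placeholder\nATGACCGGTTAA")
body = urlencode(params)
print(body)
```

For the second search later in the tutorial, the same helper would be called with expect=10 and everything else left unchanged.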


14. Click the large ‘BLAST’ button to start the search. In a minute or so you should get a BLAST results report loaded into the browser, a portion of which is shown below. You can also copy and paste the BLAST report into Notepad and save it. You may want to refer to the BLAST report during the lecture tomorrow.

15. Although we will cover the BLAST report in detail tomorrow, you should be able to determine whether the locus is unique to the Xanthomonas pathovar. Is the locus a good candidate for marker development? Why or why not?

16. Now, press the back button on the browser to go back to the BLAST input page. Change the expect threshold to 10 and run the search again. Does the BLAST report change? Record your observations for discussion tomorrow.
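As background for comparing the two searches, the expect value of a hit is commonly written as E = m × n × 2^(−S′), where m is the query length, n is the database length, and S′ is the bit score of the alignment. The sketch below applies that formula to arbitrary illustrative numbers. It ignores the effective-length corrections that BLAST applies, so it will not reproduce the values in an actual report.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate E-value: search space times 2 to the minus bit score."""
    return query_length * database_length * 2.0 ** (-bit_score)


def passes_threshold(bit_score, query_length, database_length, threshold):
    """True if a hit with this bit score would be reported at the threshold."""
    return expect_value(bit_score, query_length, database_length) <= threshold


# Arbitrary illustrative sizes: a 1,000 nt query and a 10^9 nt database.
QUERY_LEN, DB_LEN = 1000, 10**9

for score in (30, 40, 60):
    e_value = expect_value(score, QUERY_LEN, DB_LEN)
    print(
        f"bit score {score}: E = {e_value:.3g}",
        "| reported at 1e-5:", passes_threshold(score, QUERY_LEN, DB_LEN, 1e-5),
        "| reported at 10:", passes_threshold(score, QUERY_LEN, DB_LEN, 10),
    )
```

With these numbers, a weak alignment with a bit score of 40 clears a threshold of 10 but not 1e-5, which is the kind of difference to look for between the two reports.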
