BLAST, BLAT and FASTA - Algorithms in Bioinformatics
Bioinformatics I, WS'09-10, S. Henz (script by D. Huson) November 26, 2009
• BLASTP: compares a protein query sequence to a protein sequence database
• TBLASTN: compares a protein query sequence to a DNA sequence database (6-frame translation)
• BLASTX: compares a DNA query sequence (6-frame translation) to a protein sequence database
• TBLASTX: compares a DNA query sequence (6-frame translation) to a DNA sequence database (6-frame translation)
• PHI-BLAST: Pattern Hit Initiated BLAST
searches for particular patterns in protein queries; incorporated into PSI-BLAST
• PSI-BLAST: Position-Specific Iterated BLAST
– a profile is computed from the hits
– the database is searched with the profile
– many iterations
– designed to detect weak relationships between the query and members of the database that are not necessarily detectable by standard BLAST searches
– results in increased sensitivity
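The 6-frame translation used by TBLASTN, BLASTX and TBLASTX above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and an unambiguous DNA alphabet (A, C, G, T); the function names and the frame labels "+1" to "-3" are chosen here for illustration and are not taken from any BLAST implementation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative code tables are needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three reading frames on each strand (hypothetical helper)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frame_translation("ATGGCCTAA")` gives "MA*" for frame +1 and "LGH" for frame -1. Each of the six protein strings would then be compared on the protein level.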
4.8 Available BLAST implementations
• NCBI BLAST: Implementation of all BLAST programs, maintained by NCBI.
• AB-BLAST (formerly WU-BLAST): Alternative implementation of all BLAST programs (except for PHI- and PSI-BLAST) but the other BLAST families are
4.9 BLAT
BLAT = BLAST-Like Alignment Tool (footnote 3)
Motivation for the development of BLAT:
For the public assembly of the human genome, 3 million ESTs and 13 million whole genome shotgun reads needed to be mapped to the human genome.
For the EST against genome alignment: 1.75 Gb in 3.72 million ESTs against 2.88 Gb of human DNA.
Application in particular to large query sequences, e.g. genomes.
Analyzing vertebrate genomes requires rapid mRNA/DNA and cross-species protein alignments.
BLAT is especially designed for very fast and accurate alignments of both DNA and protein sequences.
BLAST preprocesses the query.
BLAT preprocesses the database: it builds an index of all non-overlapping K-mers in the database (genome).
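As a rough illustration of such an index, the following Python sketch stores the start positions of every non-overlapping K-mer of a sequence in a dictionary. The function name `build_nonoverlapping_index` and the in-memory dictionary layout are assumptions made for this example; the real BLAT index is a far more compact data structure.

```python
from collections import defaultdict


def build_nonoverlapping_index(genome, k):
    """Map each non-overlapping K-mer of `genome` to its start positions.

    Hypothetical helper: the genome is cut into consecutive blocks of
    length k (positions 0, k, 2k, ...), so only about len(genome)/k
    K-mers are stored instead of len(genome) - k + 1.
    """
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index
```

For `build_nonoverlapping_index("ACGTACGTAC", 4)` the only key is "ACGT", with positions 0 and 4; the trailing "AC" is too short to form a K-mer. Using non-overlapping K-mers keeps the index roughly K times smaller than an index of all overlapping K-mers.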
Several stages:
• use the index to find regions in the genome that are possibly homologous to the query sequence.
• perform an alignment between such regions.
• stitch together the aligned regions (often exons) into larger alignments (typically genes).
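The first stage above can be sketched in Python under simplifying assumptions: exact K-mer matches only, and hits grouped by diagonal (genome position minus query position). All function names and the `min_hits` threshold are hypothetical and chosen for this example. The alignment and stitching stages are not implemented here.

```python
from collections import defaultdict


def build_index(genome, k):
    """Index the non-overlapping K-mers of the genome (start positions)."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(query, index, k):
    """Look up every overlapping K-mer of the query in the genome index.

    Returns (query_pos, genome_pos) pairs for exact K-mer matches.
    """
    hits = []
    for q in range(len(query) - k + 1):
        for t in index.get(query[q:q + k], ()):
            hits.append((q, t))
    return hits


def candidate_regions(hits, min_hits=2):
    """Group hits by diagonal and keep diagonals with enough hits.

    Simplification: hits on one diagonal are treated as one candidate
    region, regardless of how far apart they lie.
    """
    by_diagonal = defaultdict(list)
    for q, t in hits:
        by_diagonal[t - q].append((q, t))
    return {d: h for d, h in by_diagonal.items() if len(h) >= min_hits}
```

With genome "TTTTACGTACGGTTTT", query "ACGTACGG" and k = 4, the query K-mers "ACGT" and "ACGG" match the genome at positions 4 and 8, both on diagonal 4, so this diagonal is reported as a candidate region. The query is scanned with overlapping K-mers because the genome index only contains every K-th position.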
3 W. J. Kent: BLAT - The BLAST-Like Alignment Tool. Genome Res. 12:656-664 (2002)