
1. Introduction - Algorithms in Bioinformatics

Bioinformatics I, WS’09-10, D. Huson, November 26, 2009

1 General Introduction

Bioinformatics I

Module “Bioinformatics I”, M.Sc. Bioinformatik

Prof. Daniel Huson, Dr. Johannes Fischer, and Dr. Stefan Henz

WS 2009/10

Office hours: see personal webpages.

1.1 Time and place

Lectures:

Mondays, 10 c.t.-12h, A301, Sand 1

Wednesdays, 10 c.t.-12h, A301, Sand 1

Problem sessions:

Day Time Where

Email:

Daniel Huson huson at informatik.uni-tuebingen.de

Stefan Henz stefan at henz@tuebingen.mpg.de

Johannes Fischer fischer at informatik.uni-tuebingen.de

Website: www-ab.informatik.uni-tuebingen.de/teaching/ws09/bioinformatics-i

1.2 How to get credit for this course

To pass this course you must:

• Always participate in the weekly problem sessions and present your results regularly.

• Obtain at least 60% of all attainable points on the assignment sheets.

• Pass the mid-term exam.

• Pass the final exam.

You may work on and hand in assignments and projects in groups of up to two people.

Dates (tentative): mid-term exam on 14.11.2009, final exam on 17.2.2010

Grade calculation: 50% mid-term exam, 50% final exam.


1.3 Course notes and assignments

When possible, course notes (“the script”) will be handed out at the beginning of each lecture. The lecture notes will also be made available on the course website.

Assignment sheets will usually be handed out and published on the course website on Mondays. Assignments are due a week later. Solutions should be sent to the tutor by email or handed in before the beginning of the lecture.

1.4 Contents of the lecture

Bioinformatics I: Sequences and machine learning

Bioinformatics II: Structures and systems biology

1.4.1 Overview Bioinformatics I

• Builds on “Grundlagen der Bioinformatik”

• Mandatory lecture for M.Sc. bioinformatics students

• Focuses on algorithms for the analysis of biological primary sequences

• Algorithms: dynamic programming, heuristics, machine learning

1.4.2 Textbooks

Introduction to Computational Biology by Michael Waterman

Introduction to Computational Molecular Biology by João Setubal and João Meidanis

Biological Sequence Analysis by Durbin, Eddy, Krogh and Mitchison

Bioinformatics - The Machine Learning Approach by Pierre Baldi and Søren Brunak

1.5 Summary of the “Grundlagen der Bioinformatik” lecture

• Pairwise alignments (scoring matrices, NW, SW)

• BLAST

• Multiple alignments (SP score, star, progressive)

• Phylogeny (UPGMA, NJ, maximum parsimony)

• HMMs (CpG, Viterbi, supervised training)

• Gene finding (ORF prediction in prokaryotes, GenScan)

• RNA structures (Nussinov, Zuker)

• Protein secondary structures (classification, Chou-Fasman, SSP)

• Protein tertiary structures (classification, threading, de novo, comparison)


• Bioinformatics databases

• Microarrays (technology, normalization, clustering, statistics)

1.6 Overview Bioinformatics I

1. Pairwise alignment (quick reminder, affine gaps, k-band, linear space)

2. Multiple alignment (T-Coffee, Muscle)

3. BLAST and PSI-BLAST, BLAT

4. Phylogeny (ML and Bayesian, network methods)

5. Suffix trees (generation, searches, repeats)

6. Motif finding

7. Hidden Markov Models (training, Viterbi training, Baum-Welch)

8. Gene finding (GenScan, Twinscan)

9. Support Vector Machines (subcellular location)

10. Physical mapping (3 protocols, 3 algorithms)

11. Sequencing and assembly

12. Population genetics
