1. Introduction - Algorithms in Bioinformatics

Bioinformatics I, WS’09-10, D. Huson, November 26, 2009

1 General Introduction

Bioinformatics I

Module “Bioinformatics I” M.Sc. Bioinformatik

Prof. Daniel Huson, Dr. Johannes Fischer, and Dr. Stefan Henz

WS 2009/10

Office hours: see personal webpages.

1.1 Time and place

Lectures:

Mondays 10ct-12h A301, Sand 1

Wednesdays 10ct-12h A301, Sand 1

Problem sessions:

Day Time Where

Email:

Daniel Huson huson at informatik.uni-tuebingen.de

Stefan Henz stefan at henz@tuebingen.mpg.de

Johannes Fischer fischer at informatik.uni-tuebingen.de

Website: www-ab.informatik.uni-tuebingen.de/teaching/ws09/bioinformatics-i

1.2 How to get credit for this course

To pass this course you must:

• Always participate in the weekly problem sessions and present your results regularly.

• Obtain at least 60% of all attainable points on the assignment sheets.

• Pass the mid-term exam.

• Pass the final exam.

You may work on and hand in assignments and projects in groups of up to two people.

Dates (tentative): Mid-term exam on 14.11.2009, final exam on 17.2.2010

Grade calculation: 50% mid-term exam, 50% f<strong>in</strong>al exam.


1.3 Course notes and assignments

When possible, course notes (“the script”) will be handed out at the beginning of each lecture. The lecture notes will also be made available on the course website.

Assignment sheets will usually be handed out and published on the course website on Mondays. Assignments are due a week later. Solutions should be sent to the tutor by email or handed in before the beginning of the lecture.

1.4 Contents of the lecture

Bioinformatics I: Sequences and machine learning

Bioinformatics II: Structures and systems biology

1.4.1 Overview Bioinformatics I

• Builds on “Grundlagen der Bioinformatik”

• Mandatory lecture for M.Sc. bioinformatics students

• Focuses on algorithms for the analysis of biological primary sequences

• Algorithms: dynamic programming, heuristics, machine learning

1.4.2 Textbooks

Introduction to Computational Biology by Michael Waterman

Introduction to Computational Molecular Biology by Setubal / Meidanis

Biological sequence analysis by Durbin, Eddy, Krogh and Mitchison

Bioinformatics - The Machine Learning Approach by Pierre Baldi and Søren Brunak

1.5 Summary of Grundlagen-der-Bioinformatik lecture

• Pairwise alignments (Scoring matrices, NW, SW)

• BLAST

• Multiple alignments (SP score, star, progressive)

• Phylogeny (UPGMA, NJ, Maximum Parsimony)

• HMMs (CpG, Viterbi, supervised training)

• Gene finding (ORF prediction in prokaryotes, GenScan)

• RNA structures (Nussinov, Zuker)

• Protein secondary structures (classification, Chou-Fasman, SSP)

• Protein tertiary structures (classification, threading, de novo, comparison)

• Bioinformatics databases

• Microarray (Technology, Normalization, Clustering, Statistics)

1.6 Overview Bioinformatics I

1. Pairwise alignment (quick reminder, affine gaps, k-band, linear space)

2. Multiple alignment (T-Coffee, Muscle)

3. BLAST and psi-BLAST, BLAT

4. Phylogeny (ML and Bayesian, network methods)

5. Suffix trees (Generation, searches, repeats)

6. Motif finding

7. Hidden Markov Models (Training, Viterbi Training, Baum-Welch)

8. Gene finding (GenScan, Twinscan)

9. Support Vector Machines (subcellular location)

10. Physical mapping (3 protocols, 3 algorithms)

11. Sequencing and assembly

12. Population genetics
