PPT - Molecular Modelling & Bioinformatics Group
<strong>Bioinformatics</strong><br />
Structural and functional prediction Master in <strong>Molecular</strong><br />
Biotechnology 2010-11<br />
Ramon Goni<br />
rgoni@mmb.pcb.ub.es
Outline<br />
• Introduction<br />
• Biological Databases<br />
• Sequence Comparison<br />
• 3D Structure visualization<br />
• Functional Prediction<br />
• Structural Prediction<br />
http://mmb.pcb.ub.es/MBIOTEC/
Material and Evaluation<br />
• Exercises and slides<br />
– Campus Virtual<br />
– http://mmb.pcb.ub.es/MBIOTEC<br />
• Evaluation.<br />
– Practical test on Campus Virtual.
<strong>Bioinformatics</strong>: Overview
• <strong>Bioinformatics</strong> or computational biology is the use of<br />
techniques from applied mathematics, informatics, statistics, and<br />
computer science to solve biological problems.<br />
• A common thread in projects in bioinformatics and computational<br />
biology is the use of mathematical tools to extract useful<br />
information from noisy data produced by high-throughput<br />
biological techniques. (The field of data mining overlaps with<br />
computational biology in this regard.)<br />
• Major research efforts in the field include sequence alignment,<br />
gene finding, genome assembly, protein structure alignment,<br />
protein structure prediction, prediction of gene expression and<br />
protein-protein interactions, and the modeling of evolution.<br />
• The terms bioinformatics and computational biology are often<br />
used interchangeably, although the latter typically focuses on<br />
algorithm development and specific computational methods.
• <strong>Bioinformatics</strong>: Research, development, or application of<br />
computational tools and approaches for expanding the use<br />
of biological, medical, behavioral or health data, including<br />
those to acquire, store, organize, archive, analyze, or<br />
visualize such data.<br />
• Computational Biology: The development and application<br />
of data-analytical and theoretical methods, mathematical<br />
modeling and computational simulation techniques to the<br />
study of biological, behavioral, and social systems.
The Human Genome Project
Tools for <strong>Bioinformatics</strong><br />
Topics:<br />
• DNA Sequence<br />
• Gene & Genome Organization<br />
• <strong>Molecular</strong> Evolution<br />
• RNA/Protein Structure, Function & Interaction<br />
• Metabolic Pathways<br />
• Regulation, Signaling & Networks<br />
Tools: Genome sequencing, Biostatistics, Genomic data analysis, Protein structure prediction, Protein docking, <strong>Molecular</strong> dynamics, Virtual screening, MicroArrays
<strong>Bioinformatics</strong><br />
[Diagram relating Biology, Experimental Work, Computer Science, Computer Tools, Models, <strong>Bioinformatics</strong> Work, Knowledge and Databases]
Benefits of Biological Databases<br />
• The growth of data makes it unviable to work with it manually<br />
• Automated methods require a previous step to store the information<br />
• Working with automated tools increases the amount of data<br />
[Figure: Growth of the Databases (Data → Information, Information → Data)]
<strong>Bioinformatics</strong> Tools<br />
Linux<br />
Perl
Sequence Alignment<br />
tcctctgcctctgccatcat---caaccccaaagt<br />
|||| ||| ||||| |||||   ||||||||||||<br />
tcctgtgcatctgcaatcatgggcaaccccaaagt<br />
Hidden Markov Model<br />
Dynamic Programming
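As an illustration of how dynamic programming produces an alignment like the one above, here is a minimal sketch of global alignment in the style of Needleman-Wunsch. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example; they do not come from the slides, and real tools use substitution matrices and more elaborate gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b by dynamic programming.

    Returns (score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions, not taken from the slides.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, and left moves
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # The two DNA sequences from the slide, with the gap characters removed
    seq1 = "tcctctgcctctgccatcatcaaccccaaagt"
    seq2 = "tcctgtgcatctgcaatcatgggcaaccccaaagt"
    best, top, bottom = needleman_wunsch(seq1, seq2)
    print(best)
    print(top)
    print(bottom)
```

Several alignments can share the best score, so the gap placement printed here may differ from the one drawn on the slide.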
Blosum Matrix<br />
Protein Sequence Alignment
Multiple Alignment<br />
ClustalW http://www.ebi.ac.uk/clustalw/
BLAST
GenScan<br />
http://genes.mit.edu/GENSCAN.html
http://meme.sdsc.edu/meme/website/meme.html
Protein Structure Prediction
Ab Initio<br />
>MyProtein<br />
MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWDKAA<br />
YTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQKNRITI<br />
NYFAMPPSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQVGEYFEECQ<br />
VYRIDHYLGKETVLNLLALRFANSLFVNNWDNRTIDHVEITV
Homology Modeling<br />
>MyProtein<br />
MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWD<br />
KAAYTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQ<br />
KNRITINYFAMPPSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQ<br />
VGEYFEECQVYRIDHYLGKETVLNLLALRFANSLFVNNWDNRTIDHVEITV
Threading / Fold Recognition<br />
>MyProtein<br />
MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWD<br />
KAAYTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQ<br />
KNRITINYFAMPPSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQ<br />
VGEYFEECQVYRIDHYLGKETVLNLLALRFANSLFVNNWDNRTIDHVEITV
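The `>MyProtein` records above are written in FASTA format: a header line starting with `>` followed by sequence lines of arbitrary width. As a minimal sketch of how such input is typically read before being passed to a prediction method, the hypothetical helper `parse_fasta` below joins the wrapped lines back into one sequence per header. The helper name and the dictionary return type are assumptions made for this example.

```python
def parse_fasta(text):
    """Return {header: sequence} for every record in a FASTA-formatted string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # New record: everything after '>' is used as the record name
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence line belonging to the current record
            records[header].append(line)
    # Join the wrapped lines of each record into one continuous sequence
    return {name: "".join(parts) for name, parts in records.items()}


if __name__ == "__main__":
    # First two lines of the record from the slide, wrapped at two different widths
    wrapped_a = (
        ">MyProtein\n"
        "MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWDKAA\n"
        "YTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQ\n"
    )
    wrapped_b = (
        ">MyProtein\n"
        "MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWD\n"
        "KAAYTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQ\n"
    )
    # Line wrapping does not change the parsed sequence
    print(parse_fasta(wrapped_a) == parse_fasta(wrapped_b))
```

Because line width is ignored, the differently wrapped copies of the record on these slides describe the same input sequence.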
Structure Alignment<br />
http://www.ebi.ac.uk/dali/index.html
RNA Secondary Structure
Vienna RNA Package<br />
RNAFold<br />
Michael Zuker's<br />
RNAstructure
DNA Structure<br />
• NewHelix, Freehelix (Dickerson)<br />
• 3-DNA (Olson)<br />
• Curves (Lavery)
Protein function: Protein Docking
<strong>Molecular</strong> Dynamics
<strong>Bioinformatics</strong> Databases
Access to Databases<br />
• Web interface: access to databases through a web interface is user-oriented. It lets users become familiar with the environment quickly and is recommended for sporadic and specific jobs.<br />
• Application Programming Interfaces (APIs): an exclusively machine-oriented tool for high-throughput jobs. They are rarely provided by databases.<br />
• SQL: a universal and very popular interface, both user-oriented and machine-oriented, that may be used for any kind of job. Not all <strong>Bioinformatics</strong> databases provide this service.
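To make the SQL access route concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The `protein` table, its columns, the `long_proteins` helper and the three rows (with made-up `TOY…` accessions) are all invented for illustration; the schema of a real bioinformatics database will differ and must be taken from its own documentation.

```python
import sqlite3


def long_proteins(conn, organism, min_length):
    """Return (accession, name) for proteins of one organism above a length cutoff."""
    cur = conn.execute(
        "SELECT accession, name FROM protein "
        "WHERE organism = ? AND length >= ? "
        "ORDER BY accession",
        (organism, min_length),  # parameters are bound, not pasted into the SQL
    )
    return cur.fetchall()


# Toy in-memory database standing in for a real SQL server (hypothetical schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE protein ("
    "accession TEXT PRIMARY KEY, name TEXT, organism TEXT, length INTEGER)"
)
conn.executemany(
    "INSERT INTO protein VALUES (?, ?, ?, ?)",
    [
        ("TOY0001", "toy kinase", "Homo sapiens", 350),    # invented row
        ("TOY0002", "toy protease", "Homo sapiens", 120),  # invented row
        ("TOY0003", "toy kinase", "Mus musculus", 348),    # invented row
    ],
)
conn.commit()

if __name__ == "__main__":
    print(long_proteins(conn, "Homo sapiens", 200))
```

The same query text can be typed by a person in an interactive client or issued by a script in a loop, which is the sense in which SQL serves both user-oriented and machine-oriented work.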
FTP access
FlatFiles
Databases Conflicts<br />
[Diagram labels: Gene Prediction, NCBI/EnsEMBL, SEQUENCE, mRNA, Mass Spectrometry, NCBI/EnsEMBL, SWISSPROT, ALIGNMENT, Crystal Diffraction, PDB]
DNA Databases
Major Public Databases<br />
European <strong>Molecular</strong> Biology Laboratory (EMBL)<br />
European <strong>Bioinformatics</strong> Institute (EBI)<br />
http://www.embl.org<br />
DNA Data Bank of Japan (DDBJ)<br />
http://www.ddbj.nig.ac.jp/<br />
National Center for Biotechnology<br />
Information (NCBI)<br />
http://www.ncbi.nlm.nih.gov/
Accessing genome sequence data<br />
ENSEMBL (EMBL + Sanger Institute)<br />
http://www.ensembl.org/<br />
GenBank (NCBI)<br />
http://www.ncbi.nlm.nih.gov/<br />
The UCSC Genome Browser<br />
(University of California, Santa Cruz)<br />
http://genome.ucsc.edu
UCSC Genome Browser
NCBI
We look forward to working with you all in the future to continue this<br />
tradition as the database continues to grow exponentially
Genome Browser<br />
Show Annotations<br />
Assembly<br />
(Genes, Conserved Regions,<br />
CpG Islands,…)
OMIM
PubMed http://www.pubmed.com
ENSEMBL
• Browsing & Entry Retrieval<br />
• Literature Databases<br />
• Microarray Databases<br />
• Nucleotide Databases<br />
• Protein Databases<br />
• Proteomic Databases<br />
• Structure Databases<br />
http://www.ebi.ac.uk
http://www.sanger.ac.uk/
TRANSFAC<br />
http://www.gene-regulation.com/pub/databases.html
Protein Databases
http://www.isb-sib.ch/<br />
http://www.expasy.org
Swiss-Prot<br />
TrEMBL
http://www.ebi.uniprot.org/index.shtml
Protein Data Bank http://www.pdb.org
PDB entry
http://www.expasy.org/prosite/
Protein families database<br />
of alignments and HMMs<br />
http://www.sanger.ac.uk/Software/Pfam/index.shtml
http://www.ebi.ac.uk/interpro/
http://www.cathdb.info/latest/index.html
Gene Ontology http://www.geneontology.org/
Databases<br />
Nucleotide Sequence Databases<br />
RNA sequence databases<br />
Protein sequence databases<br />
Structure Databases<br />
Genomics Databases (non-vertebrate)<br />
Metabolic and Signaling Pathways<br />
Human and other Vertebrate Genomes<br />
Human Genes and Diseases<br />
Microarray Data and other Gene Expression Databases<br />
Proteomics Resources<br />
Other <strong>Molecular</strong> Biology Databases<br />
Organelle databases<br />
Plant databases<br />
Immunological databases<br />
http://nar.oxfordjournals.org/content/vol33/suppl_1/index.dtl
State-of-the-art in <strong>Bioinformatics</strong><br />
Nature Biotech:<br />
• http://www.nature.com/nbt/index.html<br />
<strong>Bioinformatics</strong>:<br />
• http://bioinformatics.oxfordjournals.org/<br />
Nucleic Acids Research:<br />
• http://nar.oxfordjournals.org/<br />
Journal of <strong>Molecular</strong> Biology:<br />
• http://www.academicpress.com/jmb<br />
PLoS Computational Biology:<br />
• http://compbiol.plosjournals.org/<br />
BMC <strong>Bioinformatics</strong>:<br />
• http://www.biomedcentral.com/bmcbioinformatics
Institut Nacional de Bioinformàtica<br />
http://www.inab.org/
INB Map
http://genome.imim.es/software/geneid/geneid.html
http://bioinfo.ochoa.fib.es/
INB: BioMOBY
INB: Taverna http://taverna.sourceforge.net/