05.06.2015 Views

PPT - Molecular Modelling & Bioinformatics Group


<strong>Bioinformatics</strong><br />

Structural and functional prediction<br />

Master in <strong>Molecular</strong> Biotechnology 2010-11<br />

Ramon Goni<br />

rgoni@mmb.pcb.ub.es


Outline<br />

• Introduction<br />

• Biological Databases<br />

• Sequence Comparison<br />

• 3D Structure visualization<br />

• Functional Prediction<br />

• Structural Prediction<br />

http://mmb.pcb.ub.es/MBIOTEC/


Material and Evaluation<br />

• Exercises and slides<br />

– Campus Virtual<br />

– http://mmb.pcb.ub.es/MBIOTEC<br />

• Evaluation.<br />

– Practical test on Campus Virtual.


<strong>Bioinformatics</strong>: Overview


• <strong>Bioinformatics</strong> or computational biology is the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biological problems.<br />

• A common thread in projects in bioinformatics and computational biology is the use of mathematical tools to extract useful information from noisy data produced by high-throughput biological techniques. (The field of data mining overlaps with computational biology in this regard.)<br />

• Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution.<br />

• The terms bioinformatics and computational biology are often used interchangeably, although the latter typically focuses on algorithm development and specific computational methods.


• <strong>Bioinformatics</strong>: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.<br />

• Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.


The Human Genome Project


Tools for <strong>Bioinformatics</strong><br />

Genome<br />

sequencing<br />

• DNA Sequence<br />

Biostatistics<br />

Genomic<br />

data<br />

• Gene & Genome<br />

Organization<br />

Protein<br />

structure<br />

prediction<br />

• <strong>Molecular</strong> Evolution<br />

analysis<br />

Protein<br />

docking<br />

• RNA/Protein<br />

Structure, Function<br />

& Interaction<br />

<strong>Molecular</strong><br />

dynamics<br />

Virtual<br />

screening<br />

• Metabolic Pathways<br />

MicroArrays<br />

• Regulation,<br />

Signaling &<br />

Networks


<strong>Bioinformatics</strong><br />

Computer Tools<br />

Models<br />

Experimental<br />

Work<br />

Biology<br />

<strong>Bioinformatics</strong><br />

Work<br />

Computer<br />

Science<br />

Knowledge<br />

Databases


Benefits of Biological Databases<br />

• The growth of data makes it unviable to work with it manually.<br />

• Automated methods require a prior step to store the information.<br />

• Working with automated tools increases the amount of data.


Growth of the<br />

Databases<br />

Data<br />

Information<br />

Information<br />

Data


<strong>Bioinformatics</strong> Tools


<strong>Bioinformatics</strong> Tools<br />

Linux<br />

Perl


Sequence Alignment<br />

tcctctgcctctgccatcat---caaccccaaagt<br />

|||| ||| ||||| |||||   ||||||||||||<br />

tcctgtgcatctgcaatcatgggcaaccccaaagt<br />

Hidden Markov Model<br />

Dynamic Programming
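
The slide names dynamic programming as one way to obtain an alignment like the one shown above. A minimal sketch of global alignment (the Needleman-Wunsch scheme) in Python could look as follows. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration and are not taken from the course material; real tools use more elaborate scoring.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


# The two DNA fragments from the slide, with the gap characters removed.
seq_a = "tcctctgcctctgccatcatcaaccccaaagt"
seq_b = "tcctgtgcatctgcaatcatgggcaaccccaaagt"

best_score, aligned_a, aligned_b = needleman_wunsch(seq_a, seq_b)
print(aligned_a)
print(aligned_b)
print("score:", best_score)
```

With a linear gap penalty the three gap positions are not guaranteed to be contiguous as they are on the slide; an affine gap penalty (separate costs for opening and extending a gap) is the usual way to favor a single longer gap.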


Blosum Matrix<br />

Protein Sequence Alignment


Multiple Alignment<br />

ClustalW http://www.ebi.ac.uk/clustalw/


BLAST


GenScan<br />

http://genes.mit.edu/GENSCAN.html


GenScan


http://meme.sdsc.edu/meme/website/meme.html


Protein Structure Prediction


Ab Initio<br />

>MyProtein<br />

MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWDKAA<br />

YTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQKNRITI<br />

NYFAMPPSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQVGEYFEECQ<br />

VYRIDHYLGKETVLNLLALRFANSLFVNNWDNRTIDHVEITV


Homology Modeling<br />

>MyProtein<br />

MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWD<br />

KAAYTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQ<br />

KNRITINYFAMPPSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQ<br />

VGEYFEECQVYRIDHYLGKETVLNLLALRFANSLFVNNWDNRTIDHVEITV


Threading / Fold Recognition<br />

>MyProtein<br />

MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWD<br />

KAAYTKVVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQ<br />

KNRITINYFAMPPSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQ<br />

VGEYFEECQVYRIDHYLGKETVLNLLALRFANSLFVNNWDNRTIDHVEITV
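
The three prediction slides above all start from the same FASTA record (`>MyProtein`). As a rough sketch of how such an input might be read before it is passed to any prediction tool, the following Python snippet parses FASTA text and counts residues. The function names `parse_fasta` and `residue_counts` are hypothetical helpers introduced here for illustration, and only the first sequence line from the slide is used to keep the example short.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict {record name: sequence}.

    The record name is the first word after '>' on the header line.
    """
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {key: "".join(lines) for key, lines in chunks.items()}


def residue_counts(sequence):
    """Count how often each residue letter occurs in a sequence."""
    return Counter(sequence)


# First sequence line of the slide's record (the full record is longer).
example = """>MyProtein
MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWDKAA
"""

records = parse_fasta(example)
fragment = records["MyProtein"]
print(len(fragment), "residues read")
print(residue_counts(fragment).most_common(3))
```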


Structure Alignment<br />

http://www.ebi.ac.uk/dali/index.html


RNA Secondary Structure


Vienna RNA Package<br />

RNAFold<br />

Michael Zuker's<br />

RNAstructure


DNA Structure<br />

• NewHelix, Freehelix (Dickerson)<br />

• 3-DNA (Olson)<br />

• Curves (Lavery)


Protein function: Protein Docking


<strong>Molecular</strong> Dynamics


<strong>Bioinformatics</strong> Databases


Access to Databases<br />

Application Programming Interfaces (APIs) are an exclusively machine-oriented tool for high-throughput jobs. They are rarely provided by databases.<br />

Access to databases through a web interface is user-oriented. It lets users become familiar with the environment quickly and is recommended for sporadic and specific jobs.<br />

Not all the <strong>Bioinformatics</strong> databases provide this service.<br />

SQL is a universal and very popular interface, both user-oriented and machine-oriented. It may be used for any kind of job.
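
To illustrate SQL used as a machine-oriented interface, here is a small sketch with Python's built-in `sqlite3` module and an in-memory database. The table name, its columns, and the `DEMO…` accessions are invented for this example and do not correspond to the schema or contents of any real bioinformatics database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'protein' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE protein ("
        "accession TEXT PRIMARY KEY, organism TEXT, length INTEGER)"
    )
    # Made-up rows, for illustration only.
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [
            ("DEMO0001", "Homo sapiens", 213),
            ("DEMO0002", "Mus musculus", 187),
            ("DEMO0003", "Homo sapiens", 542),
        ],
    )
    conn.commit()
    return conn


def proteins_longer_than(conn, organism, min_length):
    """Return (accession, length) rows for one organism, longest first."""
    cursor = conn.execute(
        "SELECT accession, length FROM protein "
        "WHERE organism = ? AND length > ? "
        "ORDER BY length DESC",
        (organism, min_length),  # parameters are bound, not string-formatted
    )
    return cursor.fetchall()


conn = build_demo_db()
print(proteins_longer_than(conn, "Homo sapiens", 200))
```

The same query text could be typed by a person in an interactive SQL client or issued by a script in a loop, which is the sense in which the slide calls SQL both user-oriented and machine-oriented.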


FTP access


FlatFiles


Databases<br />

Conflicts<br />

Gene<br />

Prediction<br />

NCBI/<br />

EnsEMBL<br />

SEQUENCE<br />

mRNA<br />

Mass<br />

Spectrometry<br />

NCBI/<br />

EnsEMBL<br />

SWISSPROT<br />

ALIGNMENT<br />

Crystal<br />

Diffraction<br />

PDB


DNA Databases


Major Public Databases<br />

European <strong>Molecular</strong> Biology Laboratory (EMBL)<br />

European <strong>Bioinformatics</strong> Institute (EBI)<br />

http://www.embl.org<br />

DNA Data Bank of<br />

Japan (DDBJ)<br />

http://www.ddbj.nig.ac.jp/<br />

National Center for Biotechnology<br />

Information (NCBI)<br />

http://www.ncbi.nlm.nih.gov/


Accessing genome sequence data<br />

ENSEMBL (EMBL + Sanger Institute)<br />

http://www.ensembl.org/<br />

GenBank (NCBI)<br />

http://www.ncbi.nlm.nih.gov/<br />

The UCSC Genome Browser<br />

(University of California, Santa Cruz)<br />

http://genome.ucsc.edu


UCSC Genome Browser


NCBI


We look forward to working with you all in the future to continue this<br />

tradition as the database continues to grow exponentially


Genome Browser<br />

Show Annotations<br />

Assembly<br />

(Genes, Conserved Regions,<br />

CpG Islands,…)


OMIM


PubMed http://www.pubmed.com


ENSEMBL


• Browsing & Entry Retrieval<br />

• Literature Databases<br />

• Microarray Databases<br />

• Nucleotide Databases<br />

• Protein Databases<br />

• Proteomic Databases<br />

• Structure Databases<br />

http://www.ebi.ac.uk


http://www.sanger.ac.uk/


TRANSFAC<br />

http://www.gene-regulation.com/pub/databases.html


Protein Databases


http://www.isb-sib.ch/<br />

http://www.expasy.org


Swiss-Prot<br />

TrEMBL


http://www.ebi.uniprot.org/index.shtml


Protein Data Bank http://www.pdb.org


PDB entry


http://www.expasy.org/prosite/


Protein families database<br />

of alignments and HMMs<br />

http://www.sanger.ac.uk/Software/Pfam/index.shtml


http://www.ebi.ac.uk/interpro/


http://www.cathdb.info/latest/index.html


Gene Ontology http://www.geneontology.org/


Databases<br />

Nucleotide Sequence Databases<br />

RNA sequence databases<br />

Protein sequence databases<br />

Structure Databases<br />

Genomics Databases (non-vertebrate)<br />

Metabolic and Signaling Pathways<br />

Human and other Vertebrate Genomes<br />

Human Genes and Diseases<br />

Microarray Data and other Gene Expression Databases<br />

Proteomics Resources<br />

Other <strong>Molecular</strong> Biology Databases<br />

Organelle databases<br />

Plant databases<br />

Immunological databases<br />

http://nar.oxfordjournals.org/content/vol33/suppl_1/index.dtl


State-of-the-art in <strong>Bioinformatics</strong><br />

Nature Biotech:<br />

• http://www.nature.com/nbt/index.html<br />

<strong>Bioinformatics</strong>:<br />

• http://bioinformatics.oxfordjournals.org/<br />

Nucleic Acids Research:<br />

• http://nar.oxfordjournals.org/<br />

Journal of <strong>Molecular</strong> Biology:<br />

• http://www.academicpress.com/jmb<br />

PLoS Computational Biology:<br />

• http://compbiol.plosjournals.org/<br />

BMC <strong>Bioinformatics</strong>:<br />

• http://www.biomedcentral.com/bmcbioinformatics


Institut Nacional de Bioinformàtica<br />

http://www.inab.org/


INB Map


http://genome.imim.es/software/geneid/geneid.html


http://bioinfo.ochoa.fib.es/


INB: BioMOBY


INB: Taverna http://taverna.sourceforge.net/
