What is a domain?

Protein Engineering 15hp

Protein Structure Function

christofer@xray.bmc.uu.se


The Protein Domain


What is a domain?

An evolutionarily conserved, independently folding

unit within a protein.

A domain often, but not always, consists of a continuous segment

of amino acid sequence.

Domains vary in size, but are usually around 200 amino acids.

A domain can have an independent function or contribute to

the function of a multi-domain protein.


Some proteins are made of a single domain but the

majority of proteins are multi-domain proteins.

Single-domain protein

Multi-domain protein

N-terminal

Heterotrimeric protein

Linker regions

C-terminal


Alpha & Beta Domains


Alpha domains are protein domains

composed entirely of alpha helices.

Beta domains are protein domains

composed entirely of beta sheets.


Alpha domains

Myohemerythrin (2mhr)

http://www.pdb.org


Alpha domains

Myoglobin (1a6k)


Beta domains

Immunoglobulin Light chain (1cfv)


Beta domains

Neuraminidase One subunit (1a4q)


Beta domains

Transthyretin One subunit (1tta)


Beta domains

Satellite Tobacco Necrosis Virus Coat Protein (2buk)


Alpha/Beta, Alpha+Beta

& Cross-Linked Domains


Alpha/beta domains are protein domains

composed of beta strands connected by alpha

helices.

Alpha+Beta domains are protein domains

composed of separate alpha-helical and beta-sheet

regions.

Cross-linked domains are protein domains with

little or no secondary structure and stabilized by

disulphide bridges or metal ions.
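
The class definitions above can be turned into a rough rule of thumb. The following is only a minimal sketch: it assumes a per-residue secondary-structure string (H for helix, E for strand, anything else for coil), and the function name and both thresholds are illustrative assumptions rather than published cut-offs. It cannot separate alpha/beta from alpha+beta domains, which would need the order of the elements along the chain, and it cannot see disulphide bridges or metal ions.

```python
def classify_domain(ss, min_structured=0.2, pure=0.9):
    """Assign a rough structural class from a secondary-structure string.

    ss uses 'H' for helix, 'E' for strand and any other character for coil.
    min_structured and pure are illustrative assumptions, not standard values.
    """
    if not ss:
        raise ValueError("empty secondary-structure string")
    helix = ss.count("H")
    strand = ss.count("E")
    structured = helix + strand
    # Too few helix/strand residues: candidate for the cross-linked class.
    if structured / len(ss) < min_structured:
        return "little secondary structure"
    # Nearly all structured residues are helical.
    if helix / structured >= pure:
        return "alpha"
    # Nearly all structured residues are in strands.
    if strand / structured >= pure:
        return "beta"
    return "mixed alpha and beta"
```

For example, classify_domain("HHHHHHCCHHHHHH") returns "alpha".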


Alpha/Beta domains

Triosephosphate isomerase (TIM) One subunit (1tim)


Alpha/Beta domains

Aspartate semi-aldehyde dehydrogenase One domain (1brm)


Alpha+Beta domains

TATA-binding protein (1tgh)


Cross-linked domains

Neurotoxin from Brazilian scorpion Tityus serrulatus (1b7d)


Cross-linked domains

Human Zinc-finger DNA-binding domain (3znf)


Protein Interaction Domains


Protein Interaction Domains

Human C-SRC Tyrosine kinase SH2 domain (1shd)


Protein Interaction Domains

Human Calmodulin (1cll)


Protein Interaction Domains

Yeast Ski8p (1sq9)


CATH Protein Structure Classification

The CATH database is a hierarchical domain classification

of protein structures.

C class

Alpha domain, Beta domain, Alpha-Beta domain etc.

A architecture

Barrel, Sandwich, Propeller, Bundle etc.

http://www.cathdb.info

T topology (fold family)

Structures are grouped into fold groups at this level

depending on both the overall shape and connectivity of the

secondary structures.

H homologous superfamily

This level groups together protein domains which are

thought to share a common ancestor and can therefore be

described as homologous.
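
The four levels can be read directly from a dotted CATH code of the form C.A.T.H. The sketch below is a minimal illustration, not part of the CATH software: the names CathCode, parse_cath_code and deepest_shared_level are made up for this example, and the codes in the usage note only illustrate the format.

```python
from typing import NamedTuple

# Names of the four main CATH classes (first number of a CATH code).
CATH_CLASSES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathCode(NamedTuple):
    cath_class: int              # C
    architecture: int            # A
    topology: int                # T
    homologous_superfamily: int  # H


def parse_cath_code(code):
    """Split a dotted 'C.A.T.H' string into its four levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def deepest_shared_level(a, b):
    """Return the deepest CATH level at which two domains still agree."""
    levels = ["none", "class", "architecture", "topology",
              "homologous superfamily"]
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return levels[depth]
```

Two domains coded 3.20.20.70 and 3.20.20.140 agree down to the topology level but belong to different homologous superfamilies, so deepest_shared_level returns "topology" for them.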


The Universe of

Protein Structures


The number of protein folds is large but limited.

Protein structures are modular and proteins can

be grouped into families on the basis of the

domains they contain.

The modular nature of protein structure allows

for sequence insertions and deletions.


Domain superfamilies

Multi-domain proteins

Supradomain


Domain superfamilies

Different geometries

Different functions


Mycobacterium tuberculosis

IdeR Iron dependent Regulator


Hepatitis C virus

NS3 Protease/Helicase


Soybean

Lipoxygenase-1


Felis domesticus

Pyruvate kinase


Saccharomyces cerevisiae

FAS Fatty Acid Synthase (2uv8)


Why do proteins have domains?

Mix and match of domains, as an evolutionary process,

has given rise to the great diversity of proteins we see

today.

Folding a large chain as a single unit is more likely to

introduce incorrectly folded regions.

Folding as smaller, independent units is more energetically favorable.


Why predict domains?

Sequence alignments at a domain level can detect

homologous sequences otherwise hard to find.

Secondary structure prediction works better when

applied to single domains.

Can give insight into protein function.

Truncating at domain borders can help expression/

solubility/crystallization.

Dividing large proteins into domains may be necessary

to solve structure by X-ray crystallography or NMR.
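
Once domain borders are known or predicted, designing truncated constructs is a matter of slicing the sequence. The following is a minimal sketch under stated assumptions: borders are given as 1-based, inclusive residue numbers, and the function name domain_constructs and the flank parameter (extra residues kept on each side of a border) are hypothetical, introduced only for illustration.

```python
def domain_constructs(sequence, borders, flank=0):
    """Cut a protein sequence into one construct per domain.

    borders: list of (start, end) residue numbers, 1-based and inclusive.
    flank:   extra residues kept on each side of a border (assumption).
    Returns a list of (first_residue, last_residue, construct_sequence).
    """
    constructs = []
    for start, end in borders:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"border ({start}, {end}) lies outside the sequence")
        # Extend the construct by the flank, without leaving the chain.
        first = max(1, start - flank)
        last = min(len(sequence), end + flank)
        # Convert 1-based inclusive numbering to a Python slice.
        constructs.append((first, last, sequence[first - 1:last]))
    return constructs
```

In practice several flank values are usually worth trying, since predicted borders are rarely exact to the residue.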


Domain prediction - Theoretical approach

Soybean

Lipoxygenase-1

MFSAGHKIKGTVVLMPKNELEVNPDGSAVDNLNAFLGRSVSLQLISATKADAHGKGKVGKDTFLEGINT

SLPTLGAGESAFNIHFEWDGSMGIPGAFYIKNYMQVEFFLKSLTLEAISNQGTIRFVCNSWVYNTKLYKS

VRIFFANHTYVPSETPAPLVSYREEELKSLRGNGTGERKEYDRIYDYDVYNDLGNPDKSEKLARPVLGG

SSTFPYPRRGRTGRGPTVTDPNTEKQGEVFYVPRDENLGHLKSKDALEIGTKSLSQIVQPAFESAFDLK

STPIEFHSFQDVHDLYEGGIKLPRDVISTIIPLPVIKELYRTDGQHILKFPQPHVVQVSQSAWMTDEEFARE

MIAGVNPCVIRGLEEFPPKSNLDPAIYGDQSSKITADSLDLDGYTMDEALGSRRLFMLDYHDIFMPYVRQI

NQLNSAKTYATRTILFLREDGTLKPVAIELSLPHSAGDLSAAVSQVVLPAKEGVESTIWLLAKAYVIVNDSC

YHQLMSHWLNTHAAMEPFVIATHRHLSVLHPIYKLLTPHYRNNMNINALARQSLINANGIIETTFLPSKYS

VEMSSAVYKNWVFTDQALPADLIKRGVAIKDPSTPHGVRLLIEDYPYAADGLEIWAAIKTWVQEYVPLYYA

RDDDVKNDSELQHWWKEAVEKGHGDLKDKPWWPKLQTLEDLVEVCLIIIWIASALHAAVNFGQYPYGG

LIMNRPTASRRLLPEKGTPEYEEMINNHEKAYLRTITSKLPTLISLSVIEILSTHASDEVYLGQRDNPHWTS

DSKALQAFQKFGNKLKEIEEKLVRRNNDPSLQGNRLGPVQLPYTLLYPSSEEGLTFRGIPNSISI


Pfam - Protein families database

Bateman A, Coin L et al.

The Pfam protein families

database.

Nucleic Acids Res. 2004 Jan 1;32

(Database issue):D138-41.

http://www.sanger.ac.uk/software/pfam/


DomPred - Protein Domain Prediction Server

Marsden, McGuffin & Jones

Rapid protein domain assignment

from amino acid sequence using

predicted secondary structure.

Protein Science, 11 (2002),

2814-2824.

http://bioinf.cs.ucl.ac.uk/dompred/


Armadillo - Domain Linker Prediction

Dumontier M, Yao R et al.

Armadillo: domain boundary

prediction by amino acid

composition.

J Mol Biol. 2005 Jul 29;350(5):

1061-73.

http://armadillo.blueprint.org


IUPred - Dissecting proteins into ordered and disordered parts

Dosztanyi Z, Csizmok V et al.

IUPred: web server for the

prediction of intrinsically

unstructured regions of proteins

based on estimated energy

content.

Bioinformatics. 2005 Aug 15;

21(16):3433-4.

http://iupred.enzim.hu


FoldIndex - Finds unfolded regions in protein sequence

Prilusky J, Felder CE et al.

FoldIndex: a simple tool to

predict whether a given

protein sequence is

intrinsically unfolded.

Bioinformatics. 2005 Aug 15;

21(16):3435-8.

http://bip.weizmann.ac.il/fldbin/findex
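
The kind of calculation behind a tool like FoldIndex can be sketched in a few lines. This is a minimal sketch, not the server's implementation: it assumes the score 2.785<H> - |<R>| - 1.151, where <H> is the mean Kyte-Doolittle hydropathy rescaled to 0..1 and <R> is the mean net charge, with histidine counted as neutral. The function names and the window length of 51 are assumptions for illustration.

```python
# Kyte-Doolittle hydropathy scale (values range from -4.5 to 4.5).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Charge per residue at neutral pH; histidine is treated as neutral here.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def fold_index(segment):
    """Score one segment: positive suggests folded, negative unfolded."""
    if not segment:
        raise ValueError("empty segment")
    # Mean hydropathy, rescaled from -4.5..4.5 to 0..1.
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in segment) / len(segment)
    # Absolute value of the mean net charge.
    mean_r = abs(sum(CHARGE.get(aa, 0) for aa in segment)) / len(segment)
    return 2.785 * mean_h - mean_r - 1.151


def fold_index_profile(sequence, window=51):
    """Slide a window along the sequence and score every window position."""
    if window > len(sequence):
        raise ValueError("window is longer than the sequence")
    return [fold_index(sequence[i:i + window])
            for i in range(len(sequence) - window + 1)]
```

Long stretches of negative scores in the profile are candidate unfolded regions, which often coincide with linkers between domains.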


Domain prediction - Experimental approach

Felis domesticus

Pyruvate kinase

SKPHSDVGTAFIQTQQLHAAMADTFLEHMCRLDIDSPPITARNTGIICTIGPASRSVEILKEMI

KSGMNVARLNFSHGTHEYHAETIKNVRAATESFASDPIRYRPVAVALDTKGPEIRTGLIKGS

GTAEVELKKGATLKITLDNAYMEKCDENVLWLDYKNICKVVEVGSKVYVDDGLISLLVKEKG

ADFLVTEVENGGSLGSKKGVNLPGAAVDLPAVSEKDIQDLKFGVEQDVDMVFASFIRKASD

VHEVRKVLGEKGKNIKIISKIENHEGVRRFDEILEASDGIMVARGDLGIEIPAEKVFLAQKMMI

GRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCIMLSGETAKGDYPLEAV

RMQHLIAREAEAAMFHRKLFEELVRGSSHSTDLMEAMAMGSVEASYKCLAAALIVLTESG

RSAHQVARYRPRAPIIAVTRNHQTARQAHLYRGIFPVVCKDPVQEAWAEDVDLRVNLAMN

VGKARGFFKHGDVVIVLTGWRPGSGFTNTMRVVPVP


Limited proteolysis

Proteolysis

SDS-PAGE

HPLC-MS

SKPHSDVGTAFIQTQQLHAAMADTFLEHMCRLDIDSPPITARNTGIICTIGPASRSVEIL

KEMIKSGMNVARLNFSHGTHEYHAETIKNVRAATESFASDPIRYRPVAVALDTKG

Gao X, Bain K, Bonanno JB et al.

High-throughput limited proteolysis/

mass spectrometry for protein domain

elucidation.

J Struct Funct Genomics.

2005;6(2-3):129-34.

PEIRTGLIKGSGTAEVELKKGATLKITLDNAYMEKCDENVLWLDYKNICKVVE

VGSKVYVDDGLISLLVKEKGADFLVTEVENGGSLGSKKGVNLPGAAVDL

ELVRGSSHSTDLMEAMAMGSVEASYKCLAAALIVLTESGRSAHQVARYRPRAPIIAVTRNHQTARQAHLY

RGIFPVVCKDPVQEAWAEDVDLRVNLAMNVGKARGFFKHGDVVIVLTGWRPGSGFTNTMRVVPVP

PAVSEKDIQDLKFGVEQDVDMVFASFIRKASDVHEVRKVLGEKGKNIKIISKIENHEGVRRFDEILEASDGIMVARGDLGIEIPAEKVFLAQ

KMMIGRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCIMLSGETAKGDYPLEAVRMQHLIAREAEAAMFHRKLFE
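
Fragments identified after limited proteolysis can be mapped back onto the full-length sequence to read off candidate domain borders. The sketch below assumes each fragment is an exact substring of the full-length sequence; the function names and the short toy sequence in the usage note are made up for illustration and are not taken from the cited paper.

```python
def locate_fragments(full_length, fragments):
    """Find each fragment in the full-length sequence.

    Returns (start, end, fragment) tuples in 1-based, inclusive residue
    numbering, sorted by start position.
    """
    hits = []
    for frag in fragments:
        pos = full_length.find(frag)  # 0-based index, -1 if absent
        if pos == -1:
            raise ValueError(f"fragment not found: {frag[:10]}...")
        hits.append((pos + 1, pos + len(frag), frag))
    return sorted(hits)


def cleavage_sites(hits):
    """Residue numbers after which the chain was cut.

    Assumes the sorted fragments follow one another along the chain, so
    every fragment end except the last one marks a cut.
    """
    return [end for _, end, _ in hits[:-1]]
```

With the toy sequence MKTAYIAKQRQISFVKSHFSRQ and the fragments SFVKSHFSRQ, MKTAYIA and KQRQI, the fragments map to residues 1-7, 8-12 and 13-22, giving cuts after residues 7 and 12.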


GFP-fusion

Nuclease treatment

Culture plate

Expression vectors

Hart DJ, Tarendeau F.

Combinatorial library approaches for

improving soluble protein expression

in Escherichia coli.

Acta Crystallogr D Biol Crystallogr.

2006 Jan;62(Pt 1):19-26.

E. coli bacteria


CoFi-blot

Erase-a-base process


CoFi-blot

Expression vectors

Filters

E. coli bacteria

Colony plate

Cornvik T, Dahlroth SL, Magnusdottir A, Herman MD,

Knaust R, Ekberg M, Nordlund P.

Colony filtration blot: a new screening method for

soluble protein expression in Escherichia coli.

Nat Methods. 2005 Jul;2(7):507-9.


Textbook

Petsko & Ringe Protein Structure and Function

1-14 The Protein Domain (p30-31)

1-15 The Universe of Protein Structures (p32-33)

1-17 Alpha Domains and Beta Domains (p36-37)

1-18 Alpha/Beta, Alpha+Beta

and Cross-Linked Domains (p38-39)

3-1 Protein Interaction Domains (p88-89)
