21.05.2014 Views

Ensemble refinement of protein crystal structures in PHENIX


Tom Burnley | Piet Gros


Incomplete modelling of disorder contributes to the R-factor gap

Only ~5% of residues in the PDB are modelled with more than one conformation (X-ray structures)

Multiple discrete models are restricted by the increase in the number of model parameters

Vitkup et al. (2002), Lang et al. (2010)

Molecular dynamics simulations produce a Boltzmann-weighted population of inter-related structures

MD simulations can be restrained with X-ray data


Simulated annealing/MD refinement

Geometric restraints: $E_{\text{bond}}$, $E_{\text{angle}}$, $E_{\text{dih}}$, …

X-ray restraint:

$$E_{\text{X-ray}} = \sum_{hkl} w_{hkl} \left( |F_{\text{obs}}(hkl)| - k\,|F_{\text{calc}}(hkl)| \right)^2$$

Simulation temperature > 2000 K

Trajectory resolves local minima

'Final model' = end structure

Brunger et al., 1987
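As a rough illustration of the X-ray restraint term on this slide, the Python sketch below evaluates the weighted least-squares residual over a handful of reflections. The function name `xray_restraint_energy`, the dictionary layout and the toy numbers are assumptions made for this example; they are not taken from the slides or from PHENIX. The deck later lists a maximum-likelihood target function for the PHENIX implementation, so this shows only the least-squares form written here.

```python
def xray_restraint_energy(f_obs, f_calc, weights, k=1.0):
    """Return sum_hkl w_hkl * (|F_obs(hkl)| - k * |F_calc(hkl)|)**2.

    f_obs   : dict mapping (h, k, l) -> observed amplitude (float)
    f_calc  : dict mapping (h, k, l) -> calculated structure factor (complex)
    weights : dict mapping (h, k, l) -> weight w_hkl (float)
    k       : overall scale factor between observed and calculated amplitudes
    """
    energy = 0.0
    for hkl, amplitude_obs in f_obs.items():
        amplitude_calc = abs(f_calc[hkl])  # modulus of the complex F_calc
        residual = amplitude_obs - k * amplitude_calc
        energy += weights[hkl] * residual ** 2
    return energy


# Toy data (hypothetical values, three reflections only).
f_obs = {(1, 0, 0): 10.0, (0, 1, 0): 5.0, (0, 0, 1): 2.0}
f_calc = {(1, 0, 0): 6 + 8j, (0, 1, 0): 3 + 4j, (0, 0, 1): 0 + 1j}
weights = {(1, 0, 0): 1.0, (0, 1, 0): 1.0, (0, 0, 1): 2.0}

energy = xray_restraint_energy(f_obs, f_calc, weights, k=1.0)
energy_rescaled = xray_restraint_energy(f_obs, f_calc, weights, k=2.0)
```

In the toy data only the third reflection disagrees with the model at k = 1, so it alone contributes to the energy. Changing k shifts the residual of every reflection at once.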


"Time-averaged" MD refinement

Geometric restraints: $E_{\text{bond}}$, $E_{\text{angle}}$, $E_{\text{dih}}$, …

X-ray restraint:

$$E_{\text{X-ray}} = \sum_{hkl} w_{hkl} \left( |F_{\text{obs}}(hkl)| - k\,|F_{\text{calc}}(hkl)| \right)^2$$

Sampling: running average of $F_{\text{calc}}$

$$\langle F_{\text{calc}} \rangle_{t} = \left(1 - e^{-\Delta t/\tau_x}\right) F_{\text{calc}}(t) + e^{-\Delta t/\tau_x}\, \langle F_{\text{calc}} \rangle_{t-\Delta t}$$

Simulation temperature = 300 K

'Final model' = trajectory ensemble

Gros et al., 1990
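The running-average update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `update_running_average`, the list-of-complex representation and the values of `dt` and `tau_x` are hypothetical choices for the example, not PHENIX internals.

```python
import math


def update_running_average(f_avg_prev, f_now, dt, tau_x):
    """One step of the exponential running average of F_calc.

    f_avg_prev : running averages from the previous step (list of complex)
    f_now      : instantaneous F_calc at the current step (list of complex)
    dt         : MD time step (same time unit as tau_x)
    tau_x      : averaging (relaxation) time
    """
    decay = math.exp(-dt / tau_x)
    return [(1.0 - decay) * now + decay * prev
            for prev, now in zip(f_avg_prev, f_now)]


# Toy trajectory: one reflection whose instantaneous F_calc jumps from 0 to 1.
dt, tau_x = 0.1, 1.0  # hypothetical values, arbitrary time units
f_avg = [0j]
history = []
for _ in range(50):
    f_avg = update_running_average(f_avg, [1 + 0j], dt, tau_x)
    history.append(f_avg[0].real)
```

The average approaches the new value on a time scale set by τ_x: a larger τ_x keeps more of the trajectory's history in the restraint, a smaller one follows the instantaneous structure more closely.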


Ensemble refinement with TA restraints

(1) Start: 'Traditional' structure

(2) Fit TLS / remove alt. conf.

(3) TA-restrained MD simulation

Collect one structure per 0.04 ps

X-ray restraints accelerate sampling

(4) Final ensemble


Molecular disorder / lattice disorder

Protein with local disorder

Perfect crystal lattice

'Real' crystal lattice

Averaged data

Deconvolute: molecular disorder from lattice disorder


Rigid body disorder modelled with TLS

• Lattice distortions, domain motions...
• Model rigid body motions with TLS model

Extract core rigid body motion by excluding atoms with large local fluctuations (defined as deviations from $B_{\mathrm{TLS}}$).


Local disorder sampled within restrained MD

• Lattice distortions, domain motions...
• Model rigid body motions with TLS model
  – 20 parameters per group
  – 1 group / domain
• Fit TLS to starting structure B-factors

• Side-chains, loops...
• Local disorder sampled in MD
• MD simulation restrained with X-ray data


Composite B-factor: sum of TLS and atomic fluctuations

• Start – Input single-structure ADPs
• TLS – Core TLS model
• RMSF – Atomic fluctuations in ensemble
• TLS+RMSF – Total disorder in ensemble
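To make the TLS+RMSF sum concrete, the sketch below converts an atom's positional spread across the ensemble into a B-factor equivalent and adds it to a TLS contribution. It assumes isotropic fluctuations and the standard relation B = (8π²/3)·RMSF². The slides do not state how the sum was computed, so the function names (`rmsf`, `b_from_rmsf`, `composite_b`) and the toy coordinates are illustrative assumptions only.

```python
import math


def rmsf(positions):
    """Root-mean-square fluctuation of one atom over an ensemble.

    positions : list of (x, y, z) tuples in angstroms, one per ensemble model
    """
    n = len(positions)
    mean = [sum(p[i] for p in positions) / n for i in range(3)]
    msd = sum(sum((p[i] - mean[i]) ** 2 for i in range(3))
              for p in positions) / n
    return math.sqrt(msd)


def b_from_rmsf(rmsf_value):
    """Isotropic B-factor equivalent (A^2): B = (8 * pi^2 / 3) * RMSF^2."""
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_value ** 2


def composite_b(b_tls, positions):
    """Total disorder: TLS contribution plus ensemble fluctuation contribution."""
    return b_tls + b_from_rmsf(rmsf(positions))


# Toy atom (hypothetical coordinates): two models displaced by +/-0.5 A along x.
models = [(0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)]
atom_rmsf = rmsf(models)
total_b = composite_b(b_tls=10.0, positions=models)
```

Under these assumptions, an atom that barely moves in the ensemble keeps a total B-factor close to its TLS value, while a flexible side chain gains a sizeable RMSF term on top.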


Dual explicit-bulk solvent model

- Explicit solvent
  - Model with explicit atoms
  - Water picked every 250 steps
  - "Standard" rules: > 3 σ in difference map, < 3 Å distances
  - B-factor from nearest TLS group
- Bulk solvent
  - Model with 'density mask'

$$\langle F_{\text{mask}} \rangle_{t} = \left(1 - e^{-\Delta t/\tau_x}\right) F_{\text{mask}}(t) + e^{-\Delta t/\tau_x}\, \langle F_{\text{mask}} \rangle_{t-\Delta t}$$


Development of ensemble refinement

• PHENIX
• Time-averaged X-ray restrained MD
• TLS fitting
• Explicit- and bulk-solvent model
• Maximum-likelihood target function


Development of ensemble refinement

• Tested with 20 datasets
• Resolution: 1-3 Å
• ASU size: 50-1000 residues
• CPU time: 7-100 hours
• 50-500 models / ensemble


Ensemble refinement reduces Rfree

• Rfree: ensemble vs phenix.refine
• Rfree reduced in all cases
  -4.9% (max)
  -0.3% (min)
  -1.8% (mean)
• Rfree/Rwork ratio (mean):
  = 1.23 phenix.refine
  = 1.25 ensemble
• Rfree: ensemble vs PDB_REDO
  -5.1% (max)
  +0.6% (min)
  -2.1% (mean)

Joosten et al. (2011)


B factor

5-20 Å²

1uoy.pdb | 188 ensemble | 40 ps acquisition time | 1 tls group


1uoy.pdb | phenix.refine | 1 tls group | mFo-DFc ±0.49 e/Å³ (3.00 σ)


1uoy.pdb | 188 ensemble | 1 tls group | mFo-DFc ±0.49 e/Å³ (4.27 σ)


Improved real-space correlation

Burling et al. (1996):

Excellent experimentally phased data for MBP: 1YTT (1.8 Å res.)

Ensemble

phenix.refine

B-factor


Disordered side chain in MBP (1YTT)

Gln167 (A)

Glu117 (A)

phenix.refine

Diff. vector map (3σ)

Ensemble

Diff. vector map (3σ)


Anisotropic side chain in MBP (1YTT)

Burling et al. (1996)

Phe121 (A)

Experimental map (1.4σ)

Multi-conformer (Rfree 20.3%)

Ensemble (Rfree 17.4%)


Diffuse solvent in MBP (1YTT)

Burling et al. (1996)

Published: experimental map (1.4σ and 0.7σ)

Ensemble: experimental map (1.4σ) (0.7σ)


Explicit waters in MBP (1YTT)


1IEP (2.1 Å res.)

How similar are NCS copies?


NCS copies show similar distributions

1IEP (2.1 Å res.)

- Global TLS model accounts for differences in packing of copies
- Local fluctuations are similar between the two copies


1M52 (2.6 Å res.)


2XFA (2.1 Å res.)


Isomorphous crystals at 100 and 288 K

3K0N & 3K0M: Proline isomerase (Cyclophilin A)

Fraser et al. (2009)


Multi-conformers in active site

3K0M (1.3 Å res.), 100 K

Leu98, Ser99, Phe113

3K0N (1.4 Å res.), 288 K

Leu98, Ser99, Phe113

A:B = 2:1

Ensemble refinement: Rfree gains of 1.3% and 2.6%

B-factor: 5-25 Å²

Fraser et al. (2009), Eisenmesser et al. (2005)


Occupancies agree with NMR data

3K0N & 3K0M: Proline isomerase

• NMR relaxation dispersion (283 K)
  L98, S99, F113
  Minor population ~10%

Ensemble (288 K): L98, S99, F113

B-factor: 5-25 Å²

Fraser et al. (2009), Eisenmesser et al. (2005)


Imatinib-ABL Tyrosine Kinase (1IEP)

Nagar et al. (2002), Johnson (2009)


Inhibitor binding to HIV protease

apo-protease (2PC0)

complex with JE-2147 (1KZK)

Comparative analysis of atomic flexibility:

- Isomorphous crystals eliminate differences due to crystal contacts
- Non-isomorphous crystals allow evaluation of crystal contact effects


Conclusions

• Global disorder modelled by TLS and local disorder by MD
• Ensemble refinement improves Rfree and electron density maps
• Suitable for a broad resolution range (1-3 Å)
• NCS copies show very similar fluctuations
• Clear representation of local disorder / uncertainty
• Distribution of atom positions allows further structural analysis
• Resolve the finer details of protein structure(s)


Acknowledgements

Piet Gros

Gros Laboratory

Pavel Afonine, Paul Adams

PHENIX Developers

Funding: Utrecht University / NWO


Number of structures in ensemble vs resolution


Number of structures in the ensemble

Number of structures vs averaging time $\tau_x$

Structures taken equidistant in time to reproduce the Rwork within 0.1%
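One way to read the selection rule above is as a search for the coarsest equidistant subsample whose ensemble-averaged structure factors reproduce the R-factor of the full trajectory. The sketch below follows that reading. It is an assumption-laden illustration: the helper names, the unit scale factor, the interpretation of 0.1% as an absolute difference of 0.001 in R, and the toy data are all hypothetical, and the actual PHENIX procedure may differ.

```python
def r_factor(f_obs, f_calc):
    """R = sum | |F_obs| - |F_calc| | / sum |F_obs| (scale factor taken as 1)."""
    numerator = sum(abs(fo - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def ensemble_f_calc(models):
    """Average the complex F_calc over models, reflection by reflection."""
    n = len(models)
    return [sum(m[i] for m in models) / n for i in range(len(models[0]))]


def smallest_equidistant_subset(f_obs, trajectory, tolerance=0.001):
    """Return the coarsest equidistant subsample whose R matches the full set.

    trajectory : list of models, each a list of complex F_calc values
    tolerance  : allowed |R_subset - R_full| (0.001 = 0.1 percentage points)
    """
    r_full = r_factor(f_obs, ensemble_f_calc(trajectory))
    for stride in range(len(trajectory), 0, -1):  # coarsest stride first
        subset = trajectory[::stride]
        r_sub = r_factor(f_obs, ensemble_f_calc(subset))
        if abs(r_sub - r_full) <= tolerance:
            return subset, r_sub, r_full
    return trajectory, r_full, r_full  # stride 1 always matches


# Toy trajectory (hypothetical): eight models alternating between two states.
state_a = [10 + 0j, 0 + 5j]
state_b = [8 + 0j, 0 + 7j]
trajectory = [state_a, state_b] * 4
f_obs = [9.0, 6.0]

subset, r_sub, r_full = smallest_equidistant_subset(f_obs, trajectory)
```

In this toy case a single model cannot reproduce the averaged data, but two models that cover both states can, so the search stops at a two-model subset.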


R-factor by resolution shell

1UOY


Partial ligand/ion binding

B-factor of Cd²⁺ ion more than twice that of its surroundings

Simulation run at Q=1 for Cd²⁺ ion (atoms coloured by kinetic energy)

- High B-factor ligand/ion may indicate partial occupancy


Partial ligand/ion binding

Cd²⁺: Q=1

Cd²⁺: Q=0.7

- Heat map (kinetic energy) is a validation tool for ensemble refinement


Ensemble

Single-structure


Geometric validation

                                      Bonds      Angles   Dih. angles
Single structure                      0.010 Å    1.23°    15.1°
Ens: √⟨(x_ideal − ⟨x_model⟩)²⟩        −0.002     −0.28    −6.5
Ens: √⟨(x_ideal − x_model)²⟩          +0.002     +0.30    +4.0

(statistics for all 20 cases)
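The two ensemble rows differ only in where the averaging over models happens. A small Python sketch, using hypothetical function names and made-up bond lengths, shows the distinction: the first measure averages each geometric quantity over the ensemble before comparing it with the ideal value, while the second compares every model individually.

```python
import math


def rmsd_of_ensemble_mean(ideal, values_per_model):
    """sqrt(<(x_ideal - <x_model>)^2>): average each restraint over models first."""
    n_models = len(values_per_model)
    squared = []
    for j, x_ideal in enumerate(ideal):
        x_mean = sum(model[j] for model in values_per_model) / n_models
        squared.append((x_ideal - x_mean) ** 2)
    return math.sqrt(sum(squared) / len(squared))


def rmsd_of_individual_models(ideal, values_per_model):
    """sqrt(<(x_ideal - x_model)^2>): every model contributes its own deviation."""
    squared = [(x_ideal - model[j]) ** 2
               for model in values_per_model
               for j, x_ideal in enumerate(ideal)]
    return math.sqrt(sum(squared) / len(squared))


# Toy example (hypothetical numbers): two bonds, two ensemble models.
ideal_bonds = [1.50, 1.30]
ensemble_bonds = [
    [1.52, 1.30],  # model 1
    [1.48, 1.30],  # model 2
]
mean_first = rmsd_of_ensemble_mean(ideal_bonds, ensemble_bonds)
per_model = rmsd_of_individual_models(ideal_bonds, ensemble_bonds)
```

Because opposite deviations cancel when the mean is taken first, the first measure can never exceed the second. That ordering matches the signs in the table, if its ensemble entries are read as differences relative to the single-structure row.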


Ramachandran analysis

1UOY (1.5 Å res.)

Outliers

Allowed

Favoured


Outliers occur more in flexible regions

X-ray ensemble

1BV1 (2.0 Å res.)

NMR ensemble


Ramachandran deviations by resolution

Single structure | Ensemble modal | Ensemble mean

"Outliers" | "Allowed"

- More outliers are observed at lower resolution
- Geometric quality correlates with Rfree
- "Best" run selected by Rfree

