Ensemble refinement of protein crystal structures in PHENIX
Ensemble refinement of protein crystal structures in PHENIX
Ensemble refinement of protein crystal structures in PHENIX
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> <strong>of</strong> <strong>prote<strong>in</strong></strong> <strong>crystal</strong> <strong>structures</strong> <strong>in</strong> <strong>PHENIX</strong> <br />
Tom Burnley | Piet Gros
Incomplete modell<strong>in</strong>g <strong>of</strong> disorder contributes to R factor gap <br />
Only ~5% <strong>of</strong> residues <strong>in</strong> the PDB are modelled with more than <br />
one conforma=on (x-‐ray <strong>structures</strong>) <br />
Mul=ple discrete models restricted due to <strong>in</strong>crease <strong>in</strong> number <strong>of</strong> <br />
model parameters <br />
Vitkup et al., (2002) Lang et al., (2010)
Incomplete modell<strong>in</strong>g <strong>of</strong> disorder contributes to R factor gap <br />
Only ~5% <strong>of</strong> residues <strong>in</strong> the PDB are modelled with more than <br />
one conforma=on (x-‐ray <strong>structures</strong>) <br />
Mul=ple discrete models restricted due to <strong>in</strong>crease <strong>in</strong> number <strong>of</strong> <br />
model parameters <br />
Molecular dynamics simula=ons produce a Boltzmann-‐weighted <br />
popula=on <strong>of</strong> <strong>in</strong>ter-‐related <strong>structures</strong> <br />
MD simula=ons can be restra<strong>in</strong>ed with x-‐ray data
Simulated Anneal<strong>in</strong>g/MD <strong>ref<strong>in</strong>ement</strong> <br />
Geometric restra<strong>in</strong>ts: <br />
E bond <br />
E angle <br />
E dih <br />
… <br />
X-‐ray restra<strong>in</strong>t: <br />
E = ∑ X −ray<br />
whkl<br />
(| Fobs(<br />
hkl) | −k<br />
| Fcalc(<br />
hkl)<br />
|)<br />
hkl<br />
2<br />
Simula,on temperature >2000K <br />
Trajectory resolves local m<strong>in</strong>ima <br />
‘F<strong>in</strong>al model’ = end structure <br />
Brunger et al., 1987
“Time-‐averaged” MD <strong>ref<strong>in</strong>ement</strong> <br />
Geometric restra<strong>in</strong>ts: <br />
E bond <br />
E angle <br />
E dih <br />
… <br />
X-‐ray restra<strong>in</strong>t: <br />
∑<br />
E X−ray<br />
= w hkl<br />
(| F obs<br />
(hkl) | −k | F calc<br />
(hkl) |) 2<br />
hkl<br />
Sampl<strong>in</strong>g <br />
F calc t<br />
= (1− e −Δt/τ x )F t calc<br />
+ e −Δt/τ x<br />
F calc t−Δt<br />
Gros et al., 1990
“Time-‐averaged” MD <strong>ref<strong>in</strong>ement</strong> <br />
Geometric restra<strong>in</strong>ts: <br />
E bond <br />
E angle <br />
E dih <br />
… <br />
X-‐ray restra<strong>in</strong>t: <br />
∑<br />
E X−ray<br />
= w hkl<br />
(| F obs<br />
(hkl) | −k | F calc<br />
(hkl) |) 2<br />
hkl<br />
Runn<strong>in</strong>g average <br />
Simula,on temperature = 300K <br />
‘F<strong>in</strong>al model’ = trajectory ensemble <br />
F calc t<br />
= (1− e −Δt/τ x )F t calc<br />
+ e −Δt/τ x<br />
F calc t−Δt
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> with TA restra<strong>in</strong>ts <br />
(1) (2) <br />
(1) Start: ‘Tradi.onal’ structure <br />
(2) Fit TLS / remove alt. conf.
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> with TA restra<strong>in</strong>ts <br />
(1) (2) (3) <br />
(1) Start: ‘Tradi.onal’ structure <br />
(2) Fit TLS / remove alt. conf. <br />
(3) TA restra<strong>in</strong>ed MD simula.on <br />
Collect structure / 0.04ps <br />
X-‐ray restra<strong>in</strong>ts accelerate sampl<strong>in</strong>g
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> with TA restra<strong>in</strong>ts <br />
(1) (2) (3) <br />
(1) Start: ‘Tradi.onal’ structure <br />
(2) Fit TLS / remove alt. conf. <br />
(3) TA restra<strong>in</strong>ed MD simula.on <br />
(4) F<strong>in</strong>al ensemble <br />
(4)
Molecular disorder / la@ce disorder <br />
Prote<strong>in</strong> with <br />
local disorder <br />
Perfect <strong>crystal</strong> <br />
laOce
Molecular disorder / la@ce disorder <br />
Prote<strong>in</strong> with <br />
local disorder <br />
Perfect <strong>crystal</strong> <br />
laOce <br />
'Real' <strong>crystal</strong> <br />
laOce <br />
Averaged <br />
data
Molecular disorder / la@ce disorder <br />
Prote<strong>in</strong> with <br />
local disorder <br />
Perfect <strong>crystal</strong> <br />
laOce <br />
'Real' <strong>crystal</strong> <br />
laOce <br />
Averaged <br />
data <br />
Deconvolute: <br />
molecular disorder from <br />
laOce disorder
Rigid body disorder modelled with TLS <br />
●<br />
●<br />
LaOce distor.ons, doma<strong>in</strong> mo.ons... <br />
Model rigid body mo=ons with TLS <br />
model
Rigid body disorder modelled with TLS <br />
Extract core rigid body mo=on by exclud<strong>in</strong>g atoms with large local fluctua=ons <br />
(def<strong>in</strong>ed as devia=ons from B TLS ).
Local disorder sampled with<strong>in</strong> restra<strong>in</strong>ed MD <br />
●<br />
LaOce distor.ons, doma<strong>in</strong> mo.ons... <br />
●<br />
Side-‐cha<strong>in</strong>s, loops... <br />
●<br />
Model rigid body mo=ons with TLS <br />
model <br />
– 20 parameters per group <br />
●<br />
●<br />
Local disorder sampled <strong>in</strong> MD <br />
MD simula=on restra<strong>in</strong>ed with X-‐ray <br />
data<br />
– 1 group / doma<strong>in</strong> <br />
●<br />
Fit TLS to star=ng structure B-‐factors
Composite B-‐factor sum <strong>of</strong> TLS and atomic fluctuaJons <br />
●<br />
●<br />
●<br />
TLS <br />
– Core TLS model <br />
RMSF <br />
– Atomic fluctua=ons <strong>in</strong> ensemble <br />
TLS+RMSF <br />
– Total disorder <strong>in</strong> ensemble <br />
TLS <br />
RMSF <br />
TLS+RMSF <br />
Start <br />
●<br />
Start <br />
– Input s<strong>in</strong>gle structure ADPs
Dual explicit-‐bulk solvent model <br />
-‐ Explicit solvent <br />
-‐ Model with explicit atoms <br />
-‐ Water picked every 250 steps <br />
“standard” rules: <br />
> 3 σ <strong>in</strong> difference map <br />
-‐<br />
< 3 Å distances <br />
B-‐factor from nearest TLS group
Dual explicit-‐bulk solvent model <br />
-‐ Explicit solvent <br />
-‐ Model with explicit atoms <br />
-‐ Water picked every 250 steps <br />
“standard” rules: <br />
> 3 σ <strong>in</strong> difference map <br />
-‐<br />
< 3 Å distances <br />
B-‐factor from nearest TLS group <br />
-‐ Bulk solvent <br />
-‐ Model with ‘density mask’ <br />
F mask t<br />
= (1− e −Δt/τ x )F t mask<br />
+ e −Δt/τ x<br />
F mask t−Δt
Development <strong>of</strong> ensemble <strong>ref<strong>in</strong>ement</strong> <br />
• <strong>PHENIX</strong> <br />
• Time-‐averaged x-‐ray restra<strong>in</strong>ed MD <br />
• TLS fiOng <br />
• Explicit-‐ and bulk-‐ solvent model <br />
• Maximum-‐likelihood target func.on
Development <strong>of</strong> ensemble <strong>ref<strong>in</strong>ement</strong> <br />
• Tested with 20 datasets <br />
• Resolu.on: 1 -‐ 3 Å <br />
• ASU size: 50-‐1000 residues <br />
• CPU .me: 7 -‐ 100 hours <br />
• 50 – 500 models / ensemble
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> reduces Rfree <br />
• Rfree: ensemble vs phenix.ref<strong>in</strong>e<br />
• Rfree reduced <strong>in</strong> all cases <br />
-‐ 4.9% (max) <br />
-‐ 0.3% (m<strong>in</strong>) <br />
-‐ 1.8% (mean) <br />
• Rf/Rw ra=o (mean): <br />
= 1.23 phenix.ref<strong>in</strong>e <br />
= 1.25 ensemble
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> reduces Rfree <br />
• Rfree: ensemble vs phenix.ref<strong>in</strong>e<br />
• Rfree reduced <strong>in</strong> all cases <br />
-‐ 4.9% (max) <br />
-‐ 0.3% (m<strong>in</strong>) <br />
-‐ 1.8% (mean) <br />
• Rf/Rw ra=o (mean): <br />
= 1.23 phenix.ref<strong>in</strong>e <br />
= 1.25 ensemble <br />
• Rfree: ensemble<br />
vs PDB_REDO<br />
-‐ 5.1% (max) <br />
+ 0.6% (m<strong>in</strong>) <br />
-‐ 2.1% (mean) <br />
Joosten et. al (2011)
B factor <br />
5 -‐ 20Ų <br />
1uoy.pdb | 188 ensemble | 40ps acquisi.on .me | 1 tls group
B factor <br />
5 -‐ 20Ų <br />
1uoy.pdb | 188 ensemble | 40ps acquisi.on .me | 1 tls group
1uoy.pdb | phenix.ref<strong>in</strong>e | 1 tls group | mFo-‐DFc ±0.49 e/ų (3.00 σ)
1uoy.pdb | 188 ensemble | 1 tls group | mFo-‐DFc ±0.49 e/ų (4.27 σ)
Improved real-‐space correla=on <br />
Burl<strong>in</strong>g et al. (1996): <br />
Excellent experimentally phased data for MBP: 1YTT (1.8-‐Å res.) <br />
<strong>Ensemble</strong> <br />
Phenix.ref<strong>in</strong>e <br />
B-‐factor
Disordered side cha<strong>in</strong> <strong>in</strong> MBP (1YTT) <br />
Gln167 (A) <br />
Glu117 (A) <br />
Phenix.ref<strong>in</strong>e <br />
Diff. vector map (3σ) <br />
<strong>Ensemble</strong> <br />
Diff. vector map (3σ)
Anisotropic side cha<strong>in</strong> <strong>in</strong> MBP (1YTT) <br />
Burl<strong>in</strong>g et al. (1996) <br />
Phe121 (A) <br />
Experimental map (1.4σ) <br />
Mul.-‐conformer <br />
(Rfree 20.3%) <br />
<strong>Ensemble</strong> <br />
(Rfree 17.4%)
Diffuse solvent <strong>in</strong> MBP (1YTT) <br />
Burl<strong>in</strong>g et al. (1996) <br />
Published: <br />
Experimental map <br />
(1.4σ and 0.7σ)
Diffuse solvent <strong>in</strong> MBP (1YTT) <br />
Burl<strong>in</strong>g et al. (1996) <br />
Published: <br />
Experimental map <br />
(1.4σ and 0.7σ) <br />
<strong>Ensemble</strong>: <br />
Experimental map <br />
(1.4 σ) <br />
(0.7 σ)
Explicit waters <strong>in</strong> MBP (1YTT)
1IEP (2.1-‐Å res.) <br />
How similar are NCS copies?
NCS copies show similar distribu=ons <br />
1IEP (2.1-‐Å res.) <br />
-‐ Global TLS model accounts for differences <strong>in</strong> pack<strong>in</strong>g <strong>of</strong> copies <br />
-‐ Local fluctua=ons are similar between the two copies
1M52 (2.6-‐Å res.)
2XFA (2.1-‐Å res.)
Isomorphous <strong>crystal</strong>s at 100 and 288 K <br />
3K0N & 3K0M: Prol<strong>in</strong>e isomerase (Cyclophil<strong>in</strong> A) <br />
Fraser et al. (2009)
Mul=-‐conformers <strong>in</strong> ac=ve site <br />
3K0M (1.3-‐Å res.) <br />
100 K <br />
Leu98 <br />
Ser99 <br />
Phe113 <br />
3K0N (1.4-‐Å res.) <br />
288 K <br />
Leu98 <br />
<strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong>: <br />
1.3% ga<strong>in</strong> Rfree <br />
A:B = 2:1 <br />
Ser99 <br />
Phe113 <br />
Fraser et al. (2009), Eisenmesser et al. (2005) <br />
B-‐factor <br />
5-‐25Ų <br />
2.6% ga<strong>in</strong> Rfree
Occupancies agree with NMR data <br />
• NMR relaxa=on dispersion (283 K) <br />
L98, S99, F113 <br />
M<strong>in</strong>or popula=on ~10% <br />
S99 <br />
L98 <br />
3K0N & 3K0M: Prol<strong>in</strong>e isomerase <br />
Fraser et al. (2009), Eisenmesser et al. (2005) <br />
F113 <br />
<strong>Ensemble</strong> (288 K) <br />
B-‐factor <br />
5-‐25Ų
Ima=nib-‐ABL Tyros<strong>in</strong>e K<strong>in</strong>ase (1IEP) <br />
Nagar et al. (2002)
Ima=nib-‐ABL Tyros<strong>in</strong>e K<strong>in</strong>ase (1IEP)
Ima=nib-‐ABL Tyros<strong>in</strong>e K<strong>in</strong>ase (1IEP) <br />
Nagar et al (2002), Johnson (2009)
Inhibitor b<strong>in</strong>d<strong>in</strong>g to HIV protease <br />
apo-‐protease (2PC0) <br />
complex with JE-‐2147 (1KZK) <br />
Compara=ve analysis <strong>of</strong> atomic flexibility: <br />
-‐ Isomorphous <strong>crystal</strong>s elim<strong>in</strong>ate differences due to Xtal contacts <br />
-‐ Non-‐isomorphous <strong>crystal</strong>s allow evalua=on <strong>of</strong> Xtal contact <br />
effects
Conclusions <br />
• Global disorder modeled by TLS and local disorder <br />
by MD <br />
• <strong>Ensemble</strong> <strong>ref<strong>in</strong>ement</strong> improves Rfree and electron <br />
density maps <br />
• Suitable for a broad resolu=on range (1Å – 3Å) <br />
• NCS copies show very similar fluctua=ons <br />
• Clear representa=on <strong>of</strong> local disorder / uncerta<strong>in</strong>ty <br />
• Distribu=on <strong>of</strong> atom posi=ons allows further <br />
structural analysis <br />
• Resolve the f<strong>in</strong>er details <strong>of</strong> <strong>prote<strong>in</strong></strong> structure(s)
Acknowledgements <br />
Piet Gros <br />
Gros Laboratory <br />
Pavel Afon<strong>in</strong>e, Paul Adams <br />
<strong>PHENIX</strong> Developers <br />
Fund<strong>in</strong>g: Utrecht University / NWO
Number <strong>structures</strong> <strong>in</strong> ensemble vs resoluJon
Number <strong>of</strong> <strong>structures</strong> <strong>in</strong> the ensemble <br />
Number <strong>of</strong> <br />
<strong>structures</strong> <br />
Averag<strong>in</strong>g <br />
=me τ x <br />
Structures taken equidistant <strong>in</strong> =me to <br />
reproduce the Rwork with<strong>in</strong> 0.1%
R-‐factor by resolu=on shell <br />
1UOY
Par=al ligand/ion b<strong>in</strong>d<strong>in</strong>g <br />
B-‐factor <strong>of</strong> Cd 2+ -‐ion more <br />
than twice its surround<strong>in</strong>g <br />
Simula=on run at Q=1 for Cd 2+ -‐ion <br />
(atoms coloured by k<strong>in</strong>e=c energy) <br />
-‐ High B-‐factor ligand/ion may <strong>in</strong>dicate par=al occupancy
Par=al ligand/ion b<strong>in</strong>d<strong>in</strong>g <br />
Cd 2+ : Q=1 <br />
Cd 2+ : Q=0.7 <br />
-‐ Heat map (k<strong>in</strong>e=c energy) is valida=on tool for <strong>Ensemble</strong> <br />
Ref<strong>in</strong>ement
<strong>Ensemble</strong> <br />
<strong>Ensemble</strong> <br />
S<strong>in</strong>gle-‐structure
Geometric valida=on <br />
Bonds Angles Dih. angles <br />
S<strong>in</strong>gle structure 0.010 Å 1.23° 15.1° <br />
Ens: √〈(x ideal -‐〈x model 〉) 2 〉 -‐0.002 -‐0.28 -‐6.5 <br />
Ens: √〈(x ideal -‐x model ) 2 〉 +0.002 +0.30 +4.0 <br />
(sta=s=cs for all 20 cases)
Ramachandran analysis <br />
1UOY ( 1.5-‐Å res.) <br />
Outliers <br />
Allowed <br />
Favoured
Outliers occur more <strong>in</strong> flexible regions <br />
X-‐ray ensemble <br />
1BV1 (2.0-‐Å res.) <br />
NMR ensemble
Ramachandran devia=ons by resolu=on <br />
S<strong>in</strong>gle structure <br />
<strong>Ensemble</strong> modal <br />
<strong>Ensemble</strong> mean <br />
“Outliers” <br />
“Allowed” <br />
-‐ More outliers are observed at lower resolu=on <br />
-‐ Geometric quality correlates with Rfree <br />
-‐ “Best” run selected by Rfree
Ramachandran devia=ons by resolu=on <br />
S<strong>in</strong>gle structure <br />
<strong>Ensemble</strong> modal <br />
<strong>Ensemble</strong> mean <br />
“Outliers” <br />
“Allowed” <br />
-‐ More outliers are observed at lower resolu=on <br />
-‐ Geometric quality correlates with Rfree <br />
-‐ “Best” run selected by Rfree