Automated structure refinement with phenix.refine
Automated structure refinement with phenix.refine
Automated structure refinement with phenix.refine
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
<strong>Automated</strong> <strong>structure</strong> <strong><strong>refine</strong>ment</strong> <strong>with</strong> <strong>phenix</strong>.<strong>refine</strong><br />
Pavel Afonine<br />
Computation Crystallography Initiative<br />
Physical Biosciences Division<br />
Lawrence Berkeley National Laboratory, Berkeley CA, USA<br />
PHENIX workshop<br />
Houston, April 30, 2010
Structure <strong><strong>refine</strong>ment</strong><br />
Crystallographic <strong>structure</strong> determination workflow<br />
Purified<br />
object<br />
Crystals<br />
Experimental<br />
Data<br />
Initial<br />
approximate<br />
model<br />
Model<br />
Re-building<br />
Refinement<br />
Validation<br />
And<br />
analysis<br />
Deposition<br />
(publishing)<br />
Structure <strong><strong>refine</strong>ment</strong>: modify model parameters to describe the<br />
experimental data as good as possible<br />
Initial model<br />
Calculate<br />
model <strong>structure</strong><br />
factors<br />
Correct for bulk<br />
solvent and<br />
other scaling<br />
Modify<br />
model<br />
parameters<br />
Improved model
Structure <strong><strong>refine</strong>ment</strong><br />
1. Model parameters<br />
2. Optimization goal<br />
3. Optimization method
Model parameters<br />
Crystal (unit cell)<br />
Bulk solvent<br />
- Macromolecular crystals contain ~20-80% of solvent (mostly disordered)<br />
Atomic model<br />
- Position (coordinates)<br />
- Mobility (ADP; Atomic Displacement Parameters)<br />
- Disorder (occupancies)
Model parameters<br />
Atomic model parameterization depends on:<br />
- quality of experimental data (resolution, completeness, …)<br />
- quality of current model (initial <strong>with</strong> large errors, almost final, …)<br />
- data-to-parameters ratio (restraints have to be accounted for)
Individual<br />
atoms<br />
Model parameterization: coordinates<br />
Constrained rigid bodies (torsion<br />
angle parameterization)<br />
Rigid body<br />
3 * Natoms<br />
3 * Natoms / (7 …10)<br />
6 * Ngroups<br />
High Resolution Low<br />
Some a priori information may be needed such as:<br />
- Stereochemistry restraints<br />
- NCS restraints or constraints
Model parameterization: Atomic Displacement Parameters (“B-factors”)<br />
Nesting of atomic displacements<br />
atom<br />
residue<br />
domain<br />
molecule<br />
crystal<br />
Currently one can split and distinguish only a limited number of movements<br />
Δr total = Δr crystal +Δr domain + Δr atom<br />
Total movement (U TOTAL )<br />
Collective movement (U TLS )<br />
Local movement of<br />
independent atoms (U ATOM )<br />
=<br />
+
Structure <strong><strong>refine</strong>ment</strong><br />
1. Model parameters<br />
2. Optimization goal<br />
3. Optimization method
Target function<br />
Least-Squares target<br />
E DATA<br />
=<br />
∑<br />
s<br />
Maximum-Likelihood target<br />
⎛<br />
E DATA € = ∑ (1− K cs s<br />
) ⎜ − α s<br />
⎜<br />
s<br />
⎝<br />
Real space target<br />
2 F s<br />
MODEL<br />
w ( s<br />
F OBS MODEL<br />
s<br />
− kF ) 2<br />
s<br />
( ) 2<br />
⎛ ⎛ 2α<br />
+ ln I s<br />
F MODEL OBS<br />
s<br />
F s<br />
⎞ ⎞ ⎞<br />
⎜ 0 ⎜<br />
⎟ ⎟ ⎟<br />
ε s<br />
β s ⎝ ⎝ ε s<br />
β s ⎠ ⎠ ⎟ +<br />
⎠<br />
⎛<br />
+K cs s<br />
− α 2 MODEL<br />
s ( F s ) 2<br />
⎛ ⎛<br />
⎜<br />
+ ln cosh α F s s<br />
⎜ ⎜<br />
⎜ 2ε s<br />
β s ⎝ ⎝ ε s<br />
β<br />
⎝<br />
s<br />
MODEL F s<br />
OBS<br />
⎞ ⎞ ⎞<br />
⎟ ⎟ ⎟<br />
⎠ ⎠ ⎟<br />
⎠<br />
E DATA<br />
=<br />
∑<br />
grid points<br />
€<br />
( )<br />
ρ best<br />
− kρ calc<br />
2<br />
+ wTRESTRAINTS
Structure <strong><strong>refine</strong>ment</strong><br />
1. Model parameters<br />
2. Optimization goal<br />
3. Optimization method
Refinement target optimization algorithms<br />
Gradient-driven minimization<br />
- Follows the local gradient.<br />
- The target function depends on<br />
many parameters – many local<br />
minima.<br />
Simulated annealing (SA)<br />
- Good at escaping local minima.<br />
- Increased probability of finding a<br />
better solution because motion against<br />
the gradient is allowed.<br />
Local<br />
minimum<br />
Global minimum<br />
Deeper local<br />
minimum<br />
Global minimum<br />
Grid search<br />
- Robust but time inefficient. Good for small number of parameters (2-3 or so),<br />
and impractical for larger number of parameters.
Crystallographic <strong>structure</strong> <strong><strong>refine</strong>ment</strong> in PHENIX
<strong>phenix</strong>.<strong>refine</strong><br />
Highly-automated state-of-the-art <strong>structure</strong> <strong><strong>refine</strong>ment</strong> tool of PHENIX<br />
Active development mainly at Lawrence Berkeley National Laboratory:<br />
- Paul Adams<br />
- Pavel Afonine<br />
- Nathaniel Echols<br />
- Ralf Grosse-Kunstleve<br />
- Jeff Headd<br />
- Nigel Moriarty<br />
- Peter Zwart<br />
+ valuable scientific support by many others (Marat Mustyakimov, Sasha<br />
Urzhumtsev, Vladimir Lunin, …)
Automation of <strong>structure</strong> <strong><strong>refine</strong>ment</strong><br />
What used to be in the past … and often still the case nowadays<br />
Acta Cryst. (2002). D58, 2009-2017, Yousef et al.<br />
Clearly, the modern software should do all these steps automatically<br />
PHENIX is making a good progress in achieving this goal
<strong>phenix</strong>.<strong>refine</strong>: single program for a very broad range of resolutions<br />
Low Medium and High Subatomic<br />
- Group ADP <strong><strong>refine</strong>ment</strong><br />
- Rigid body <strong><strong>refine</strong>ment</strong><br />
- Torsion Angle dynamics<br />
- Restrained <strong><strong>refine</strong>ment</strong> (xyz,<br />
ADP: isotropic, anisotropic,<br />
mixed)<br />
- Automatic water picking<br />
- Bond density model<br />
- Unrestrained <strong><strong>refine</strong>ment</strong><br />
- FFT or direct<br />
- Explicit hydrogens<br />
- Automatic NCS restraints<br />
- Simulated Annealing<br />
- Automatic side chain rotamer fixing<br />
- Occupancies (individual, group, automatic<br />
constrains for alternative conformations)<br />
- Various targets: LS, ML, MLHL,…<br />
- TLS <strong><strong>refine</strong>ment</strong><br />
- Use hydrogens at any resolution<br />
- Refinement <strong>with</strong> twinned data<br />
- X-ray, Neutron, joint X-ray + Neutron<br />
- Built-in water picking and <strong><strong>refine</strong>ment</strong>
Refine any part of a model <strong>with</strong> any strategy: all in one run<br />
+ Automatic water picking<br />
+ Simulated Annealing<br />
+ Add and use hydrogens
Designed to be very easy to use<br />
Several ways of running:<br />
- command line version:<br />
Running <strong>phenix</strong>.<strong>refine</strong><br />
<strong>phenix</strong>.<strong>refine</strong> model.pdb data.hkl [parameters]<br />
o Can be highly customized (more than 300 parameters available to change)<br />
- can be called from a Python script allowing to run it <strong>with</strong>in different<br />
contexts<br />
- GUI version
<strong>phenix</strong>.<strong>refine</strong> GUI
Refinement flowchart<br />
PDB model,<br />
Any data format<br />
(CNS, Shelx, MTZ, …)<br />
Input data and model processing<br />
Refinement strategy selection<br />
Bulk-solvent, Anisotropic scaling, Twinning<br />
parameters <strong><strong>refine</strong>ment</strong><br />
Ordered solvent (add / remove)<br />
Target weights calculation<br />
Coordinate <strong><strong>refine</strong>ment</strong> (real- and reciprocal space)<br />
(rigid body, individual) (minimization or Simulated<br />
Annealing)<br />
ADP <strong><strong>refine</strong>ment</strong><br />
(TLS, group, individual iso / aniso)<br />
Occupancy <strong><strong>refine</strong>ment</strong> (individual, group)<br />
Output: Refined model, various maps, <strong>structure</strong><br />
factors, complete statistics, ready for deposition PDB<br />
file<br />
Repeated<br />
several times<br />
Files for<br />
COOT, O,<br />
PyMol
Effect of anisotropic scaling (U CRYSTAL )<br />
Total model <strong>structure</strong> factor used in <strong><strong>refine</strong>ment</strong> and map calculation:<br />
Anisotropic scaling (PDB: 2mhr)<br />
R-factor<br />
Effect of Bulk Solvent<br />
R-factor<br />
No correction<br />
Inoptimal ksol, Bsol<br />
PHENIX<br />
Resolution, Å<br />
Resolution, Å<br />
A robust bulk-solvent correction and anisotropic scaling procedure. Acta Cryst. (2005).<br />
P.V. Afonine, R.W. Grosse-Kunstleve & P.D. Adams
Automatic Water Picking<br />
Input data and model processing<br />
Refinement strategy selection<br />
Bulk-solvent, Anisotropic scaling, Twinning<br />
parameters <strong><strong>refine</strong>ment</strong><br />
Ordered solvent (water picking)<br />
Target weights calculation<br />
Coordinate <strong><strong>refine</strong>ment</strong><br />
(rigid body, individual) (minimization or SA)<br />
ADP <strong><strong>refine</strong>ment</strong><br />
(TLS, group, individual iso / aniso)<br />
Occupancy <strong><strong>refine</strong>ment</strong> (individual, group)<br />
Output: Refined model, various maps,<br />
<strong>structure</strong> factors, complete statistics, ready for<br />
deposition PDB file<br />
Water picking done <strong>with</strong>in <strong><strong>refine</strong>ment</strong><br />
- remove “bad” water:<br />
• 2mFo-DFc (peak hight)<br />
• Distances<br />
• map CC (2mFo-DFc, Fc)<br />
• B-factors and anisotropy<br />
• occupancy<br />
- add new:<br />
• mFo-DFc,<br />
• distances<br />
- <strong>refine</strong> water’s ADP before adding to model<br />
Water update (picking / <strong><strong>refine</strong>ment</strong> / removing) has to be a highly integrated<br />
part of <strong><strong>refine</strong>ment</strong> process.
Rigid body <strong><strong>refine</strong>ment</strong><br />
Low Resolution High<br />
?<br />
Challenges:<br />
Solution<br />
Target function profile<br />
- Use low resolution to assure better convergence to a solution<br />
- Using too low resolution may not be good too (map too featureless)<br />
- Using higher resolution data assures more precise solution but the target<br />
function profile becomes too complex so the model can get caught in a<br />
local minimum<br />
- How to define low-high resolution border (3…4…6A)?<br />
PHENIX MZ protocol makes all these decisions automatically!<br />
Automatic multiple-zone rigid-body <strong><strong>refine</strong>ment</strong> <strong>with</strong> a large convergence radius.<br />
P. V. Afonine, R. W. Grosse-Kunstleve, A. Urzhumtsev and P. D. Adams<br />
J. Appl. Cryst. 42, 607-615 (2009)
<strong>Automated</strong> Rigid Body Refinement: PHENIX MZ protocol<br />
Lowest Low High Highest<br />
Resolution<br />
Automatically define<br />
lowest usable resolution<br />
zone and do RBR there.<br />
This insures quick and<br />
reliable convergence.<br />
Gradually add higher resolution reflections. This<br />
supports the convergence and assures higher<br />
precision of the solution.<br />
Large model movements are expected during rigid body <strong><strong>refine</strong>ment</strong> which invalidate<br />
the solvent mask, therefore it is updated at each step.<br />
All parameters used in the protocol are optimized to achieve the highest<br />
convergence radius at minimal runtime.<br />
- This is done by the grid search over ~100000 trial <strong><strong>refine</strong>ment</strong>s using more than<br />
100 different <strong>structure</strong>s.
Atomic Displacement Parameters (ADP or “B-factors”)<br />
Total atomic ADP U TOTAL = U CRYSTAL + U TLS + U ATOM<br />
<br />
<br />
<br />
U CRYSTAL - overall anisotropic scale w.r.t. cell axes (6 parameters).<br />
U TLS - rigid body displacements of molecules, domains, secondary<br />
<strong>structure</strong> elements. U TLS = T + ALA t + AS + S t A t (20 TLS parameters<br />
per group).<br />
U ATOM - vibration of individual atoms. Should obey Hirshfeld’s rigid bond<br />
postulate.<br />
Group<br />
Individual or<br />
group isotropic<br />
Individual<br />
isotropic<br />
Individual isoor<br />
anisotropic<br />
Individual<br />
anisotropic<br />
Low 3.5Å 3.0Å 2.0Å 1.5Å High
Restraints in <strong><strong>refine</strong>ment</strong> of individual ADP (Atomic Displacement Parameters)<br />
Refinement of isotropic ADP<br />
Refinement of anisotropic ADP<br />
Restraints<br />
Restraints<br />
A bond is almost rigid, therefore the ADPs of bonded atoms are similar (Hirshfeld, 1976);<br />
ADPs of spatially close (non-bonded) atoms are similar (Schneider, 1996);<br />
The bond rigidity, and therefore the difference between the ADPs of bonded atoms, is related to the absolute<br />
values of ADPs. Atoms <strong>with</strong> higher ADPs can have larger differences (Ian Tickle, CCP4 BB, March 14, 2003).<br />
E RESTRAINTS<br />
=<br />
N atoms<br />
∑<br />
i=1<br />
⎡<br />
⎢<br />
⎢<br />
⎢<br />
⎢<br />
⎣<br />
M atoms<br />
∑<br />
j=1<br />
1<br />
r ij<br />
distance_power<br />
( U i<br />
− U j ) 2<br />
⎛ U i<br />
+ U j<br />
⎞<br />
⎜ ⎟<br />
⎝ 2 ⎠<br />
average_power<br />
sphereR<br />
⎤<br />
⎥<br />
⎥<br />
⎥<br />
⎥<br />
⎦
TLS <strong><strong>refine</strong>ment</strong> in PHENIX: robust and efficient<br />
Highly optimized algorithm based on systematic re-<strong><strong>refine</strong>ment</strong> of ~350 PDB<br />
models<br />
In most of cases <strong>phenix</strong>.<strong>refine</strong> produces better R-factors compared to<br />
published<br />
Don’t crash or get “unstable”<br />
Rwork (PHENIX)<br />
Rfree (PHENIX)<br />
Rwork (PDB)<br />
Rfree (PDB)
ADP <strong><strong>refine</strong>ment</strong> : from group B and TLS to individual anisotropic<br />
Synaptotagmin <strong><strong>refine</strong>ment</strong> at 3.2 Å<br />
CNS<br />
R-free = 34.%<br />
R = 29.%<br />
PHENIX – Isotropic restrained ADP<br />
R-free = 27.7%<br />
R = 24.6%<br />
PHENIX – TLS + Isotropic ADP<br />
R-free = 24.4%<br />
R = 20.7%
ADP <strong><strong>refine</strong>ment</strong>: what goes to PDB<br />
<strong>phenix</strong>.<strong>refine</strong> outputs TOTAL B-factor (iso- and anisotropic):<br />
U TOTAL = U ATOM + U TLS + U CRYST<br />
Isotropic equivalent<br />
ATOM 1 CA ALA 1 37.211 30.126 28.127 1.00 26.82 C<br />
ANISOU 1 CA ALA 1 3397 3397 3397 2634 2634 2634 C<br />
U TOTAL = U ATOM + U TLS + U CRYST<br />
Stored in separate<br />
record in PDB file<br />
header<br />
Atom records are self-consistent:<br />
Straightforward visualization (color by B-factors, or anisotropic ellipsoids)<br />
Straightforward computation of other statistics (R-factors, etc.) – no need<br />
to use external helper programs for any conversions.
Dual-space <strong><strong>refine</strong>ment</strong>: combining real and reciprocal space <strong><strong>refine</strong>ment</strong><br />
Why real-space <strong><strong>refine</strong>ment</strong> ?<br />
• Can be done locally (for example, for a residue or ligand)<br />
• Grid search can be used -> Convergence radius can be dramatically<br />
increased compared to gradient driven-<strong><strong>refine</strong>ment</strong> or SA<br />
• Ordered solvent update can be enabled at earlier stage<br />
Eliminate the tedium of manual work on fixing side chains on graphics
Local real-space <strong><strong>refine</strong>ment</strong><br />
Compute 2mF obs -DF model , mF obs -DF model , F model<br />
maps<br />
for residue in residues:<br />
compute start T- and CC-values for residue<br />
if need_a_fix:<br />
for rotamer in rotamers:<br />
torsion grid search<br />
if is_better:<br />
residue = rotamer<br />
real-space <strong>refine</strong> residue: residue <strong>refine</strong>d<br />
if is_better:<br />
residue = residue <strong>refine</strong>d<br />
update <strong>structure</strong> <strong>with</strong> residue<br />
N macro-cycles<br />
<strong>phenix</strong>.<strong>refine</strong> protocol<br />
Update F model and re-compute 2mF obs -DF model map<br />
Real-space <strong>refine</strong> whole model into 2mF obs -DF model<br />
Validate changes:<br />
compute 2mF obs -DF model , mF obs -DF model and F model<br />
for residue in residues:<br />
if is_better:<br />
restore original residue (discard change)
Real-space <strong><strong>refine</strong>ment</strong>: torsion grid search
Option for automatic side chain flips to avoid clashes<br />
<br />
Apply side chain flips if necessary (Asn/Gln/His) (RLab, Jeff Headd).<br />
Picture: http://molprobity.biochem.duke.edu/<br />
<br />
<br />
<br />
Bad<br />
Always use hydrogens in real-space optimization<br />
Use to build alternative conformations<br />
Improve NCS restraints<br />
Good
Occupancy <strong><strong>refine</strong>ment</strong><br />
Automatic constraints for<br />
occupancies of atoms in<br />
alternate locations<br />
Any user defined selections<br />
for individual and/or group<br />
occupancy <strong><strong>refine</strong>ment</strong> can be<br />
added on top of the automatic<br />
selection.<br />
ATOM 1 N AARG A 192 -5.782 17.932 11.414 0.72 8.38 N<br />
ATOM 2 CA AARG A 192 -6.979 17.425 10.929 0.72 10.12 C<br />
ATOM 3 C AARG A 192 -6.762 16.088 10.271 0.72 7.90 C<br />
ATOM 7 N BARG A 192 -11.719 17.007 9.061 0.28 9.89 N<br />
ATOM 8 CA BARG A 192 -10.495 17.679 9.569 0.28 11.66 C<br />
ATOM 9 C BARG A 192 -9.259 17.590 8.718 0.28 12.76 C<br />
ATOM 549 AU A 34 -23.064 7.146 -23.942 0.78 15.44 Au<br />
ATOM 549 HA3 ARG A 34 -23.064 7.146 -23.942 1.00 15.44 H<br />
ATOM 550 H AARG A 34 -24.447 7.644 -21.715 0.15 8.34 H<br />
ATOM 551 D BARG A 34 -24.447 7.644 -21.715 0.85 7.65 D<br />
ATOM 552 N ARG A 35 -22.459 9.801 -22.791 1.00 8.54 N<br />
ATOM 6 S SO4 1 1.302 1.419 1.560 0.70 13.00<br />
ATOM 7 O1 SO4 1 1.497 1.295 0.118 0.70 11.00<br />
ATOM 8 O2 SO4 1 1.098 0.095 2.140 0.70 10.00<br />
ATOM 9 O3 SO4 1 2.481 2.037 2.159 0.70 14.00<br />
ATOM 10 O4 SO4 1 0.131 2.251 1.823 0.70 12.00
Two steps to perform twin <strong><strong>refine</strong>ment</strong>:<br />
Refinement <strong>with</strong> twinned data<br />
- run <strong>phenix</strong>.xtriage to get twin operator (twin law):<br />
% <strong>phenix</strong>.xtriage data.mtz<br />
- run <strong>phenix</strong>.<strong>refine</strong>:<br />
% <strong>phenix</strong>.<strong>refine</strong> model.pdb data.mtz twin_law="-h-k,k,-l"<br />
Taking twinning into account makes difference:<br />
Interleukin mutant (PDB code: 1l2h)<br />
R/R-free (%)<br />
PHENIX (no twinning): 24.9 / 27.4<br />
PHENIX (twin <strong><strong>refine</strong>ment</strong>): 15.3 / 19.2
Hydrogen atoms in <strong><strong>refine</strong>ment</strong><br />
<strong>phenix</strong>.<strong>refine</strong> offers various options for handling H atoms at any resolution:<br />
- Riding model (low-high resolution)<br />
- Individual atoms (ultrahigh resolution or neutron data)<br />
- Account for scattering contribution or just use to improve the geometry<br />
Expected benefits from using the H atoms in <strong><strong>refine</strong>ment</strong>:<br />
- Improve R-factors<br />
- Improve model geometry (remove bad clashes)<br />
- Model residual density at high resolution or in neutron maps<br />
Example: automatic re-<strong><strong>refine</strong>ment</strong> of 1000 PDB models <strong>with</strong> and <strong>with</strong>out H:<br />
pdb resolution Rfree(no H) – Rfree(<strong>with</strong> H)<br />
1akg 1.1 1.9<br />
1byp 1.75 1.41<br />
1dkp 2.3 0.93<br />
1rgv 2.9 0.50
Refinement using X-ray and Neutron diffraction data<br />
Different techniques – different information<br />
2mFo-DFc maps<br />
X-ray (1.8 Å) Neutron (2.2 Å)<br />
Fo-Fc, (H-, D-omit neutron<br />
map), 1.6 Å resolution<br />
+2.6σ, D atoms<br />
-2.9σ, H atoms<br />
Neutron maps show hydrogen atoms
Refinement at subatomic resolution<br />
~340 <strong>structure</strong>s in PDB at resolution higher than 1.0 Å<br />
Aldose Reductase (0.66 Å resolution)<br />
Fo-Fc (orange)<br />
2Fo-Fc (blue)<br />
<strong>phenix</strong>.<strong>refine</strong> has unique set of tools to correctly <strong>refine</strong> such <strong>structure</strong>s
Modeling at subatomic resolution: IAS model<br />
Basics of IAS model:<br />
Afonine et al, Acta Cryst. D60 (2004)<br />
First practical examples of implementation and use in PHENIX:<br />
Afonine et al, Acta Cryst. D63, 1194-1197 (2007)<br />
IAS modeling in PHENIX<br />
Simple Gaussian is good enough:<br />
j a<br />
b<br />
a and b are pre-computed library for most bond types
IAS modeling: benefits<br />
Improve maps: reduce noise. Before (left) and after (right) adding of IAS.<br />
Find new features: originally wrong water (left) replaced <strong>with</strong> SO4 ion (right)<br />
clearly suggested by improved map after adding IAS
Analyzing <strong><strong>refine</strong>ment</strong> results…
Question: “I got Rwork=18% and Rfree=23% after <strong><strong>refine</strong>ment</strong>, is it a good<br />
result?”<br />
- A very common question<br />
- Answer depends on various factors<br />
Answer:<br />
Quick model quality evaluation<br />
- Yes, it’s likely a good result if the data resolution is around 2.5 Å.<br />
- No, it is very bad result, if the data resolution is 1.0 Å or higher.<br />
One can ask similar questions about other parameters, such as bond/angles<br />
RMSDs, average B-factors, etc…
POLYGON: Crystallographic Model Quality at a Glance<br />
Say you are refining a <strong>structure</strong> at 1.0 Å resolution and the R-factors are:<br />
R work = 18% and R free is 23%.<br />
- Are these values good? Is <strong><strong>refine</strong>ment</strong> completed?<br />
PDB statistics: histograms for R work , R free , R free -R work for all similar <strong>structure</strong>s:<br />
Rwork at 0.9-1.1Å<br />
0.10 - 0.12: 68<br />
0.12 - 0.14: 94<br />
0.14 - 0.16: 73<br />
0.16 - 0.18: 17
New tool: <strong>phenix</strong>.polygon<br />
Likely good model<br />
This model needs some attention<br />
Acta Cryst. (2009). D65, 297-300
R-factors (all models in PDB at resolution < 1 Å )<br />
R-factor, %<br />
Resolution, Å<br />
PDB code<br />
(year)<br />
published<br />
R-work, %<br />
<strong>phenix</strong>.<strong>refine</strong><br />
2ppn (2007) 20.9 11.7<br />
1g2y (2000) 19.5 12.3<br />
1zlb (2005) 16.8 12.0<br />
2g6f (2006) 18.4 12.9<br />
2elg (2007) 23.2 13.0<br />
1aho (1997) 16.3 9.6<br />
1zf5 (2005) 29.0 16.9<br />
There are ~25 models out of 324 that have suspiciously high or very high R-<br />
factors.<br />
- For most of them the R-factors can be decreased to typical for this<br />
resolution values (~10-15%) in one <strong>phenix</strong>.<strong>refine</strong> run.<br />
<strong>Automated</strong> software <strong>with</strong> integrated validation would immediately flag these<br />
models as suspicious.
PDB: 1eic<br />
Resolution: 1.4Å<br />
Deposition year:<br />
2000<br />
PUBLISHED:<br />
Rwork = 20%<br />
Rfree = 25%
Under-<strong>refine</strong>d models or why automation is important<br />
Structure from PDB: 1eic (resolution = 1.4Å; deposition year: 2000)<br />
PUBLISHED: Rwork = 20% Rfree = 25%<br />
Clear problems:<br />
- No ‘riding’ H atoms;<br />
- All atoms are isotropic;<br />
Potential problems<br />
- Inoptimal weights, <strong><strong>refine</strong>ment</strong> is not converged, incomplete solvent model<br />
Fixing the model <strong>with</strong> PHENIX:<br />
- Add and <strong>refine</strong> H as riding model<br />
- Update ordered solvent<br />
- Refine atoms as anisotropic (except H and water)<br />
- Optimize X-ray/Restraints weights<br />
FINAL MODEL: Rwork = 14% Rfree = 17%<br />
All this could be done by the software automatically, preventing deposition of under<strong>refine</strong>d<br />
models into PDB
Final notes…
www.<strong>phenix</strong>-online.org<br />
This<br />
presentation<br />
(PDF file)
Reporting bugs, problems, asking questions<br />
Something didn’t work as expected?... program crashed?... missing<br />
feature?...<br />
Not Good: silently give up and run away looking for alternative software.<br />
Good: report us a problem, ask a question, request a feature (explain why<br />
it’s good to have), ask for help.<br />
Reporting a bug:<br />
Not good: “Hi! <strong>phenix</strong>.<strong>refine</strong> crashed, I don’t know what to do.”<br />
Good: “Hi! <strong>phenix</strong>.<strong>refine</strong> crashed. Here are:<br />
1) PHENIX version;<br />
2) Command and parameters I used;<br />
3) Input and output files (at least logs).”<br />
Subscribe to PHENIX bulletin board: www.<strong>phenix</strong>-online.org
PHENIX use (April 28, 2010)<br />
Number of <strong>structure</strong>s in PDB <strong>with</strong> “REMARK 3 PROGRAM PHENIX”<br />
948<br />
435<br />
1 7<br />
96<br />
163<br />
2005 2006 2007 2008<br />
Year<br />
2009 2010<br />
User tested !
Acknowledgments<br />
All PHENIX developers<br />
All PHENIX users<br />
Non-PHENIX colleagues for scientific support, fruitful collaboration and<br />
valuable feedback:<br />
Sacha Urzhumtsev, Vladimir Lunin, Dale Tronrud, Dusan Turk<br />
• Funding:<br />
- NIH/NIGMS:<br />
P01GM063210, P50GM062412, P01GM064692, R01GM071939<br />
- Lawrence Berkeley Laboratory<br />
- PHENIX Industrial Consortium