21.05.2014 Views

Automated structure refinement with phenix.refine

Automated structure refinement with phenix.refine

Automated structure refinement with phenix.refine

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

<strong>Automated</strong> <strong>structure</strong> <strong><strong>refine</strong>ment</strong> <strong>with</strong> <strong>phenix</strong>.<strong>refine</strong><br />

Pavel Afonine<br />

Computation Crystallography Initiative<br />

Physical Biosciences Division<br />

Lawrence Berkeley National Laboratory, Berkeley CA, USA<br />

PHENIX workshop<br />

Houston, April 30, 2010


Structure <strong><strong>refine</strong>ment</strong><br />

Crystallographic <strong>structure</strong> determination workflow<br />

Purified<br />

object<br />

Crystals<br />

Experimental<br />

Data<br />

Initial<br />

approximate<br />

model<br />

Model<br />

Re-building<br />

Refinement<br />

Validation<br />

And<br />

analysis<br />

Deposition<br />

(publishing)<br />

Structure <strong><strong>refine</strong>ment</strong>: modify model parameters to describe the<br />

experimental data as good as possible<br />

Initial model<br />

Calculate<br />

model <strong>structure</strong><br />

factors<br />

Correct for bulk<br />

solvent and<br />

other scaling<br />

Modify<br />

model<br />

parameters<br />

Improved model


Structure <strong><strong>refine</strong>ment</strong><br />

1. Model parameters<br />

2. Optimization goal<br />

3. Optimization method


Model parameters<br />

Crystal (unit cell)<br />

Bulk solvent<br />

- Macromolecular crystals contain ~20-80% of solvent (mostly disordered)<br />

Atomic model<br />

- Position (coordinates)<br />

- Mobility (ADP; Atomic Displacement Parameters)<br />

- Disorder (occupancies)


Model parameters<br />

Atomic model parameterization depends on:<br />

- quality of experimental data (resolution, completeness, …)<br />

- quality of current model (initial <strong>with</strong> large errors, almost final, …)<br />

- data-to-parameters ratio (restraints have to be accounted for)


Individual<br />

atoms<br />

Model parameterization: coordinates<br />

Constrained rigid bodies (torsion<br />

angle parameterization)<br />

Rigid body<br />

3 * Natoms<br />

3 * Natoms / (7 …10)<br />

6 * Ngroups<br />

High Resolution Low<br />

Some a priori information may be needed such as:<br />

- Stereochemistry restraints<br />

- NCS restraints or constraints


Model parameterization: Atomic Displacement Parameters (“B-factors”)<br />

Nesting of atomic displacements<br />

atom<br />

residue<br />

domain<br />

molecule<br />

crystal<br />

Currently one can split and distinguish only a limited number of movements<br />

Δr total = Δr crystal +Δr domain + Δr atom<br />

Total movement (U TOTAL )<br />

Collective movement (U TLS )<br />

Local movement of<br />

independent atoms (U ATOM )<br />

=<br />

+


Structure <strong><strong>refine</strong>ment</strong><br />

1. Model parameters<br />

2. Optimization goal<br />

3. Optimization method


Target function<br />

Least-Squares target<br />

E DATA<br />

=<br />

∑<br />

s<br />

Maximum-Likelihood target<br />

⎛<br />

E DATA € = ∑ (1− K cs s<br />

) ⎜ − α s<br />

⎜<br />

s<br />

⎝<br />

Real space target<br />

2 F s<br />

MODEL<br />

w ( s<br />

F OBS MODEL<br />

s<br />

− kF ) 2<br />

s<br />

( ) 2<br />

⎛ ⎛ 2α<br />

+ ln I s<br />

F MODEL OBS<br />

s<br />

F s<br />

⎞ ⎞ ⎞<br />

⎜ 0 ⎜<br />

⎟ ⎟ ⎟<br />

ε s<br />

β s ⎝ ⎝ ε s<br />

β s ⎠ ⎠ ⎟ +<br />

⎠<br />

⎛<br />

+K cs s<br />

− α 2 MODEL<br />

s ( F s ) 2<br />

⎛ ⎛<br />

⎜<br />

+ ln cosh α F s s<br />

⎜ ⎜<br />

⎜ 2ε s<br />

β s ⎝ ⎝ ε s<br />

β<br />

⎝<br />

s<br />

MODEL F s<br />

OBS<br />

⎞ ⎞ ⎞<br />

⎟ ⎟ ⎟<br />

⎠ ⎠ ⎟<br />

⎠<br />

E DATA<br />

=<br />

∑<br />

grid points<br />

€<br />

( )<br />

ρ best<br />

− kρ calc<br />

2<br />

+ wTRESTRAINTS


Structure <strong><strong>refine</strong>ment</strong><br />

1. Model parameters<br />

2. Optimization goal<br />

3. Optimization method


Refinement target optimization algorithms<br />

Gradient-driven minimization<br />

- Follows the local gradient.<br />

- The target function depends on<br />

many parameters – many local<br />

minima.<br />

Simulated annealing (SA)<br />

- Good at escaping local minima.<br />

- Increased probability of finding a<br />

better solution because motion against<br />

the gradient is allowed.<br />

Local<br />

minimum<br />

Global minimum<br />

Deeper local<br />

minimum<br />

Global minimum<br />

Grid search<br />

- Robust but time inefficient. Good for small number of parameters (2-3 or so),<br />

and impractical for larger number of parameters.


Crystallographic <strong>structure</strong> <strong><strong>refine</strong>ment</strong> in PHENIX


<strong>phenix</strong>.<strong>refine</strong><br />

Highly-automated state-of-the-art <strong>structure</strong> <strong><strong>refine</strong>ment</strong> tool of PHENIX<br />

Active development mainly at Lawrence Berkeley National Laboratory:<br />

- Paul Adams<br />

- Pavel Afonine<br />

- Nathaniel Echols<br />

- Ralf Grosse-Kunstleve<br />

- Jeff Headd<br />

- Nigel Moriarty<br />

- Peter Zwart<br />

+ valuable scientific support by many others (Marat Mustyakimov, Sasha<br />

Urzhumtsev, Vladimir Lunin, …)


Automation of <strong>structure</strong> <strong><strong>refine</strong>ment</strong><br />

What used to be in the past … and often still the case nowadays<br />

Acta Cryst. (2002). D58, 2009-2017, Yousef et al.<br />

Clearly, the modern software should do all these steps automatically<br />

PHENIX is making a good progress in achieving this goal


<strong>phenix</strong>.<strong>refine</strong>: single program for a very broad range of resolutions<br />

Low Medium and High Subatomic<br />

- Group ADP <strong><strong>refine</strong>ment</strong><br />

- Rigid body <strong><strong>refine</strong>ment</strong><br />

- Torsion Angle dynamics<br />

- Restrained <strong><strong>refine</strong>ment</strong> (xyz,<br />

ADP: isotropic, anisotropic,<br />

mixed)<br />

- Automatic water picking<br />

- Bond density model<br />

- Unrestrained <strong><strong>refine</strong>ment</strong><br />

- FFT or direct<br />

- Explicit hydrogens<br />

- Automatic NCS restraints<br />

- Simulated Annealing<br />

- Automatic side chain rotamer fixing<br />

- Occupancies (individual, group, automatic<br />

constrains for alternative conformations)<br />

- Various targets: LS, ML, MLHL,…<br />

- TLS <strong><strong>refine</strong>ment</strong><br />

- Use hydrogens at any resolution<br />

- Refinement <strong>with</strong> twinned data<br />

- X-ray, Neutron, joint X-ray + Neutron<br />

- Built-in water picking and <strong><strong>refine</strong>ment</strong>


Refine any part of a model <strong>with</strong> any strategy: all in one run<br />

+ Automatic water picking<br />

+ Simulated Annealing<br />

+ Add and use hydrogens


Designed to be very easy to use<br />

Several ways of running:<br />

- command line version:<br />

Running <strong>phenix</strong>.<strong>refine</strong><br />

<strong>phenix</strong>.<strong>refine</strong> model.pdb data.hkl [parameters]<br />

o Can be highly customized (more than 300 parameters available to change)<br />

- can be called from a Python script allowing to run it <strong>with</strong>in different<br />

contexts<br />

- GUI version


<strong>phenix</strong>.<strong>refine</strong> GUI


Refinement flowchart<br />

PDB model,<br />

Any data format<br />

(CNS, Shelx, MTZ, …)<br />

Input data and model processing<br />

Refinement strategy selection<br />

Bulk-solvent, Anisotropic scaling, Twinning<br />

parameters <strong><strong>refine</strong>ment</strong><br />

Ordered solvent (add / remove)<br />

Target weights calculation<br />

Coordinate <strong><strong>refine</strong>ment</strong> (real- and reciprocal space)<br />

(rigid body, individual) (minimization or Simulated<br />

Annealing)<br />

ADP <strong><strong>refine</strong>ment</strong><br />

(TLS, group, individual iso / aniso)<br />

Occupancy <strong><strong>refine</strong>ment</strong> (individual, group)<br />

Output: Refined model, various maps, <strong>structure</strong><br />

factors, complete statistics, ready for deposition PDB<br />

file<br />

Repeated<br />

several times<br />

Files for<br />

COOT, O,<br />

PyMol


Effect of anisotropic scaling (U CRYSTAL )<br />

Total model <strong>structure</strong> factor used in <strong><strong>refine</strong>ment</strong> and map calculation:<br />

Anisotropic scaling (PDB: 2mhr)<br />

R-factor<br />

Effect of Bulk Solvent<br />

R-factor<br />

No correction<br />

Inoptimal ksol, Bsol<br />

PHENIX<br />

Resolution, Å<br />

Resolution, Å<br />

A robust bulk-solvent correction and anisotropic scaling procedure. Acta Cryst. (2005).<br />

P.V. Afonine, R.W. Grosse-Kunstleve & P.D. Adams


Automatic Water Picking<br />

Input data and model processing<br />

Refinement strategy selection<br />

Bulk-solvent, Anisotropic scaling, Twinning<br />

parameters <strong><strong>refine</strong>ment</strong><br />

Ordered solvent (water picking)<br />

Target weights calculation<br />

Coordinate <strong><strong>refine</strong>ment</strong><br />

(rigid body, individual) (minimization or SA)<br />

ADP <strong><strong>refine</strong>ment</strong><br />

(TLS, group, individual iso / aniso)<br />

Occupancy <strong><strong>refine</strong>ment</strong> (individual, group)<br />

Output: Refined model, various maps,<br />

<strong>structure</strong> factors, complete statistics, ready for<br />

deposition PDB file<br />

Water picking done <strong>with</strong>in <strong><strong>refine</strong>ment</strong><br />

- remove “bad” water:<br />

• 2mFo-DFc (peak hight)<br />

• Distances<br />

• map CC (2mFo-DFc, Fc)<br />

• B-factors and anisotropy<br />

• occupancy<br />

- add new:<br />

• mFo-DFc,<br />

• distances<br />

- <strong>refine</strong> water’s ADP before adding to model<br />

Water update (picking / <strong><strong>refine</strong>ment</strong> / removing) has to be a highly integrated<br />

part of <strong><strong>refine</strong>ment</strong> process.


Rigid body <strong><strong>refine</strong>ment</strong><br />

Low Resolution High<br />

?<br />

Challenges:<br />

Solution<br />

Target function profile<br />

- Use low resolution to assure better convergence to a solution<br />

- Using too low resolution may not be good too (map too featureless)<br />

- Using higher resolution data assures more precise solution but the target<br />

function profile becomes too complex so the model can get caught in a<br />

local minimum<br />

- How to define low-high resolution border (3…4…6A)?<br />

PHENIX MZ protocol makes all these decisions automatically!<br />

Automatic multiple-zone rigid-body <strong><strong>refine</strong>ment</strong> <strong>with</strong> a large convergence radius.<br />

P. V. Afonine, R. W. Grosse-Kunstleve, A. Urzhumtsev and P. D. Adams<br />

J. Appl. Cryst. 42, 607-615 (2009)


<strong>Automated</strong> Rigid Body Refinement: PHENIX MZ protocol<br />

Lowest Low High Highest<br />

Resolution<br />

Automatically define<br />

lowest usable resolution<br />

zone and do RBR there.<br />

This insures quick and<br />

reliable convergence.<br />

Gradually add higher resolution reflections. This<br />

supports the convergence and assures higher<br />

precision of the solution.<br />

Large model movements are expected during rigid body <strong><strong>refine</strong>ment</strong> which invalidate<br />

the solvent mask, therefore it is updated at each step.<br />

All parameters used in the protocol are optimized to achieve the highest<br />

convergence radius at minimal runtime.<br />

- This is done by the grid search over ~100000 trial <strong><strong>refine</strong>ment</strong>s using more than<br />

100 different <strong>structure</strong>s.


Atomic Displacement Parameters (ADP or “B-factors”)<br />

Total atomic ADP U TOTAL = U CRYSTAL + U TLS + U ATOM<br />

<br />

<br />

<br />

U CRYSTAL - overall anisotropic scale w.r.t. cell axes (6 parameters).<br />

U TLS - rigid body displacements of molecules, domains, secondary<br />

<strong>structure</strong> elements. U TLS = T + ALA t + AS + S t A t (20 TLS parameters<br />

per group).<br />

U ATOM - vibration of individual atoms. Should obey Hirshfeld’s rigid bond<br />

postulate.<br />

Group<br />

Individual or<br />

group isotropic<br />

Individual<br />

isotropic<br />

Individual isoor<br />

anisotropic<br />

Individual<br />

anisotropic<br />

Low 3.5Å 3.0Å 2.0Å 1.5Å High


Restraints in <strong><strong>refine</strong>ment</strong> of individual ADP (Atomic Displacement Parameters)<br />

Refinement of isotropic ADP<br />

Refinement of anisotropic ADP<br />

Restraints<br />

Restraints<br />

A bond is almost rigid, therefore the ADPs of bonded atoms are similar (Hirshfeld, 1976);<br />

ADPs of spatially close (non-bonded) atoms are similar (Schneider, 1996);<br />

The bond rigidity, and therefore the difference between the ADPs of bonded atoms, is related to the absolute<br />

values of ADPs. Atoms <strong>with</strong> higher ADPs can have larger differences (Ian Tickle, CCP4 BB, March 14, 2003).<br />

E RESTRAINTS<br />

=<br />

N atoms<br />

∑<br />

i=1<br />

⎡<br />

⎢<br />

⎢<br />

⎢<br />

⎢<br />

⎣<br />

M atoms<br />

∑<br />

j=1<br />

1<br />

r ij<br />

distance_power<br />

( U i<br />

− U j ) 2<br />

⎛ U i<br />

+ U j<br />

⎞<br />

⎜ ⎟<br />

⎝ 2 ⎠<br />

average_power<br />

sphereR<br />

⎤<br />

⎥<br />

⎥<br />

⎥<br />

⎥<br />


TLS <strong><strong>refine</strong>ment</strong> in PHENIX: robust and efficient<br />

Highly optimized algorithm based on systematic re-<strong><strong>refine</strong>ment</strong> of ~350 PDB<br />

models<br />

In most of cases <strong>phenix</strong>.<strong>refine</strong> produces better R-factors compared to<br />

published<br />

Don’t crash or get “unstable”<br />

Rwork (PHENIX)<br />

Rfree (PHENIX)<br />

Rwork (PDB)<br />

Rfree (PDB)


ADP <strong><strong>refine</strong>ment</strong> : from group B and TLS to individual anisotropic<br />

Synaptotagmin <strong><strong>refine</strong>ment</strong> at 3.2 Å<br />

CNS<br />

R-free = 34.%<br />

R = 29.%<br />

PHENIX – Isotropic restrained ADP<br />

R-free = 27.7%<br />

R = 24.6%<br />

PHENIX – TLS + Isotropic ADP<br />

R-free = 24.4%<br />

R = 20.7%


ADP <strong><strong>refine</strong>ment</strong>: what goes to PDB<br />

<strong>phenix</strong>.<strong>refine</strong> outputs TOTAL B-factor (iso- and anisotropic):<br />

U TOTAL = U ATOM + U TLS + U CRYST<br />

Isotropic equivalent<br />

ATOM 1 CA ALA 1 37.211 30.126 28.127 1.00 26.82 C<br />

ANISOU 1 CA ALA 1 3397 3397 3397 2634 2634 2634 C<br />

U TOTAL = U ATOM + U TLS + U CRYST<br />

Stored in separate<br />

record in PDB file<br />

header<br />

Atom records are self-consistent:<br />

Straightforward visualization (color by B-factors, or anisotropic ellipsoids)<br />

Straightforward computation of other statistics (R-factors, etc.) – no need<br />

to use external helper programs for any conversions.


Dual-space <strong><strong>refine</strong>ment</strong>: combining real and reciprocal space <strong><strong>refine</strong>ment</strong><br />

Why real-space <strong><strong>refine</strong>ment</strong> ?<br />

• Can be done locally (for example, for a residue or ligand)<br />

• Grid search can be used -> Convergence radius can be dramatically<br />

increased compared to gradient driven-<strong><strong>refine</strong>ment</strong> or SA<br />

• Ordered solvent update can be enabled at earlier stage<br />

Eliminate the tedium of manual work on fixing side chains on graphics


Local real-space <strong><strong>refine</strong>ment</strong><br />

Compute 2mF obs -DF model , mF obs -DF model , F model<br />

maps<br />

for residue in residues:<br />

compute start T- and CC-values for residue<br />

if need_a_fix:<br />

for rotamer in rotamers:<br />

torsion grid search<br />

if is_better:<br />

residue = rotamer<br />

real-space <strong>refine</strong> residue: residue <strong>refine</strong>d<br />

if is_better:<br />

residue = residue <strong>refine</strong>d<br />

update <strong>structure</strong> <strong>with</strong> residue<br />

N macro-cycles<br />

<strong>phenix</strong>.<strong>refine</strong> protocol<br />

Update F model and re-compute 2mF obs -DF model map<br />

Real-space <strong>refine</strong> whole model into 2mF obs -DF model<br />

Validate changes:<br />

compute 2mF obs -DF model , mF obs -DF model and F model<br />

for residue in residues:<br />

if is_better:<br />

restore original residue (discard change)


Real-space <strong><strong>refine</strong>ment</strong>: torsion grid search


Option for automatic side chain flips to avoid clashes<br />

<br />

Apply side chain flips if necessary (Asn/Gln/His) (RLab, Jeff Headd).<br />

Picture: http://molprobity.biochem.duke.edu/<br />

<br />

<br />

<br />

Bad<br />

Always use hydrogens in real-space optimization<br />

Use to build alternative conformations<br />

Improve NCS restraints<br />

Good


Occupancy <strong><strong>refine</strong>ment</strong><br />

Automatic constraints for<br />

occupancies of atoms in<br />

alternate locations<br />

Any user defined selections<br />

for individual and/or group<br />

occupancy <strong><strong>refine</strong>ment</strong> can be<br />

added on top of the automatic<br />

selection.<br />

ATOM 1 N AARG A 192 -5.782 17.932 11.414 0.72 8.38 N<br />

ATOM 2 CA AARG A 192 -6.979 17.425 10.929 0.72 10.12 C<br />

ATOM 3 C AARG A 192 -6.762 16.088 10.271 0.72 7.90 C<br />

ATOM 7 N BARG A 192 -11.719 17.007 9.061 0.28 9.89 N<br />

ATOM 8 CA BARG A 192 -10.495 17.679 9.569 0.28 11.66 C<br />

ATOM 9 C BARG A 192 -9.259 17.590 8.718 0.28 12.76 C<br />

ATOM 549 AU A 34 -23.064 7.146 -23.942 0.78 15.44 Au<br />

ATOM 549 HA3 ARG A 34 -23.064 7.146 -23.942 1.00 15.44 H<br />

ATOM 550 H AARG A 34 -24.447 7.644 -21.715 0.15 8.34 H<br />

ATOM 551 D BARG A 34 -24.447 7.644 -21.715 0.85 7.65 D<br />

ATOM 552 N ARG A 35 -22.459 9.801 -22.791 1.00 8.54 N<br />

ATOM 6 S SO4 1 1.302 1.419 1.560 0.70 13.00<br />

ATOM 7 O1 SO4 1 1.497 1.295 0.118 0.70 11.00<br />

ATOM 8 O2 SO4 1 1.098 0.095 2.140 0.70 10.00<br />

ATOM 9 O3 SO4 1 2.481 2.037 2.159 0.70 14.00<br />

ATOM 10 O4 SO4 1 0.131 2.251 1.823 0.70 12.00


Two steps to perform twin <strong><strong>refine</strong>ment</strong>:<br />

Refinement <strong>with</strong> twinned data<br />

- run <strong>phenix</strong>.xtriage to get twin operator (twin law):<br />

% <strong>phenix</strong>.xtriage data.mtz<br />

- run <strong>phenix</strong>.<strong>refine</strong>:<br />

% <strong>phenix</strong>.<strong>refine</strong> model.pdb data.mtz twin_law="-h-k,k,-l"<br />

Taking twinning into account makes difference:<br />

Interleukin mutant (PDB code: 1l2h)<br />

R/R-free (%)<br />

PHENIX (no twinning): 24.9 / 27.4<br />

PHENIX (twin <strong><strong>refine</strong>ment</strong>): 15.3 / 19.2


Hydrogen atoms in <strong><strong>refine</strong>ment</strong><br />

<strong>phenix</strong>.<strong>refine</strong> offers various options for handling H atoms at any resolution:<br />

- Riding model (low-high resolution)<br />

- Individual atoms (ultrahigh resolution or neutron data)<br />

- Account for scattering contribution or just use to improve the geometry<br />

Expected benefits from using the H atoms in <strong><strong>refine</strong>ment</strong>:<br />

- Improve R-factors<br />

- Improve model geometry (remove bad clashes)<br />

- Model residual density at high resolution or in neutron maps<br />

Example: automatic re-<strong><strong>refine</strong>ment</strong> of 1000 PDB models <strong>with</strong> and <strong>with</strong>out H:<br />

pdb resolution Rfree(no H) – Rfree(<strong>with</strong> H)<br />

1akg 1.1 1.9<br />

1byp 1.75 1.41<br />

1dkp 2.3 0.93<br />

1rgv 2.9 0.50


Refinement using X-ray and Neutron diffraction data<br />

Different techniques – different information<br />

2mFo-DFc maps<br />

X-ray (1.8 Å) Neutron (2.2 Å)<br />

Fo-Fc, (H-, D-omit neutron<br />

map), 1.6 Å resolution<br />

+2.6σ, D atoms<br />

-2.9σ, H atoms<br />

Neutron maps show hydrogen atoms


Refinement at subatomic resolution<br />

~340 <strong>structure</strong>s in PDB at resolution higher than 1.0 Å<br />

Aldose Reductase (0.66 Å resolution)<br />

Fo-Fc (orange)<br />

2Fo-Fc (blue)<br />

<strong>phenix</strong>.<strong>refine</strong> has unique set of tools to correctly <strong>refine</strong> such <strong>structure</strong>s


Modeling at subatomic resolution: IAS model<br />

Basics of IAS model:<br />

Afonine et al, Acta Cryst. D60 (2004)<br />

First practical examples of implementation and use in PHENIX:<br />

Afonine et al, Acta Cryst. D63, 1194-1197 (2007)<br />

IAS modeling in PHENIX<br />

Simple Gaussian is good enough:<br />

j a<br />

b<br />

a and b are pre-computed library for most bond types


IAS modeling: benefits<br />

Improve maps: reduce noise. Before (left) and after (right) adding of IAS.<br />

Find new features: originally wrong water (left) replaced <strong>with</strong> SO4 ion (right)<br />

clearly suggested by improved map after adding IAS


Analyzing <strong><strong>refine</strong>ment</strong> results…


Question: “I got Rwork=18% and Rfree=23% after <strong><strong>refine</strong>ment</strong>, is it a good<br />

result?”<br />

- A very common question<br />

- Answer depends on various factors<br />

Answer:<br />

Quick model quality evaluation<br />

- Yes, it’s likely a good result if the data resolution is around 2.5 Å.<br />

- No, it is very bad result, if the data resolution is 1.0 Å or higher.<br />

One can ask similar questions about other parameters, such as bond/angles<br />

RMSDs, average B-factors, etc…


POLYGON: Crystallographic Model Quality at a Glance<br />

Say you are refining a <strong>structure</strong> at 1.0 Å resolution and the R-factors are:<br />

R work = 18% and R free is 23%.<br />

- Are these values good? Is <strong><strong>refine</strong>ment</strong> completed?<br />

PDB statistics: histograms for R work , R free , R free -R work for all similar <strong>structure</strong>s:<br />

Rwork at 0.9-1.1Å<br />

0.10 - 0.12: 68<br />

0.12 - 0.14: 94<br />

0.14 - 0.16: 73<br />

0.16 - 0.18: 17


New tool: <strong>phenix</strong>.polygon<br />

Likely good model<br />

This model needs some attention<br />

Acta Cryst. (2009). D65, 297-300


R-factors (all models in PDB at resolution < 1 Å )<br />

R-factor, %<br />

Resolution, Å<br />

PDB code<br />

(year)<br />

published<br />

R-work, %<br />

<strong>phenix</strong>.<strong>refine</strong><br />

2ppn (2007) 20.9 11.7<br />

1g2y (2000) 19.5 12.3<br />

1zlb (2005) 16.8 12.0<br />

2g6f (2006) 18.4 12.9<br />

2elg (2007) 23.2 13.0<br />

1aho (1997) 16.3 9.6<br />

1zf5 (2005) 29.0 16.9<br />

There are ~25 models out of 324 that have suspiciously high or very high R-<br />

factors.<br />

- For most of them the R-factors can be decreased to typical for this<br />

resolution values (~10-15%) in one <strong>phenix</strong>.<strong>refine</strong> run.<br />

<strong>Automated</strong> software <strong>with</strong> integrated validation would immediately flag these<br />

models as suspicious.


PDB: 1eic<br />

Resolution: 1.4Å<br />

Deposition year:<br />

2000<br />

PUBLISHED:<br />

Rwork = 20%<br />

Rfree = 25%


Under-<strong>refine</strong>d models or why automation is important<br />

Structure from PDB: 1eic (resolution = 1.4Å; deposition year: 2000)<br />

PUBLISHED: Rwork = 20% Rfree = 25%<br />

Clear problems:<br />

- No ‘riding’ H atoms;<br />

- All atoms are isotropic;<br />

Potential problems<br />

- Inoptimal weights, <strong><strong>refine</strong>ment</strong> is not converged, incomplete solvent model<br />

Fixing the model <strong>with</strong> PHENIX:<br />

- Add and <strong>refine</strong> H as riding model<br />

- Update ordered solvent<br />

- Refine atoms as anisotropic (except H and water)<br />

- Optimize X-ray/Restraints weights<br />

FINAL MODEL: Rwork = 14% Rfree = 17%<br />

All this could be done by the software automatically, preventing deposition of under<strong>refine</strong>d<br />

models into PDB


Final notes…


www.<strong>phenix</strong>-online.org<br />

This<br />

presentation<br />

(PDF file)


Reporting bugs, problems, asking questions<br />

Something didn’t work as expected?... program crashed?... missing<br />

feature?...<br />

Not Good: silently give up and run away looking for alternative software.<br />

Good: report us a problem, ask a question, request a feature (explain why<br />

it’s good to have), ask for help.<br />

Reporting a bug:<br />

Not good: “Hi! <strong>phenix</strong>.<strong>refine</strong> crashed, I don’t know what to do.”<br />

Good: “Hi! <strong>phenix</strong>.<strong>refine</strong> crashed. Here are:<br />

1) PHENIX version;<br />

2) Command and parameters I used;<br />

3) Input and output files (at least logs).”<br />

Subscribe to PHENIX bulletin board: www.<strong>phenix</strong>-online.org


PHENIX use (April 28, 2010)<br />

Number of <strong>structure</strong>s in PDB <strong>with</strong> “REMARK 3 PROGRAM PHENIX”<br />

948<br />

435<br />

1 7<br />

96<br />

163<br />

2005 2006 2007 2008<br />

Year<br />

2009 2010<br />

User tested !


Acknowledgments<br />

All PHENIX developers<br />

All PHENIX users<br />

Non-PHENIX colleagues for scientific support, fruitful collaboration and<br />

valuable feedback:<br />

Sacha Urzhumtsev, Vladimir Lunin, Dale Tronrud, Dusan Turk<br />

• Funding:<br />

- NIH/NIGMS:<br />

P01GM063210, P50GM062412, P01GM064692, R01GM071939<br />

- Lawrence Berkeley Laboratory<br />

- PHENIX Industrial Consortium

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!