PNNL-13501 - Pacific Northwest National Laboratory
Study Control Number: PN00077/1484<br />
Rapid NMR Determination of Large Protein Structures<br />
Paul D. Ellis, Michael A. Kennedy, Robert Wind<br />
Functional and structural genomics is the science that correlates protein structure with function in living cells, and<br />
proteomics is the study of the entire complement of proteins expressed by a cell. This project seeks to develop new<br />
methods of nuclear magnetic resonance (NMR) analysis of large protein structures so that data collection times for<br />
analyses are significantly reduced, allowing more proteins to be studied.<br />
Project Description<br />
Dielectric losses associated with biological NMR samples<br />
at high magnetic fields (>17 T) ultimately determine the<br />
potential NMR signal-to-noise (S/N) ratio. These losses<br />
lengthen, quadratically, the time<br />
needed to acquire the NMR data sets required to deduce the<br />
three-dimensional structure of a protein in solution. The<br />
goal of this research is to develop technical means for<br />
minimizing dielectric losses in biological NMR<br />
experiments. If successful, this could significantly<br />
reduce NMR data collection times and so enable more<br />
rapid structure determinations. One of the most<br />
innovative new ideas for reducing the dielectric losses in<br />
biological NMR samples is to encapsulate proteins in<br />
reverse micelles dissolved in low-dielectric bulk solvents.<br />
Preparing and characterizing such systems by small-angle<br />
x-ray scattering and NMR spectroscopy will be the<br />
primary focus of this project.<br />
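The quadratic dependence noted above follows from standard signal averaging. As a brief sketch, assume the noise in successive scans is uncorrelated, so that co-adding $n$ scans improves S/N by a factor of $\sqrt{n}$. The symbols $n$, $k$, and $t$ are introduced here for illustration and do not appear in the report: $n$ is the number of scans, $k > 1$ is the factor by which losses reduce the per-scan S/N, and $t$ is the total acquisition time at a fixed time per scan.

```latex
\begin{align}
  \left(\frac{S}{N}\right)_{n} &= \sqrt{n}\,\left(\frac{S}{N}\right)_{1}
  &&\Longrightarrow&
  n &= \left[\frac{(S/N)_{\mathrm{target}}}{(S/N)_{1}}\right]^{2}, \\
  \left(\frac{S}{N}\right)_{1} &\to \frac{1}{k}\left(\frac{S}{N}\right)_{1}
  &&\Longrightarrow&
  n' &= k^{2}\,n, \qquad \frac{t'}{t} = k^{2}.
\end{align}
```

Under these assumptions, a twofold loss in per-scan S/N ($k = 2$) would require four times as many scans, and hence four times the acquisition time, to reach the same final S/N.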
Introduction<br />
In the post-genomic era, the scientific focus is changing<br />
from determining the complete DNA sequence of the<br />
human genome to characterizing the gene products. The<br />
expected number of proteins from the human genome is<br />
~100,000. Proteomics, defined as the study of the entire<br />
complement of proteins expressed by a particular cell,<br />
organism, or tissue type at a given time for a given<br />
disease state or a specific set of environmental conditions,<br />
promises to bridge the gap between genome sequence and<br />
cellular behavior. An integral challenge involves<br />
characterizing the biological functions of these<br />
proteins and analyzing the corresponding three-dimensional<br />
structures. These tasks are referred to as functional<br />
and structural genomics, respectively. Ultimately, the<br />
goal of structural genomics is to determine the structure<br />
of a sufficiently large subset of the approximately<br />
100,000 proteins such that a “basis” structure could be<br />
formulated. This basis would, in turn, be used as a<br />
predictor of the remaining structures while simultaneously<br />
providing a rationale for the observed function of all of<br />
the gene products.<br />
Currently, x-ray crystallography requires diffraction-quality<br />
crystals, selenium-labeled proteins, and access to<br />
high-intensity light sources available at the DOE national<br />
laboratory synchrotrons. Given these conditions, x-ray<br />
data sets can be collected in a matter of 2 to 4 hours, and<br />
there is virtually no limitation with respect to protein size.<br />
However, dynamically or statically disordered regions of<br />
proteins are invisible to x-ray crystallography. NMR<br />
spectroscopy has several unique capabilities that are<br />
complementary to x-ray crystallography: 1) proteins can<br />
be examined under physiological solution-state<br />
conditions; 2) dynamic regions of the proteins can be well<br />
characterized; and 3) intermolecular complexes can be easily<br />
studied as a function of pH, ionic strength, etc. However,<br />
NMR has two significant limitations when<br />
compared to x-ray crystallography: 1) data sets currently<br />
take about 60 days to collect, and 2) the size of proteins<br />
amenable to NMR structure determination is<br />
limited to ~50 kDa.<br />
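To give a sense of scale, the per-structure collection times above (2 to 4 hours for x-ray, about 60 days for NMR) can be multiplied out over the ~100,000 expected proteins. The following is a minimal sketch of that arithmetic, not the authors' calculation: it assumes a single instrument running continuously and a 365-day year, and the function and constant names are hypothetical.

```python
# Back-of-the-envelope totals for collecting one data set per protein.
# Assumptions (not stated in the report): one instrument, continuous
# operation, 365-day years. All names here are illustrative.

N_PROTEINS = 100_000          # expected number of human proteins (from the text)
DAYS_PER_YEAR = 365
HOURS_PER_YEAR = 24 * DAYS_PER_YEAR


def xray_years(hours_per_dataset, n_proteins=N_PROTEINS):
    """Total synchrotron time in years for n_proteins x-ray data sets."""
    return n_proteins * hours_per_dataset / HOURS_PER_YEAR


def nmr_years(days_per_dataset, n_proteins=N_PROTEINS):
    """Total NMR instrument time in years for n_proteins data sets."""
    return n_proteins * days_per_dataset / DAYS_PER_YEAR


if __name__ == "__main__":
    for hours in (2, 4):
        print(f"x-ray, {hours} h per data set: {xray_years(hours):.1f} years")
    print(f"NMR, 60 days per data set: {nmr_years(60):.0f} years")
```

With these assumptions the x-ray total comes to roughly 23 to 46 years and the NMR total to roughly 16,400 years, a difference of nearly three orders of magnitude.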
If the structures of a large subset of the approximately<br />
100,000 proteins must be determined, then x-ray<br />
crystallography data collection would require at most<br />
between 24 and 45 years of synchrotron time (assuming<br />
protein crystals of sufficient quality are obtainable<br />
in all cases). On the other hand, for this same<br />
subset, given the current technology, NMR spectroscopy<br />
would require approximately 16,438 years of NMR time.<br />
Current estimates of the cost per structure<br />
run from about $50,000 to $100,000.<br />
The total cost of such an effort would therefore be between<br />
$5 billion and $10 billion. Ideally, one could imagine<br />
dividing the task of structural genomics equally between<br />
x-ray crystallography and NMR, but, at the current state<br />