Reviews in Computational Chemistry Volume 18
54 The Use of Scoring Functions in Drug Discovery Applications
that use this concept are called “empirical scoring functions.” Several reviews
summarize details of individual parameterizations. 17,146–151 The individual
terms in empirical scoring functions are usually chosen to cover, in an intuitive
way, the important contributions to the total binding free energy. Most empirical
scoring functions are derived by evaluating the functions f_i for a set of
protein–ligand complexes and fitting the coefficients ΔG_i to the experimental
binding affinities of these complexes by multiple linear regression or by
supervised learning techniques. The relative weight of the individual contributions
depends on the training set. Usually, between 50 and 100 complexes are
used to derive the weighting factors, but a recent study showed that
many more than 100 complexes were needed to achieve convergence. 75 This
large number is probably needed because the publicly available
protein–ligand complexes fall into a few heavily populated classes of
proteins, so that in small sets of complexes a few interaction types dominate.
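The fitting step described above can be illustrated with a minimal sketch. It assumes four hypothetical term values f_i per complex, synthetic affinities generated from arbitrary coefficients, and an ordinary least-squares fit through the normal equations; none of the numbers come from a published parameterization. In practice a numerical library routine would normally replace the hand-written solver.

```python
import random

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_coefficients(term_values, affinities):
    """Least-squares fit of affinity = c0 + sum_i c_i * f_i."""
    X = [[1.0] + list(row) for row in term_values]  # leading 1.0 = constant term
    p = len(X[0])
    XtX = [[sum(x[j] * x[k] for x in X) for k in range(p)] for j in range(p)]
    Xty = [sum(x[j] * y for x, y in zip(X, affinities)) for j in range(p)]
    return solve_linear_system(XtX, Xty)

# Synthetic training set (assumption): 80 complexes, 4 terms each.
random.seed(0)
true_coeffs = [5.0, -1.0, -2.0, -0.1, 0.5]  # arbitrary; offset first
terms = [[random.uniform(0.0, 10.0) for _ in range(4)] for _ in range(80)]
affinities = [
    true_coeffs[0]
    + sum(c * f for c, f in zip(true_coeffs[1:], row))
    + random.gauss(0.0, 0.5)  # simulated experimental noise
    for row in terms
]

fitted = fit_coefficients(terms, affinities)
```

With a training set that samples all terms evenly, as here, the fitted weights land close to the generating values. If a few interaction types dominated the set, the remaining coefficients would be poorly determined, which is the convergence problem noted above.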
Empirical scoring functions usually contain individual terms for hydrogen
bonds, ionic interactions, hydrophobic interactions, and binding
entropy. Hydrogen bonds are typically scored by counting the number
of donor–acceptor pairs that fall within a distance and angle range considered
favorable for hydrogen bonding, with each pair weighted by penalty functions for
deviations from preset ideal values. 56,71,73 The amount of error tolerance in
these penalty functions is critical to the success of the scoring methodology.
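A hedged sketch of such a counting scheme follows. The linear ramp, the ideal distance and angle, and the tolerance and cutoff values are all assumptions made for illustration; the function names are hypothetical and do not correspond to any particular program.

```python
import math

def linear_penalty(deviation, tolerance, cutoff):
    """Return 1.0 inside `tolerance`, falling linearly to 0.0 at `cutoff`."""
    deviation = abs(deviation)
    if deviation <= tolerance:
        return 1.0
    if deviation >= cutoff:
        return 0.0
    return 1.0 - (deviation - tolerance) / (cutoff - tolerance)

def hbond_score(pairs, ideal_dist=1.9, ideal_angle=180.0,
                dist_tol=0.2, dist_cut=0.6,
                angle_tol=30.0, angle_cut=80.0):
    """Sum penalty-weighted contributions over donor-acceptor pairs.

    `pairs` is a list of (distance in angstroms, angle in degrees) tuples.
    All default values are illustrative assumptions.
    """
    total = 0.0
    for dist, angle in pairs:
        total += (linear_penalty(dist - ideal_dist, dist_tol, dist_cut)
                  * linear_penalty(angle - ideal_angle, angle_tol, angle_cut))
    return total

# Two nearly identical poses of one hydrogen bond (hypothetical geometry).
pose_a = [(2.1, 175.0)]
pose_b = [(2.4, 175.0)]
strict = [hbond_score(p, dist_tol=0.05, dist_cut=0.3) for p in (pose_a, pose_b)]
lenient = [hbond_score(p, dist_tol=1.0, dist_cut=2.0) for p in (pose_a, pose_b)]
```

Under the lenient settings both poses receive the full contribution and cannot be told apart. Under the strict settings the hydrogen bond counts partially in the first pose and not at all in the second, so the score responds sharply to a small change in geometry.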
When large deviations from ideality are tolerated, the scoring function may
be unable to discriminate between different orientations of a ligand. Conversely,
small tolerances lead to situations in which many structurally similar complex
structures receive very different scores. Attempts have been made to
reduce the localized nature of such interaction terms by using continuous
modulating functions on an atom-pair basis. 69 Other workers have avoided the use
of penalty functions altogether and introduced separate regression coefficients
for strong, medium, and weak hydrogen bonds. 75 At Agouron, a
simple four-parameter potential called the piecewise linear potential
(PLP) was developed; it approximates a potential well without
angular terms. 62 Most empirical scoring functions treat all types of
hydrogen-bond interactions equally, but some attempts have been made to
distinguish between different donor–acceptor functional group pairs.
Hydrogen-bond scoring in the docking program GOLD, 60,61 for example, is based on
a list of hydrogen-bond energies for all combinations of 12 defined donor and
6 acceptor atom types, derived from ab initio calculations on model systems
incorporating those atom types. A similar differentiation of donor and acceptor
groups is made in the hydrogen-bond functions of the program GRID, 152 a
program commonly used for the characterization of binding sites. 115–117 The
inclusion of such lookup tables in scoring functions is presumed to avoid
errors originating from the oversimplification of individual interactions.
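The piecewise linear potential mentioned above can be sketched as a distance-only function built from straight segments. The sketch below assumes that the four parameters are the distance breakpoints a < b < c < d, and it adds a well depth and a repulsion height as separate, assumed energy scales. The functional form and all numerical values are illustrative and are not taken from the cited work.

```python
def plp(r, a, b, c, d, well_depth, repulsion):
    """Piecewise linear approximation of a potential well (no angular terms).

    r          : interatomic distance
    a, b, c, d : distance breakpoints, with a < b < c < d (assumed meaning
                 of the four parameters)
    well_depth : depth of the flat well between b and c (positive number)
    repulsion  : energy at r = 0 on the repulsive branch (positive number)
    """
    if r < a:
        return repulsion * (a - r) / a            # repulsive wall
    if r < b:
        return -well_depth * (r - a) / (b - a)    # descent into the well
    if r < c:
        return -well_depth                        # flat bottom
    if r < d:
        return -well_depth * (d - r) / (d - c)    # ascent back to zero
    return 0.0                                    # no interaction

# Hypothetical parameter set, for illustration only.
params = dict(a=2.0, b=2.5, c=3.0, d=3.5, well_depth=1.0, repulsion=10.0)
profile = [(r / 4.0, plp(r / 4.0, **params)) for r in range(4, 17)]
```

Because the well has a flat bottom, every distance between b and c scores equally, which gives a built-in tolerance without a separate penalty function.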
Reducing the weight of hydrogen bonds formed at the outer surface of
the binding site is a useful measure for reducing the number of false positive