19.02.2013 Views

Reviews in Computational Chemistry Volume 18



The Use of Scoring Functions in Drug Discovery Applications

that use this concept are called “empirical scoring functions.” Several reviews summarize details of individual parameterizations.[17,146–151] The individual terms in empirical scoring functions are usually chosen to cover, in an intuitive way, the important contributions to the total binding free energy. Most empirical scoring functions are derived by evaluating the functions f_i for a set of protein–ligand complexes and fitting the coefficients G_i to the experimental binding affinities of these complexes by multiple linear regression or by supervised learning techniques. The relative weight of the individual contributions depends on the training set. Usually, between 50 and 100 complexes are used to derive the weighting factors, but a recent study showed that many more than 100 complexes were needed to achieve convergence.[75] This large number is probably needed because the publicly available protein–ligand complexes fall into a few heavily populated classes of proteins, so that in small sets of complexes a few interaction types dominate.
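The regression step described above can be sketched in a few lines. This is a minimal illustration, assuming the score is a constant plus a weighted sum of the terms f_i; the function names, the three descriptor columns, and all numbers are hypothetical and do not come from any published parameterization. The toy affinities are generated from known coefficients so that the fit can be checked.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix from copies so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: descriptors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_coefficients(features: Sequence[Sequence[float]],
                     affinities: Sequence[float]) -> List[float]:
    """Least-squares fit of affinity ~ G_0 + sum_i G_i * f_i (normal equations)."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 yields the constant G_0
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, affinities)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def score(coeffs: Sequence[float], f: Sequence[float]) -> float:
    """Evaluate the fitted function for one complex."""
    return coeffs[0] + sum(g * x for g, x in zip(coeffs[1:], f))


# Hypothetical descriptors per complex: (hydrogen bonds, ionic contacts, lipophilic area).
toy_features = [
    (2, 0, 100.0), (4, 1, 150.0), (1, 0, 300.0),
    (3, 2, 50.0), (0, 1, 200.0), (5, 0, 250.0),
]
# Arbitrary "true" coefficients used only to generate synthetic affinities.
true_coeffs = [1.0, -1.0, -2.0, -0.01]
toy_affinities = [score(true_coeffs, f) for f in toy_features]

fitted = fit_coefficients(toy_features, toy_affinities)
print([round(g, 4) for g in fitted])
```

With noise-free synthetic data the fit recovers the generating coefficients. With real affinities the fitted weights would shift with the composition of the training set, which is the dependence the paragraph describes.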

Empirical scoring functions usually contain individual terms for hydrogen bonds, ionic interactions, hydrophobic interactions, and binding entropy. Hydrogen bonds are typically scored by simply counting the number of donor–acceptor pairs that fall within a distance and angle range considered favorable for hydrogen bonding, weighted by penalty functions for deviations from preset ideal values.[56,71,73] The amount of error tolerance in these penalty functions is critical to the success of the scoring methodology.
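A hydrogen-bond term of this kind might be sketched as follows. The block assumes a simple piecewise-linear penalty that gives full weight inside an inner tolerance and falls to zero at an outer tolerance; the function names, the ideal distance and angle, and the tolerance values are illustrative assumptions of this sketch, not the parameters of any of the cited functions.

```python
from typing import Iterable, Tuple


def tolerance_weight(deviation: float, full: float, zero: float) -> float:
    """Return 1.0 up to `full`, fall linearly to 0.0 at `zero`, and stay 0.0 beyond."""
    d = abs(deviation)
    if d <= full:
        return 1.0
    if d >= zero:
        return 0.0
    return (zero - d) / (zero - full)


def hbond_term(pairs: Iterable[Tuple[float, float]],
               ideal_dist: float = 1.9,        # assumed ideal H...acceptor distance (angstrom)
               ideal_angle: float = 180.0,     # assumed ideal donor-H...acceptor angle (degrees)
               dist_tol: Tuple[float, float] = (0.2, 0.6),
               angle_tol: Tuple[float, float] = (30.0, 80.0)) -> float:
    """Sum over donor-acceptor pairs, each weighted by distance and angle penalties."""
    total = 0.0
    for dist, angle in pairs:
        w_dist = tolerance_weight(dist - ideal_dist, *dist_tol)
        w_angle = tolerance_weight(angle - ideal_angle, *angle_tol)
        total += w_dist * w_angle
    return total


# Two hypothetical poses of the same hydrogen bond: one ideal, one distorted.
ideal_pose = [(1.9, 180.0)]
distorted_pose = [(2.4, 150.0)]

# Tight tolerances separate the two poses ...
tight = (hbond_term(ideal_pose), hbond_term(distorted_pose))
# ... while very wide tolerances give both the same weight.
wide = tuple(
    hbond_term(p, dist_tol=(1.0, 2.0), angle_tol=(60.0, 120.0))
    for p in (ideal_pose, distorted_pose)
)
print(tight, wide)
```

The comparison at the end shows the trade-off in miniature: widening the tolerances removes the ability to rank the two poses, while narrowing them makes the score change steeply with small geometric differences.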

When large deviations from ideality are tolerated, the scoring function may be unable to discriminate between different orientations of a ligand. Conversely, small tolerances lead to situations where many structurally similar complex structures result in very different scores. Attempts have been made to reduce the localized nature of such interaction terms by using continuous modulating functions on an atom-pair basis.[69] Other workers have avoided the use of penalty functions altogether and introduced separate regression coefficients for strong, medium, and weak hydrogen bonds.[75] For example, at Agouron a simple four-parameter potential, called the piecewise linear potential (PLP), was developed that approximates a potential well without angular terms.[62] Most empirical scoring functions treat all types of hydrogen-bond interactions equally, but some attempts have been made to distinguish between different donor–acceptor functional group pairs. Hydrogen-bond scoring in the docking program GOLD,[60,61] for example, is based on a list of hydrogen-bond energies for all combinations of 12 defined donor and 6 acceptor atom types, derived from ab initio calculations on model systems incorporating those atom types. A similar differentiation of donor and acceptor groups is made in the hydrogen-bond functions of the program GRID,[152] which is commonly used for the characterization of binding sites.[115–117] The inclusion of such lookup tables in scoring functions is presumed to avoid errors originating from the oversimplification of individual interactions.
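The idea behind the piecewise linear potential mentioned in this paragraph, a distance-only well built from straight segments, can be illustrated with a generic sketch. The functional form below (four distance breakpoints, a flat minimum, and a linear repulsive wall) and every numerical value are assumptions made for illustration; they are not the published PLP parameters.

```python
from typing import Iterable


def piecewise_linear_well(r: float,
                          a: float = 2.0, b: float = 3.0,
                          c: float = 4.0, d: float = 5.0,
                          depth: float = 1.0,
                          repulsion: float = 10.0) -> float:
    """Distance-only pair potential made of straight segments (hypothetical values).

    r < a       : linear repulsive wall, `repulsion` at r = 0 falling to 0 at r = a
    a <= r < b  : linear descent from 0 to -depth
    b <= r <= c : flat minimum at -depth
    c < r < d   : linear rise from -depth back to 0
    r >= d      : no interaction
    """
    if r < 0.0:
        raise ValueError("distance must be non-negative")
    if r < a:
        return repulsion * (a - r) / a
    if r < b:
        return -depth * (r - a) / (b - a)
    if r <= c:
        return -depth
    if r < d:
        return -depth * (d - r) / (d - c)
    return 0.0


def pairwise_score(distances: Iterable[float]) -> float:
    """Sum the potential over a list of protein-ligand atom-pair distances."""
    return sum(piecewise_linear_well(r) for r in distances)


# Hypothetical atom-pair distances (angstrom) for one ligand pose.
print(pairwise_score([1.0, 2.5, 3.5, 4.5, 6.0]))
```

Because the potential depends only on distance, no angle has to be measured or penalized. The width of the flat minimum plays the role that the tolerance window plays in the penalty-function approach.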

Reducing the weight of hydrogen bonds formed at the outer surface of the binding site is a useful measure for reducing the number of false positive
