12.07.2015 Views

QSAR & QSPR


Quantitative Structure-Activity Relationships
Quantitative Structure-Property Relationships
QSAR & QSPR
Alexandre Varnek
Faculté de Chimie, ULP, Strasbourg, FRANCE


History of QSAR


Discoverer of the Periodic Table: an early “Chemoinformatician”
Dmitry Mendeléev (1834 – 1907)
• Russian chemist who arranged the 63 known elements into a periodic table based on atomic mass, which he published in Principles of Chemistry in 1869. Mendeléev left space for new elements and predicted three yet-to-be-discovered elements: Ga (1875), Sc (1879) and Ge (1886).


Periodic Table
Chemical properties of elements vary gradually along the two axes.


History of QSAR
• 1869, D. Mendeleev – The Periodic Table of Elements.
• 1868, A. Crum-Brown and T.R. Fraser – suggested that the physiological activity of molecules depends on their constitution: Activity = F(structure). They studied a series of quaternized strychnine derivatives, some of which possess activity similar to curare in paralyzing muscle.
• 1869, B.W. Richardson – the narcotic effect of primary alcohols varies in proportion to their molecular weights.


History of QSAR
• 1893, C. Richet showed that the toxicities of some simple organic compounds (ethers, alcohols, ketones) were inversely related to their solubility in water.
• 1899, H. Meyer and 1901, E. Overton found that the potencies of narcotic compounds vary with their oil/water partitioning (the forerunner of logP).
• 1904, J. Traube found a linear relation between narcosis and surface tension.


History of QSAR
• 1937, L.P. Hammett studied the chemical reactivity of substituted benzenes: Hammett equation, Linear Free Energy Relationship (LFER).
• 1939, J. Ferguson formulated a concept linking narcotic activity, logP and thermodynamics.
• 1952 – 1956, R.W. Taft devised a procedure for separating polar, steric and resonance effects.


History of QSAR
• 1964, C. Hansch and T. Fujita: the biologist’s Hammett equation.
• 1964, Free and Wilson: QSAR on fragments.
• 1970s – 1980s: development of 2D QSAR (descriptors, mathematical formalism).
• 1980s – 1990s: development of 3D QSAR (pharmacophores, CoMFA, docking).
• 1990s – present: virtual screening.


1934 - Hammett
R        H       CH3     OCH3    F       Cl      NO2
ortho    6.27    12.3    8.06    54.1    11.4    671
meta     6.27    5.35    8.17    13.6    14.8    32.1
para     6.27    4.24    3.38    7.22    10.5    37.0


1934 - Hammett
Substituent    σ (meta)    σ (para)
O-             -0.708      -1.00
OH             +0.121      -0.37
OCH3           +0.115      -0.268
NH2            -0.161      -0.660
CH3            -0.069      -0.170
(CH3)3Si       -0.121      -0.072
C6H5           +0.06       -0.01
H               0.000       0.000
SH             +0.25       +0.15
SCH3           +0.15        0.00
F              +0.337      +0.062
Cl             +0.373      +0.227
CO2H           +0.355      +0.406
COCH3          +0.376      +0.502
CF3            +0.43       +0.54
SO2Ph          +0.61       +0.70
NO2            +0.710      +0.778
N(CH3)3+       +0.88       +0.82
N2+            +1.76       +1.91
S(CH3)2+       +1.00       +0.90
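The two Hammett slides can be cross-checked numerically. The sketch below assumes that the first table lists acid dissociation constants of substituted benzoic acids as K_a × 10^5 (the slide gives no units, so this reading is an assumption) and applies the usual definition σ = log10(K_X / K_H). The dictionary and function names are illustrative only.

```python
import math

# Assumed to be K_a x 10^5 of substituted benzoic acids (units are not given on the slide).
KA = {
    "meta": {"H": 6.27, "CH3": 5.35, "OCH3": 8.17, "F": 13.6, "Cl": 14.8, "NO2": 32.1},
    "para": {"H": 6.27, "CH3": 4.24, "OCH3": 3.38, "F": 7.22, "Cl": 10.5, "NO2": 37.0},
}

def hammett_sigma(substituent, position):
    """Return sigma = log10(K_X / K_H) for the given ring position."""
    row = KA[position]
    return math.log10(row[substituent] / row["H"])

for position in ("meta", "para"):
    for substituent in ("CH3", "OCH3", "F", "Cl", "NO2"):
        sigma = hammett_sigma(substituent, position)
        print(f"{position:5s} {substituent:5s} sigma = {sigma:+.3f}")
```

Under that assumption the computed values land close to the tabulated σ constants (for example about +0.77 for para-NO2 against the listed +0.778), which supports reading the first table as ionization constants.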


Steric effects
Taft quantified the steric (spatial) effects using the hydrolysis of esters:
[reaction scheme not recovered from the slide]
Here, the size of R affects the rate of reaction by blocking nucleophilic attack by water. In this case, the steric effects were quantified by the Taft parameter E_s:
[equation not recovered from the slide]
k is the rate constant for ester hydrolysis. This expression is analogous to the Hammett equation.


E_s Values for Various Substituents
Substituent   H      Me      Pr      t-Bu    F       Cl      Br      OH      SH      NO2     C6H5    CN      NH2
E_s           0.0    -1.24   -1.60   -2.78   -0.46   -0.97   -1.16   -0.55   -1.07   -2.52   -3.82   -0.51   -0.61
Compare some extreme values:
• H, 0.00: the reference substituent in the Taft equation
• Me, -1.24: little steric resistance to hydrolysis
• t-Bu, -2.78: large resistance to hydrolysis
Note: H is usually used as the reference substituent (E_s(0)). When another group, such as methyl (Me), is used as the reference instead, as in the chemical equation above, the value for H becomes 1.24.
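The defining formula for E_s was an image on the slide and did not survive extraction. The standard form of the Taft steric parameter, given here as an assumption about what the slide showed, and the effect of changing the reference substituent can be written as follows, where k_X is the hydrolysis rate constant of the substituted ester and k_0 that of the reference ester:

```latex
E_s = \log k_X - \log k_0 = \log\frac{k_X}{k_0}

% Changing the reference from H to Me shifts every value by the same amount:
E_s^{(\mathrm{Me})}(X) = E_s^{(\mathrm{H})}(X) - E_s^{(\mathrm{H})}(\mathrm{Me})
                       = E_s^{(\mathrm{H})}(X) + 1.24

% Examples with the tabulated values:
E_s^{(\mathrm{Me})}(\mathrm{H}) = 0.00 + 1.24 = +1.24, \qquad
E_s^{(\mathrm{Me})}(t\text{-Bu}) = -2.78 + 1.24 = -1.54
```

This is why the note on the slide says that the value for H becomes 1.24 when methyl is the reference.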


Steric effects
E_s may be used in other chemical reactions and to explain biological activities, for example the hydrolysis of inhibitors of acetylcholinesterase. Organophosphates must be hydrolysed to be active, and it is observed that their biological activity is directly related to the Taft steric parameter E_s for the substituent R by the equation:
[equation not recovered from the slide]


Octanol/water partition coefficient
Usually, logP is used instead of P.
• logP > 0: the compound prefers hydrophobic (nonpolar) media
• logP < 0: the compound prefers polar media


Biological activity as a function of logP


Hansch Analysis
Biological Activity = log 1/C, where C is the drug concentration that causes a standard response (EC50, GI50, etc.)
Biological Activity = f(EL, ST, HPh) + constant
• EL (electronic descriptor): Hammett constant σ (σ_m, σ_p, σ_p0, σ_p+, σ_p-, R, F)
• ST (steric descriptor): Taft steric constant
• HPh (hydrophobicity descriptor): π hydrophobic substituent constant, log P octanol/water partition coefficient
$\log(1/C) = a(\log P)^2 + b\log P + \rho\sigma + \delta E_s + \text{constant}$
Hansch, C.; Fujita, T. J. Am. Chem. Soc., 1964, 86, 1616.
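A minimal sketch of how the Hansch equation is evaluated. The coefficients below are hypothetical and chosen only to show the parabolic dependence on logP; they are not fitted to any data set. With a negative quadratic coefficient the model has an optimum at logP = -b / (2a).

```python
def hansch_log_inv_c(log_p, sigma, e_s, a, b, rho, delta, const):
    """log(1/C) = a*(logP)^2 + b*logP + rho*sigma + delta*E_s + const."""
    return a * log_p ** 2 + b * log_p + rho * sigma + delta * e_s + const

def optimal_log_p(a, b):
    """logP at the vertex of the parabola; it is a maximum only when a < 0."""
    if a >= 0:
        raise ValueError("no maximum: the quadratic coefficient must be negative")
    return -b / (2 * a)

# Hypothetical coefficients, for illustration only.
coeffs = dict(a=-0.25, b=1.0, rho=0.8, delta=0.3, const=2.0)

best = optimal_log_p(coeffs["a"], coeffs["b"])
# sigma and E_s for a para-Cl substituent, taken from the tables in this deck.
activity = hansch_log_inv_c(best, sigma=0.227, e_s=-0.97, **coeffs)
print(best, activity)
```

In practice the coefficients a, b, ρ, δ and the constant are obtained by regression over a congeneric series.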


Hansch Analysis
Biological Activity = f(Physicochemical properties) + constant
• Physicochemical properties can be broadly classified into three general types:
• Electronic
• Steric
• Hydrophobic


Descriptors


Quantitative Structure Activity Relationship (QSAR)
Quantitative structure-activity relationships correlate, within congeneric series of compounds, their chemical or biological activities, either with certain structural features or with atomic, group or molecular descriptors.
Molecular Structure -> (Representation) -> Descriptors -> (Feature Selection & Mapping) -> ACTIVITIES
Katritzky, A. R.; Lobanov, V. S.; Karelson, M. Chem. Soc. Rev. 1995, 24, 279-287


Definition of molecular descriptor
The molecular descriptor is the final result of a logical and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment.
Roberto Todeschini and Viviana Consonni


A complete description of all the molecular descriptors is given in:
Handbook of Molecular Descriptors
Roberto Todeschini and Viviana Consonni
Methods and Principles in Medicinal Chemistry, Volume 11
Edited by: R. Mannhold, H. Kubinyi, H. Timmerman
WILEY-VCH, Weinheim, Germany, 2000


Descriptors from Codessa Pro
Descriptors: calculable molecular attributes that govern particular macroscopic properties
Descriptor Families:
• Topological
• Fragments
• Receptor surface
• Structural
• Information-content
• Spatial
• Electronic
• Thermodynamic
• Conformational
• Quantum mechanical
Products Plus Molecular and Quantum Methods


Molecular Descriptors
Classification based on the dimensionality of structure representation
• 1D (atom counts, MW, number of functional groups, …)
• 2D (topological indices, BCUT, TPSA, Shannon entropy, …)
• 3D (geometrical parameters, molecular surfaces, parameters calculated in quantum chemistry programs, …)


Molecular Descriptors: 1D


Constitutional descriptors
• number of atoms
• absolute and relative numbers of C, H, O, S, N, F, Cl, Br, I, P atoms
• number of bonds (single, double, triple and aromatic bonds)
• number of benzene rings, number of benzene rings divided by the number of atoms
• molecular weight and average atomic weight
• number of rotatable bonds (all terminal H atoms are ignored)
• Hbond acceptor: number of hydrogen bond acceptors
• Hbond donor: number of hydrogen bond donors
These simple descriptors reflect only the molecular composition of the compound, without using the geometry or electronic structure of the molecule.
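Several of the constitutional descriptors listed above need nothing more than the molecular formula. The following is a minimal sketch, assuming a simple formula string without brackets or charges; the function name and the small atomic-weight table are illustrative, not part of any particular package.

```python
import re

# Rounded standard atomic weights for the elements named on the slide.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
                 "F": 18.998, "P": 30.974, "S": 32.06, "Cl": 35.45,
                 "Br": 79.904, "I": 126.904}

def constitutional_descriptors(formula):
    """Atom counts, relative counts, molecular weight and average atomic weight."""
    counts = {}
    # Each match is (element symbol, optional count), e.g. ("C", "2") or ("O", "").
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    n_atoms = sum(counts.values())
    mw = sum(ATOMIC_WEIGHT[s] * n for s, n in counts.items())
    return {
        "n_atoms": n_atoms,
        "counts": counts,
        "relative_counts": {s: n / n_atoms for s, n in counts.items()},
        "molecular_weight": mw,
        "average_atomic_weight": mw / n_atoms,
    }

print(constitutional_descriptors("C2H6O"))  # ethanol
```

Bond counts, ring counts and hydrogen-bond donors or acceptors require the connectivity as well, so they are outside the scope of this formula-only sketch.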


Molecular Descriptors: 2D


Topological Descriptors
Descriptors based on the molecular graph representation are widely used in QSPR and QSAR studies because they help to differentiate molecules, mostly according to their size, degree of branching, flexibility and overall shape.


TI based on the adjacency matrix
• Total adjacency index: $A = \frac{1}{2}\sum_{i,j=1}^{n} a_{ij}$
• For G1 and G2, A = 5.
• This TI can only distinguish between structures having a different number of cycles (for cyclohexane, A = 6).
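The total adjacency index can be sketched directly from an edge list. The graphs G1 and G2 were drawings on the slide and are not in the text; the sketch assumes they are the hydrogen-suppressed skeletons of 2,3-dimethylbutane and n-hexane, which is consistent with the vertex-degree counts used on the Zagreb slide.

```python
def adjacency_matrix(n_atoms, edges):
    """Symmetric 0/1 adjacency matrix of a hydrogen-suppressed molecular graph."""
    a = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a

def total_adjacency_index(a):
    """A = (1/2) * sum of all matrix entries, i.e. the number of edges."""
    return sum(sum(row) for row in a) // 2

# Assumed structures (the slide drawings are not available in the text).
G1 = [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)]   # 2,3-dimethylbutane skeleton
G2 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # n-hexane skeleton
CYCLOHEXANE = G2 + [(5, 0)]                     # ring closure adds one edge

for name, edges in (("G1", G1), ("G2", G2), ("cyclohexane", CYCLOHEXANE)):
    print(name, total_adjacency_index(adjacency_matrix(6, edges)))
```

Because A simply counts edges, all acyclic isomers with the same number of heavy atoms share one value, which is the limitation the slide points out.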


TI based on the adjacency matrix: Zagreb group indices
$M_1 = \sum_{i=1}^{n} \delta_i^2 \qquad M_2 = \sum_{\text{edges } (i,j)} \delta_i \delta_j$
where the vertex degree $\delta_i$ is the number of σ bonds involving atom i, excluding bonds to H atoms.
Zagreb group indices were introduced to characterize branching.


Zagreb group indices
$M_1 = \sum_{i=1}^{n} \delta_i^2 \qquad M_2 = \sum_{\text{edges } (i,j)} \delta_i \delta_j$
M1(G1) = 4*1^2 + 2*3^2 = 22
M2(G1) = 4*(1*3) + 1*(3*3) = 21
M1(G2) = 2*1^2 + 4*2^2 = 18
M2(G2) = 2*(1*2) + 3*(2*2) = 16
Randić’s molecular connectivity index
Randić introduced a connectivity index similar to M2:
$\chi_R = \sum_{\text{edges } (i,j)} (\delta_i \delta_j)^{-1/2}$
M. Randić, J. Am. Chem. Soc., 97, 6609 (1975).
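The worked values above can be reproduced with a few lines of code. As before, G1 and G2 are assumed to be the skeletons of 2,3-dimethylbutane and n-hexane (the drawings are not in the text, but these graphs match the degree counts in the arithmetic). Function names are illustrative.

```python
def vertex_degrees(n_atoms, edges):
    """Number of heavy-atom neighbours of every vertex."""
    degrees = [0] * n_atoms
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

def zagreb_m1(n_atoms, edges):
    """M1: sum of squared vertex degrees."""
    return sum(d * d for d in vertex_degrees(n_atoms, edges))

def zagreb_m2(n_atoms, edges):
    """M2: sum over edges of the product of the end-point degrees."""
    degrees = vertex_degrees(n_atoms, edges)
    return sum(degrees[i] * degrees[j] for i, j in edges)

def randic_index(n_atoms, edges):
    """Randic connectivity index: sum over edges of (d_i * d_j)^(-1/2)."""
    degrees = vertex_degrees(n_atoms, edges)
    return sum((degrees[i] * degrees[j]) ** -0.5 for i, j in edges)

# Assumed structures for the slide's G1 and G2.
G1 = [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)]   # 2,3-dimethylbutane skeleton
G2 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # n-hexane skeleton

for name, edges in (("G1", G1), ("G2", G2)):
    print(name, zagreb_m1(6, edges), zagreb_m2(6, edges), round(randic_index(6, edges), 4))
```

The more branched graph has the larger Zagreb indices and the smaller Randić index, which is the sense in which these indices characterize branching.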


TI based on the Distance Matrix: the Wiener Index
The entry d_ij of the distance matrix indicates the number of edges in the shortest path between vertices i and j.
The Wiener index (the first TI!) accounts for the branching:
W(G1) = 29    W(G2) = 35
Reference: H. Wiener, J. Am. Chem. Soc., 69, 17 (1947)
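A minimal sketch of the Wiener index, computed as the sum of the upper triangle of the topological distance matrix. Distances are obtained by breadth-first search, which is one straightforward choice rather than the method used on the slide. G1 and G2 are again assumed to be 2,3-dimethylbutane and n-hexane.

```python
from collections import deque

def distance_matrix(n_atoms, edges):
    """Shortest-path (edge count) distances; the graph is assumed to be connected."""
    neighbours = [[] for _ in range(n_atoms)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    matrix = []
    for start in range(n_atoms):
        dist = [None] * n_atoms
        dist[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbours[u]:
                if dist[v] is None:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        matrix.append(dist)
    return matrix

def wiener_index(n_atoms, edges):
    """W = sum of d_ij over all vertex pairs i < j."""
    d = distance_matrix(n_atoms, edges)
    return sum(d[i][j] for i in range(n_atoms) for j in range(i + 1, n_atoms))

# Assumed structures for the slide's G1 and G2.
G1 = [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)]   # 2,3-dimethylbutane skeleton
G2 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # n-hexane skeleton

print(wiener_index(6, G1), wiener_index(6, G2))
```

With these assumed structures the sketch returns the values quoted on the slide, the compact branched isomer having the smaller index.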


TPSA - Topological Polar Surface Area
Peter Ertl, Bernhard Rohde, and Paul Selzer, J. Med. Chem. 2000, 43, 3714-3717


TPSA - Topological Polar Surface Area
$\text{3D-PSA} = \sum_{i=1}^{N(\text{fragm})} n_i \, c(\text{fragment}_i)$
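The sum above is a weighted count: n_i is the number of times polar fragment i occurs in the molecule and c(fragment_i) is its tabulated surface contribution. The sketch below illustrates this with four contributions in Å²; the fragment labels are hypothetical keys, and the numbers are given for illustration only and should be checked against the table in the cited paper before any real use.

```python
# Illustrative contributions in square angstroms; verify against the published table.
CONTRIBUTION = {
    "O=": 17.07,    # carbonyl-type oxygen
    "OH": 20.23,    # hydroxyl oxygen
    "-O-": 9.23,    # ether oxygen
    "NH2": 26.02,   # primary amine nitrogen
}

def tpsa(fragment_counts):
    """Sum of n_i * c(fragment_i) over the polar fragments of one molecule."""
    return sum(n * CONTRIBUTION[frag] for frag, n in fragment_counts.items())

print(tpsa({"O=": 1, "OH": 1}))    # a carboxylic acid group
print(tpsa({"O=": 1, "NH2": 1}))   # a primary amide group
```

Because only fragment counts are needed, no 3D geometry has to be generated, which is what makes the approach fast enough for large databases.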


TPSA - Topological Polar Surface Area
3D PSA vs TPSA for 34 810 molecules from the World Drug Index


Geometrical descriptors
• Moments of inertia (rigid rotator approximation): $I = \sum_i m_i d_i^2$
  The moments of inertia characterize the mass distribution in the molecule.
• Radius of gyration: $RoG = \sqrt{\sum_i \frac{x_i^2 + y_i^2 + z_i^2}{N}}$
  N: number of atoms; x_i, y_i, z_i: the atomic coordinates relative to the center of mass
• Shadow indices [1]: surface area projections
• Area: molecular surface area descriptor
  – describes the van der Waals area of the molecule
  – related to binding, transport, and solubility
1. Rohrbaugh, R.H., Jurs, P.C., Anal. Chim. Acta, 1987, 199, 99-109.
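The two formulas above translate directly into code. This is a sketch under two assumptions: the radius of gyration is taken as the square root of the mean squared distance from the center of mass, and the moment of inertia is computed about a user-chosen axis through the center of mass (the slide does not say which axes are used). Names are illustrative.

```python
import math

def center_of_mass(masses, coords):
    total = sum(masses)
    return tuple(sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3))

def radius_of_gyration(masses, coords):
    """sqrt( sum_i (x_i^2 + y_i^2 + z_i^2) / N ), coordinates relative to the center of mass."""
    com = center_of_mass(masses, coords)
    squared = sum(sum((c[k] - com[k]) ** 2 for k in range(3)) for c in coords)
    return math.sqrt(squared / len(coords))

def moment_of_inertia(masses, coords, axis):
    """I = sum_i m_i d_i^2, with d_i the distance of atom i from an axis through the center of mass."""
    com = center_of_mass(masses, coords)
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    total = 0.0
    for m, c in zip(masses, coords):
        r = [c[k] - com[k] for k in range(3)]
        along = sum(r[k] * unit[k] for k in range(3))
        total += m * (sum(x * x for x in r) - along * along)
    return total

# Toy diatomic: two unit masses, 2 length units apart along x.
masses = [1.0, 1.0]
coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(radius_of_gyration(masses, coords))
print(moment_of_inertia(masses, coords, axis=(0.0, 0.0, 1.0)))
```

The principal moments of inertia would come from diagonalizing the full inertia tensor; the single-axis version is kept here for brevity.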


Molecular Descriptors: 3D


Steric parameters
• Length-to-breadth ratio: L/B [1]
• Molecular thickness
• Ovality [2] (ratio of the actual surface area to the minimum surface area):
  $\text{ovality} = \frac{\text{surface area}}{4\pi \left( \frac{3 \times \text{volume}}{4\pi} \right)^{2/3}}$
• Molecular volume
• Sterimol parameters [3] (L axis, B1, B2, B3, B4)
• Taft steric parameter E_s
1. Janini, G.M.; Johnston, K.; Zielinski, W. L. Anal. Chem. 1975, 47, 670.
2. Verloop, A.; Tipker, J. In Biological Activity and Chemical Structure, Buisman, J. A. K. (editors), Elsevier, Amsterdam, Netherlands, 1977, p 63.
3. Kourounakis, A.; Bodor, N. Pharm. Res. 1995, 12(8), 1199.
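The ovality formula compares the actual surface area with the area of a sphere of the same volume, so a perfect sphere gives exactly 1 and any other shape gives a larger value. A minimal sketch (the function name is illustrative):

```python
import math

def ovality(surface_area, volume):
    """Surface area divided by the area of the sphere that has the same volume."""
    minimal_area = 4 * math.pi * (3 * volume / (4 * math.pi)) ** (2 / 3)
    return surface_area / minimal_area

# Sphere of radius 2: area 16*pi, volume 32*pi/3.
print(ovality(16 * math.pi, 32 * math.pi / 3))
# Unit cube: area 6, volume 1.
print(ovality(6.0, 1.0))
```

The surface area and volume must be in consistent units (for example Å² and Å³); the ratio itself is dimensionless.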


Quantum Chemical Descriptors
• Quantitative values calculated in QUANTUM MECHANICS (semi-empirical, empirical, HF ab initio or DFT) calculations
– Atomic charges
– LUMO: lowest unoccupied molecular orbital energy
– HOMO: highest occupied molecular orbital energy
– DIPOLE: dipole moment
• Components of the dipole moment along the inertia axes (D_x, D_y, D_z)
– Hf: heat of formation
– Mean polarizability: $\alpha = \frac{1}{3}(\alpha_{xx} + \alpha_{yy} + \alpha_{zz})$
– EA: electron affinity
– IP: ionization potential
– ΔE: energy of protonation
– Electrostatic potential: $V(\mathbf{r}) = \sum_A \frac{Z_A}{|\mathbf{R}_A - \mathbf{r}|} - \int \frac{\rho(\mathbf{r}')\, d\mathbf{r}'}{|\mathbf{r}' - \mathbf{r}|}$


Lipophilic Descriptors (2D and 3D)


Lipophilic Descriptors
logP(octanol-water), logP(alkane-water), logP(chloroform-water), logP(dichloroethane-water)
Octanol-water partition coefficient
• Hansch-Leo method (ClogP)
• Rekker's method: $\log P = \sum_{n=1}^{N} a_n f_n + \sum_{m=1}^{M} b_m F_m$
• Ghose-Crippen method (calculated logP based on summing contributions of atom types)
• Molecular lipophilicity potential (MLP): $MLP(j) = \sum_{i=1}^{n} \frac{f_i}{1 + d_{ij}}$
The MLP describes how lipophilicity is distributed over the different parts of a molecule (lipophilicity maps and determination of the hydrophilic and lipophilic regions of a molecule).
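Both formulas above are simple sums, sketched below. The reading of the symbols is an assumption based on the usual fragment-constant approach: a_n is the number of occurrences of fragment n with constant f_n, b_m the number of times correction factor F_m applies, and d_ij the distance between fragment i and the point j where the potential is evaluated. All numerical values are hypothetical.

```python
import math

def rekker_log_p(fragments, corrections=()):
    """logP = sum(a_n * f_n) + sum(b_m * F_m).

    fragments:   iterable of (a_n, f_n) pairs
    corrections: iterable of (b_m, F_m) pairs
    """
    return sum(a * f for a, f in fragments) + sum(b * big_f for b, big_f in corrections)

def mlp(point, fragment_positions, fragment_constants):
    """MLP(j) = sum_i f_i / (1 + d_ij) at a point j in space."""
    return sum(f / (1 + math.dist(point, position))
               for f, position in zip(fragment_constants, fragment_positions))

# Hypothetical fragment constants and positions, for illustration only.
print(rekker_log_p([(2, 0.5), (1, -1.0)], corrections=[(1, 0.25)]))
positions = [(1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
constants = [2.0, -4.0]
print(mlp((0.0, 0.0, 0.0), positions, constants))
print(mlp((1.0, 0.0, 0.0), positions, constants))
```

Evaluating `mlp` on a grid of surface points gives the kind of lipophilicity map mentioned on the slide: the value is dominated by whichever fragments are closest to the point.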


Lipophilic Descriptors


Some LogP o/w Extremes in Therapy
[structure drawings omitted]
• arginine: -4.2
• inulin: -3.7
• sucrose: -3.7
• permethrin: 6.5
• clopimozide: 7.1
• hexachlorophene: 7.54


What do these Drugs have in Common?
[structure drawings omitted]
• Irsogladine: LogP o/w = 1.97
• Chloroform: LogP o/w = 1.97
• Secobarbital: LogP o/w = 1.97
• Acetyldigitoxin: LogP o/w = 1.97
• Trandolapril: LogP o/w = 1.97


3D Hydrophobicity
[figure: molecular surfaces colored from hydrophobic to hydrophilic]
All molecules have the same logP (~1.5), but different 3D MLP patterns.


Example of oral administration:
– The drug is exposed to a large variety of pH values:
• Saliva: pH 6.4
• Stomach: pH 1.0 – 3.5
• Duodenum: pH 5 – 7.5
• Jejunum: pH 6.5 – 8
• Colon: pH 5.5 – 6.8
• Blood: pH 7.4
– First-pass effect in the liver
(image: www.3dscience.com)


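Lipophilic Descriptors
• Log D: $D^{\text{system}}_{\text{pH}} = f_N \cdot P_N + f_I \cdot P_I$, where f_N and f_I are the fractions of the neutral and ionized forms at the given pH
• Log P_N: logP of the neutral form
• Log P_I: logP of the ionized form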


logD – The Calculation
• LogD may simply be calculated from the predicted logP and pK_a of the singly ionized species at a certain pH:
• For acids: $\log D(\text{pH}) = \log P - \log\left[1 + 10^{(\text{pH} - \text{p}K_a)}\right]$
• For bases: $\log D(\text{pH}) = \log P - \log\left[1 + 10^{(\text{p}K_a - \text{pH})}\right]$
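The two logD formulas can be sketched as small functions and evaluated along the gastrointestinal tract. The compound below (logP 2.0, acid pK_a 4.0) is hypothetical, and the single pH value chosen for each compartment is an arbitrary point inside the range quoted on the oral-administration slide.

```python
import math

def log_d_acid(log_p, pka, ph):
    """logD(pH) = logP - log10(1 + 10^(pH - pKa)) for a monoprotic acid."""
    return log_p - math.log10(1 + 10 ** (ph - pka))

def log_d_base(log_p, pka, ph):
    """logD(pH) = logP - log10(1 + 10^(pKa - pH)) for a monoprotic base."""
    return log_p - math.log10(1 + 10 ** (pka - ph))

# Representative pH values picked from the ranges on the slide (assumed points).
COMPARTMENT_PH = {"stomach": 2.0, "duodenum": 6.0, "jejunum": 7.0, "blood": 7.4}

# Hypothetical acid: logP = 2.0, pKa = 4.0.
for name, ph in COMPARTMENT_PH.items():
    print(f"{name:9s} pH {ph:3.1f}  logD = {log_d_acid(2.0, 4.0, ph):+.2f}")
```

At pH equal to pK_a half of the compound is ionized and logD is lower than logP by log10(2), about 0.30. Both formulas neglect partitioning of the ionized form, which is the usual simplification.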


Fragment Descriptors
Descriptors: Cl, amide, COOH, Br, Phenyl
[structure drawings omitted]
Cl = 1, amide = 1, COOH = 1, Br = 0, Phenyl = 0


ISIDA Fragment descriptors
Type of Fragments:
I. Sequences
II. Augmented Atoms
[structure drawing omitted]
Sequence I(AB, 2-4): Atoms + Bonds, 2 to 4 atoms
C-N=C-H, C-N=C, N=C-N, C-N, N=C, C-H


ISIDA Fragment descriptors
Type of Fragments:
I. Sequences
II. Augmented Atoms
[structure drawings omitted]
• II(A): no hybridization
• II(Hy): hybridization of neighbours is taken into account


Calculation of Descriptors
DataSet -> ISIDA FRAGMENTOR -> the Pattern matrix
[structure drawings omitted]
Molecule   C-C-C-C-C-C   C=O   C-C-C-N-C-C   C-C-C-N   C-N-C-C*C
1          0             10    1             5         0
2          0             8     1             4         0
3          0             4     1             2         4
Etc.
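The idea of turning a data set into a pattern matrix of fragment counts can be sketched as follows. This is not the ISIDA Fragmentor algorithm, whose details are not given in the slides; it is a simplified illustration that enumerates every simple path of 2 to 4 atoms (atoms plus bonds) in a small molecular graph and counts each path once. The molecule encoding and function names are hypothetical.

```python
from collections import Counter

def sequence_fragments(atoms, bonds, min_len=2, max_len=4):
    """Count atom/bond sequences; atoms is a list of symbols, bonds maps (i, j) to a bond symbol."""
    neighbours = {i: [] for i in range(len(atoms))}
    for (i, j), symbol in bonds.items():
        neighbours[i].append((j, symbol))
        neighbours[j].append((i, symbol))
    counts = Counter()

    def extend(path, tokens):
        # Every path is reached from both ends; count it only in one direction.
        if len(path) >= min_len and path[0] < path[-1]:
            forward = "".join(tokens)
            backward = "".join(reversed(tokens))
            counts[min(forward, backward)] += 1   # direction-independent label
        if len(path) == max_len:
            return
        for nxt, symbol in neighbours[path[-1]]:
            if nxt not in path:
                extend(path + [nxt], tokens + [symbol, atoms[nxt]])

    for start in range(len(atoms)):
        extend([start], [atoms[start]])
    return counts

def pattern_matrix(molecules):
    """Rows are molecules, columns are all fragments found in the data set."""
    per_molecule = [sequence_fragments(atoms, bonds) for atoms, bonds in molecules]
    columns = sorted(set().union(*per_molecule))
    return columns, [[counts[frag] for frag in columns] for counts in per_molecule]

# Two toy molecules (hypothetical): C-N=C-H and C-N.
dataset = [
    (["C", "N", "C", "H"], {(0, 1): "-", (1, 2): "=", (2, 3): "-"}),
    (["C", "N"], {(0, 1): "-"}),
]
columns, rows = pattern_matrix(dataset)
print(columns)
for row in rows:
    print(row)
```

Each sequence is labelled by the lexicographically smaller of its two reading directions, so "N=C" and "C=N" end up in the same column. A production tool would also handle rings, aromatic bonds and fragment pruning, which this sketch ignores.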


PATTERN MATRIX + PROPERTY VALUES (e.g. -0.222, +0.973, -0.066)
-> LEARNING STAGE: building of models
-> VALIDATION STAGE: QSAR models filtering, selection of the most predictive ones
-> QSAR models


Example: linear QSPR model
$\text{Property} = a_0 + \sum_{i=1}^{k} a_i D_i$
PROPERTY_calc = -0.36 * N(C-C-C-N-C-C) + 0.27 * N(C=O) + 0.12 * N(C-N-C*C) + …
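Applying such a linear model to one row of the pattern matrix is a dot product plus an intercept. The sketch below uses only the three coefficients visible on the slide; the remaining terms (the "…") and the intercept a_0 are not given, so the result is a partial sum, and the fragment counts are hypothetical.

```python
def predict_property(fragment_counts, coefficients, intercept=0.0):
    """Property = a_0 + sum_i a_i * D_i, with D_i the count of fragment i."""
    return intercept + sum(a * fragment_counts.get(fragment, 0)
                           for fragment, a in coefficients.items())

# The three terms shown on the slide; further terms are elided there and omitted here.
COEFFICIENTS = {"C-C-C-N-C-C": -0.36, "C=O": 0.27, "C-N-C*C": 0.12}

# Hypothetical fragment counts for one molecule.
counts = {"C-C-C-N-C-C": 1, "C=O": 2, "C-N-C*C": 0}
print(predict_property(counts, COEFFICIENTS))
```

Fragments absent from a molecule contribute zero, so sparse count dictionaries can be passed directly.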


Software


DRAGON
The software DRAGON calculates 1664 molecular descriptors divided into 20 blocks.


CODESSA Pro calculates a large variety of molecular descriptors on the basis of the 3D geometrical structure and/or quantum-chemical parameters; it develops (multi)linear and non-linear QSPR models.


The ISIDA program calculates fragment descriptors; it develops (multi)linear and non-linear QSPR models.
