25.12.2014 Views

Best practices for chemical data curation and QSAR model ...



Why models may fail

• Incorrect data (structures and activities) in the dataset
• Modeling set is too small
• No external validation
• Incorrect selection of an external test set
• Incorrect division of a dataset into training and test sets
• Incorrect measure of prediction accuracy
• Insufficient statistical criteria to estimate the predictive power of models
• Lack of, or incorrect definition of, the applicability domain
• No Y-randomization test (overfitting)
• Presence of leverage (structure) and activity outliers
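As an illustration of the Y-randomization item above, here is a minimal sketch, not taken from the slides. It assumes a toy one-descriptor least-squares model and uses R² as the fit statistic. The function names (`r_squared`, `y_randomization`) and the synthetic data are hypothetical. The idea is to scramble the activities repeatedly, refit, and check that the scrambled models score far worse than the real one.

```python
import random
import statistics


def r_squared(x, y):
    """R^2 of a one-descriptor least-squares fit (squared Pearson correlation)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def y_randomization(x, y, n_rounds=200, seed=0):
    """Compare the real model's R^2 with R^2 values from scrambled activities."""
    rng = random.Random(seed)
    r2_true = r_squared(x, y)
    r2_random = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the structure-activity link
        r2_random.append(r_squared(x, shuffled))
    # Empirical p-value: share of scrambled models that match or beat the real one
    p_value = (1 + sum(r >= r2_true for r in r2_random)) / (n_rounds + 1)
    return r2_true, r2_random, p_value


# Synthetic data (assumption): activity depends linearly on one descriptor plus noise
noise = random.Random(42)
descriptor = [i / 10 for i in range(30)]
activity = [2.0 * d + noise.gauss(0, 0.3) for d in descriptor]

r2_true, r2_random, p_value = y_randomization(descriptor, activity)
print(f"R2 (real) = {r2_true:.3f}")
print(f"R2 (scrambled, mean) = {statistics.fmean(r2_random):.3f}")
print(f"empirical p-value = {p_value:.3f}")
```

If the scrambled models reach an R² close to that of the real model, the apparent fit is likely due to chance correlation rather than a genuine structure-activity relationship.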

Also, see Dearden JC, Cronin MT, Kaiser KL. How not to develop a quantitative structure-activity or structure-property relationship (QSAR/QSPR). SAR QSAR Environ Res. 2009;20(3-4):241-66
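The leverage outlier and applicability domain items in the list above can be illustrated with a small sketch, again not taken from the slides. It assumes a one-descriptor linear model with an intercept and uses the warning leverage h* = 3(p + 1)/n, one commonly used convention (p descriptors, n training compounds). The function names and numbers are hypothetical.

```python
import statistics


def leverages(train_x, query_x):
    """Leverage of each query value relative to a one-descriptor training set.

    For a linear model with an intercept: h = 1/n + (x - mean)^2 / Sxx.
    """
    n = len(train_x)
    mean_x = statistics.fmean(train_x)
    sxx = sum((v - mean_x) ** 2 for v in train_x)
    return [1.0 / n + (q - mean_x) ** 2 / sxx for q in query_x]


def warning_leverage(n_compounds, n_descriptors=1):
    """Assumed threshold convention: h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_compounds


# Hypothetical descriptor values for training and query compounds
train = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
queries = [2.2, 9.0]

h_star = warning_leverage(len(train))
inside_domain = [h <= h_star for h in leverages(train, queries)]
print(inside_domain)  # the second query lies far outside the training range
```

A query compound whose leverage exceeds h* is structurally far from the training set, so a prediction for it is an extrapolation and should be treated with caution.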
