15.08.2018 Views

Abstracts Book - IMRC 2018


• SD1-O020 Invited Talk

DATA-DRIVEN MOLECULAR ENGINEERING OF SOLAR-POWERED WINDOWS

Jacqueline Cole 1,2

1 ISIS Facility, STFC Rutherford Appleton Laboratory, United Kingdom, Channel Islands & Isle of Man. 2 University of Cambridge, Physics, United Kingdom, Channel Islands & Isle of Man.

Large-scale data-mining workflows are increasingly able to successfully predict new chemicals that possess a targeted functionality. The success of such materials discovery approaches is nonetheless contingent upon having the right data source to mine, adequate supercomputing facilities and workflows to enable this mining, and algorithms that suitably encode structure-function relationships as data-mining workflows, which progressively shortlist data toward the prediction of a lead material for experimental validation.

This talk describes how to meet these data science requirements via a large-scale data-mining case study that aims to discover new materials for solar-powered windows. In particular, the presentation shows how to auto-generate large material databases of photovoltaic-relevant experimental information from documents, using natural language processing and machine learning, via our ChemDataExtractor tool [1,2]. Machine learning is then employed to populate any missing experimental data.
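The idea of turning sentences from the literature into database records can be illustrated with a deliberately simplified sketch. It does not use ChemDataExtractor or its API; the regular expression, the `extract_records` helper, the field names, and the example sentences are all hypothetical stand-ins for the far richer natural language processing described in [1].

```python
import re

# Hypothetical pattern for sentences of the form
# "<compound> shows/exhibits/has an absorption maximum at/of <number> nm".
# Real literature needs much more robust parsing than a single regex.
PATTERN = re.compile(
    r"(?P<compound>[A-Z][A-Za-z0-9\-]*)\s+(?:shows|exhibits|has)\s+an\s+"
    r"absorption\s+maximum\s+(?:at|of)\s+(?P<value>\d+(?:\.\d+)?)\s*nm"
)


def extract_records(sentences):
    """Return one record per sentence that reports an absorption maximum."""
    records = []
    for sentence in sentences:
        match = PATTERN.search(sentence)
        if match:
            records.append(
                {
                    "compound": match.group("compound"),
                    "lambda_max_nm": float(match.group("value")),
                }
            )
    return records


# Made-up sentences, for illustration only.
SENTENCES = [
    "Dye-A exhibits an absorption maximum at 532 nm in ethanol.",
    "The synthesis was carried out under nitrogen.",
    "N719 shows an absorption maximum of 535.5 nm.",
]

RECORDS = extract_records(SENTENCES)
for record in RECORDS:
    print(record)
```

Sentences that carry no property value are simply skipped, so the resulting table holds only compound and property pairs; entries with missing fields would then be candidates for the machine-learning step mentioned above.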

A workflow that executes large-scale electronic structure calculations to afford a computational counterpart to these experimental data is then described. These wavefunction calculations are used to extend knowledge beyond experiment. The resulting large database of chemical structures and their optical properties is then mined for materials discovery using algorithms that are encoded forms of structure-function relationships. These molecular design rules progressively filter the parent set of chemicals until a lead candidate appears, which is experimentally validated.
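The progressive filtering described above can be sketched as a chain of predicates applied to a candidate set. This is a minimal illustration, not the workflow used in the talk: the `Candidate` fields, the three rules, their thresholds, and the four example molecules are assumptions made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    lambda_max_nm: float    # wavelength of peak absorption
    extinction: float       # molar extinction coefficient, M^-1 cm^-1
    has_anchor_group: bool  # whether the molecule can bind to a surface


Rule = Tuple[str, Callable[[Candidate], bool]]


def shortlist(candidates: List[Candidate], rules: List[Rule]):
    """Apply each rule in order and record how many candidates survive."""
    remaining = list(candidates)
    history = [("start", len(remaining))]
    for label, predicate in rules:
        remaining = [c for c in remaining if predicate(c)]
        history.append((label, len(remaining)))
    return remaining, history


# Hypothetical design rules; the thresholds are illustrative only.
RULES: List[Rule] = [
    ("visible absorption", lambda c: 400.0 <= c.lambda_max_nm <= 700.0),
    ("strong absorption", lambda c: c.extinction >= 20000.0),
    ("anchoring group", lambda c: c.has_anchor_group),
]

# Made-up parent set.
PARENT_SET = [
    Candidate("A", 520.0, 30000.0, True),
    Candidate("B", 390.0, 45000.0, True),
    Candidate("C", 560.0, 8000.0, True),
    Candidate("D", 610.0, 25000.0, False),
]

LEADS, HISTORY = shortlist(PARENT_SET, RULES)
for label, count in HISTORY:
    print(f"{label}: {count} candidates")
```

Recording the survivor count after each rule makes it easy to see which design rule prunes the set most aggressively, which helps when deciding the order in which cheap and expensive filters should run.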

References:

[1] M. C. Swain, J. M. Cole, ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature, J. Chem. Inf. Model., 2016, 56 (10), pp 1894–1904.

[2] www.chemdataextractor.org
