Towards Automated Materials Characterization: Non-negative Matrix Factorization and Data Clustering

Sasaank Bandi, Simon J. L. Billinge

Solutions to many of the challenges society faces will require new materials. These materials will have to exhibit properties that suit them to conditions that were previously unimaginable. Technological developments have led to experimental tools that can measure these properties in a fraction of a second, inevitably producing large volumes of data [1, 2]. Since a material's properties are determined by its structure, x-ray diffraction is a powerful tool for screening materials for extraordinary properties. To speed up this screening process, it is imperative to develop techniques that analyze diffraction data quickly and autonomously.

Two such techniques were studied during the course of this project to facilitate the rapid characterization of diffraction data: Agglomerative Hierarchical Clustering [3] (AHC) and Non-negative Matrix Factorization [4] (NMF). Both have been shown to have utility in various circumstances.

In the case of AHC, large volumes of data are grouped into k clusters according to their similarity to each other. A researcher may then analyze a few data points from each cluster, saving considerable time over analyzing the entire set while still gaining a good understanding of the dataset as a whole. An example of the results of a clustering experiment is illustrated in Fig. 1.

NMF, on the other hand, breaks the dataset down into representative elements and determines how to linearly combine these elements to recreate every data point in the dataset. This proves useful when analyzing a dataset in which one structure is changing into another (due to heat, pressure, etc.). In this case, NMF would output a representation of both structures and provide information about when the transformation occurred.
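The NMF idea described above can be illustrated with a minimal sketch. It is not the code used in this project: it assumes the multiplicative update rules of Lee and Seung for the Frobenius norm, and it runs on a made-up, non-negative dataset in which one hypothetical pattern (phase_a) gradually turns into another (phase_b). The function name nmf, the peak positions, and the sigmoidal transition are all illustrative assumptions.

```python
import numpy as np

def nmf(X, rank, n_iter=500, eps=1e-10, seed=0):
    """Approximate non-negative X (n_frames x n_points) as W @ H.

    Uses multiplicative updates for the Frobenius norm; eps only
    guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    n_frames, n_points = X.shape
    W = rng.random((n_frames, rank)) + eps  # per-frame weights
    H = rng.random((rank, n_points)) + eps  # representative components
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a transformation experiment (all values assumed).
r = np.linspace(0.0, 10.0, 200)
phase_a = np.exp(-(r - 3.0) ** 2 / 0.1) + 0.5 * np.exp(-(r - 6.0) ** 2 / 0.1)
phase_b = np.exp(-(r - 4.0) ** 2 / 0.1) + 0.5 * np.exp(-(r - 7.5) ** 2 / 0.1)

# Fraction of phase_b rises smoothly from ~0 to ~1 over 50 frames.
t = np.linspace(0.0, 1.0, 50)
frac_b = 1.0 / (1.0 + np.exp(-20.0 * (t - 0.5)))
X = np.outer(1.0 - frac_b, phase_a) + np.outer(frac_b, phase_b)

W, H = nmf(X, rank=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this sketch the rows of H play the role of the representative structures, and each row of W gives the weights needed to rebuild one frame, so plotting the columns of W against frame number would show where the transformation takes place. The factors are only determined up to scaling and ordering, and real data that take negative values would need to be shifted or otherwise made non-negative first.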
Both methods were tested on simulated datasets of 1000 randomly generated pair distribution functions (PDFs) from three distinct crystal structures, as well as on experiments capturing the growth of silver nanoparticles and the temperature-driven structural changes in methylammonium lead bromide (MAPbBr3). Both performed well on the simulated dataset, sorting the data into three structural categories with less than 5% error in a matter of minutes. On the experimental data, the clustering method was not able to sort the data into stable clusters. NMF, however, was able to identify the structures that were present and how they evolved as a function of the independent variable, allowing the experimenter, in principle, to follow the experiment without any prior data analysis.

[1] P. J. Chupas, K. W. Chapman, and P. L. Lee, Applications of an amorphous silicon-based area detector for high-resolution, high-sensitivity and fast time-resolved pair distribution function measurements, J. Appl. Crystallogr. 40, 463 (2007).
[2] P. J. Chupas, M. F. Ciraolo, J. C. Hanson, and C. P. Grey, In situ x-ray diffraction and solid-state NMR study of the fluorination of Al2O3 with HCF2Cl, J. Am. Chem. Soc. 123, 1694 (2001).
[3] G. J. Szekely and M. L. Rizzo, Hierarchical clustering via joint between-within distances: Extending Ward’s minimum variance method, J. Classification 22, 151 (2005).
[4] D. D. Lee and H. S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature 401, 788 (1999).