We develop computational methods to enable predictive materials synthesis, accelerating materials design beyond screening. Using a range of tools, from databases to machine learning, we propose solutions in energy, sustainability, and AI.
Interpreting spectroscopy data is a critical bottleneck in automating chemical research and industrial characterization. In infrared (IR) spectroscopy in particular, identifying compounds in complex, liquid-phase chemical mixtures largely relies on expert knowledge, because variable peak assignments, peak broadening, and peak shifts hinder data-driven methods. We showed that an algorithmic approach can identify components in both simulated and experimental mixture spectra with high accuracy despite nonlinearities in liquid-phase IR data. We applied the method to automatically interpret a large dataset of simulated liquid-phase IR spectra, as well as experimental spectra, correctly identifying the components of nearly all samples within a blind study. This work provides tools and data to advance automated chemical laboratories through algorithmic interpretation of liquid-phase IR spectra of mixtures. [preprint]
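To illustrate why peak shifts are a problem for naive spectral matching, the sketch below compares a reference peak against a copy shifted by a few wavenumbers. It is a toy example and not the method from the preprint: the Gaussian line shape, the grid, the peak position, and the function names `gaussian_peak` and `cosine_similarity` are all assumptions made for illustration.

```python
import math


def gaussian_peak(grid, center, width):
    """Toy absorption band: a unit-height Gaussian on a wavenumber grid."""
    return [math.exp(-((x - center) ** 2) / (2 * width ** 2)) for x in grid]


def cosine_similarity(a, b):
    """Cosine similarity between two spectra sampled on the same grid."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical grid from 1600 to 1800 cm^-1 in steps of 1 cm^-1.
grid = [1600 + i for i in range(201)]

# Reference band and the same band shifted by 10 cm^-1 (assumed values).
reference = gaussian_peak(grid, center=1700, width=5)
shifted = gaussian_peak(grid, center=1710, width=5)

unshifted_score = cosine_similarity(reference, reference)
shifted_score = cosine_similarity(reference, shifted)

print(f"unshifted: {unshifted_score:.3f}, shifted: {shifted_score:.3f}")
```

In this toy setting, a shift of only twice the peak width already cuts the similarity score substantially, which is one way to see why direct comparison against pure-component references can struggle in liquid-phase mixtures.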
Generative models show ample promise for materials design, but face severe limitations for amorphous materials because of the structural complexity of these systems. We developed a denoising diffusion framework that generates reliable atomistic structures across diverse amorphous systems and processing conditions while being up to three orders of magnitude faster than classical molecular dynamics simulations. Our model enables a range of applications in amorphous materials research, such as performing fracture simulations with large, slow-cooled structures, generating mesoporous structures, and augmenting experimental datasets with synthetic data. This work provides a roadmap on how to use, validate, and develop generative models for amorphous materials. [paper] [blog post] [data] [code]
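For readers unfamiliar with denoising diffusion, the sketch below shows the forward (noising) step that such models learn to invert, applied to a list of atomic coordinates. It follows the generic DDPM-style formulation and is not taken from the framework in the paper: the linear noise schedule, the function names, and the omission of periodic boundary conditions are all simplifying assumptions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the values are common defaults, not the paper's."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), i.e. the fraction of signal kept at step t."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def add_noise(positions, alpha_bar, rng):
    """Return noisy positions x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, plus eps itself.

    Periodic boundary conditions are ignored in this sketch.
    """
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    noisy, eps_all = [], []
    for atom in positions:
        eps = [rng.gauss(0.0, 1.0) for _ in atom]
        noisy.append([signal * c + noise * e for c, e in zip(atom, eps)])
        eps_all.append(eps)
    return noisy, eps_all


# Hypothetical three-atom configuration (coordinates in arbitrary units).
positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0]]
alpha_bars = cumulative_alphas(linear_beta_schedule(num_steps=1000))

rng = random.Random(0)
noisy_positions, eps = add_noise(positions, alpha_bars[500], rng)
```

A denoising model is typically trained to predict `eps` from `noisy_positions` and the step index; generation then runs the process in reverse, starting from pure noise.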
Machine learning interatomic potentials (MLIPs) can bypass limitations of density functional theory (DFT) approaches regarding computational cost and scaling, but MLIPs often require heuristics, from training set selection to uncertainty quantification (UQ). By proposing an atomistic information theory, we showed that the information entropy from a distribution of local descriptors can be used in a range of problems in atomistic simulations, such as explaining trends in MLIP errors, rationalizing dataset analysis/compression, providing a robust UQ estimate for ML-driven simulations, and detecting outliers in atomistic simulations. We also proposed parallels between thermodynamic and information entropy and connected our information-theoretical approach to nucleation and growth. The approach was demonstrated in a number of applications and offers a general perspective on how materials theory, computation, and machine learning can be used to solve a range of problems in materials science. [paper] [code] [data]
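As a rough illustration of what "information entropy from a distribution of local descriptors" can mean in practice, the sketch below estimates entropy with a Gaussian kernel density over descriptor vectors. This is one common kernel-based estimator, chosen here as an assumption; the exact estimator, descriptors, and bandwidth used in the paper may differ, and the names `descriptor_entropy` and `novelty` are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian similarity between two descriptor vectors (1 when identical)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * bandwidth ** 2))


def descriptor_entropy(descriptors, bandwidth=1.0):
    """Entropy estimate H = -(1/n) * sum_i log[(1/n) * sum_j K(x_i, x_j)].

    Near-duplicate environments give H close to 0; n well-separated
    environments give H close to log(n).
    """
    n = len(descriptors)
    total = 0.0
    for xi in descriptors:
        density = sum(gaussian_kernel(xi, xj, bandwidth) for xj in descriptors) / n
        total += math.log(density)
    return -total / n


def novelty(point, descriptors, bandwidth=1.0):
    """Hypothetical outlier score: -log of the kernel density at a new point."""
    density = sum(gaussian_kernel(point, xj, bandwidth) for xj in descriptors)
    density /= len(descriptors)
    return math.inf if density == 0.0 else -math.log(density)


# Toy descriptor sets (made-up numbers, not real atomic environments).
redundant = [[0.0, 0.0]] * 4
diverse = [[0.0], [100.0], [200.0]]

print(descriptor_entropy(redundant))  # repeated environments carry no new information
print(descriptor_entropy(diverse))    # distinct environments raise the entropy
```

Under this estimator, a dataset of repeated environments has low entropy and is a candidate for compression, while a new environment with a high `novelty` score lies far from the existing data, which is the kind of signal that could flag outliers or unreliable predictions.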