Mutual Information-Informed Novelty Estimation Of Materials Along Chemical And Structural Axes

A parameter-free method for estimating material novelty is introduced, leveraging mutual information to analyze inter-material distances along chemical and structural axes. This approach derives data-driven weight functions from the mutual information profile, enabling the computation of quantitative novelty scores based on local density without requiring preset cutoffs. The methodology is validated on diverse materials datasets, demonstrating its effectiveness in identifying and differentiating chemical and structural novelty to guide materials discovery.

For detailed methodology, validation results, and theoretical background, see the peer-reviewed paper: "Mutual Information-Informed Novelty Estimation Of Materials Along Chemical And Structural Axes" published in Digital Discovery (2025).
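The density-based scoring summarized above can be illustrated with a minimal sketch. This is not the MINov implementation: in MINov the weight function is derived from the mutual information profile, whereas here a hypothetical exponential decay (`exp_weight`) stands in for it. The helper names and the normalization to a 0 to 1 novelty score are also assumptions made only for illustration.

```python
import math


def exp_weight(distance, scale=1.0):
    # Hypothetical stand-in weight: decays smoothly with distance.
    # MINov instead derives its weights from the MI profile.
    return math.exp(-distance / scale)


def local_density(dist_matrix, weight_fn):
    """Sum the weights of all other items for each item (self excluded)."""
    n = len(dist_matrix)
    return [
        sum(weight_fn(dist_matrix[i][j]) for j in range(n) if j != i)
        for i in range(n)
    ]


def novelty_from_density(densities):
    """Map densities to [0, 1]; the densest item scores 0 (assumed scaling)."""
    d_max = max(densities)
    if d_max == 0:
        return [1.0] * len(densities)
    return [1.0 - d / d_max for d in densities]


# Toy 1-D "materials": three clustered points and one isolated point
points = [0.0, 0.1, 0.2, 5.0]
dist_matrix = [[abs(a - b) for b in points] for a in points]

densities = local_density(dist_matrix, exp_weight)
novelty = novelty_from_density(densities)
print(novelty)
```

Because every neighbor contributes a smoothly decaying weight, no hard distance cutoff is needed: the isolated point receives the lowest density and therefore the highest novelty score.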

Installation & Setup

# Clone the repository
git clone https://github.com/AndrewFalkowski/MINov.git
cd MINov

# Create and activate conda environment
conda env create -f environment.yml
conda activate MINOV

Dependencies

All required libraries (numpy, pandas, scipy, scikit-learn, matplotlib, matminer, pymatgen) and their versions are specified in environment.yml.

Usage

Core functionality is provided through the scripts contained in the MINOV folder. Novelty can be computed over a DataFrame of pymatgen structure objects by calling the compute_MI_novelty function, as shown in the code snippet below. Arguments are available for selecting specific distance metrics, loading precomputed internal or external distance metrics, and specifying save paths.

from MINOV.novelty import compute_MI_novelty

# Where 'mat_data' is a pandas DataFrame with 'structure' and 'formula' columns
# containing pymatgen structure objects and formula strings, respectively

data, mi_data = compute_MI_novelty(
    data=mat_data, # df of pymatgen structure objects and formulae
    compute_metrics=["lostop"], # list of distance metrics to compute
    precomputed_metrics={"elmd": "perovskite_dataset_elmd_dm.npy"}, # load precomputed
    data_dir="precomputed", # path to folder with precomputed metrics
    data_prefix="perovskite_dataset", # prefix for labeling purposes
)

# outputs:
# data - a df containing material information and computed densities for each metric
# mi_data - a dictionary with computed MI profile data for each metric

Further usage examples are available in the Jupyter notebooks described below.

Demonstrations on Materials Datasets

Three Jupyter notebooks demonstrate the application of this method:

perovskite_novelty.ipynb: Shows the method applied to a controlled dataset containing three distinct perovskite crystal systems: cubic, tetragonal, and orthorhombic. The data for this notebook is available in data/perovskite_dataset.
diverse_novelty.ipynb: Demonstrates the method using a structurally diverse dataset of materials with varying degrees of similarity. Shows how the method distinguishes between different types of novelty. The data for this notebook is available in data/diverse_dataset.
Li_novelty.ipynb: Applies the method to a selection of lithium-containing compounds from the GNoME dataset, assessing their novelty relative to known materials in the Materials Project (MP) database. The data for this notebook is available in data/MP_Li_dataset and data/GNOME_Li_dataset.

NOTE: The GNoME and MP datasets were pulled from v2023.11.1 of the database. Because the database has since changed, all structure files used in the analysis are provided in the /data folder.

Citing MINov

@article{falkowski2025mutual,
  title={Mutual Information Informed Novelty Estimation of Materials Along Chemical and Structural Axes},
  author={Falkowski, Andrew R and Sparks, Taylor D},
  journal={Digital Discovery},
  year={2025},
  publisher={RSC}
}
