
NanoChef: AI Framework for Simultaneous Optimization of Synthesis Sequences and Reaction Conditions at Autonomous Laboratories

NanoChef is an AI framework for virtual experimentation and autonomous materials discovery. With support for MatBERT embeddings, neural surrogate modeling, and high-throughput virtual experiments, NanoChef enables intelligent exploration of complex design spaces by jointly optimizing synthesis order and reaction conditions. Whether you are optimizing synthesis conditions or testing surrogate models, NanoChef provides a modular and extensible platform for scientific automation.


📦 Installation

Requirements

  • Python 3.9+
  • See requirements.txt for full dependency list.

NanoChef Setup

git clone https://github.com/KIST-CSRC/NanoChef.git
cd NanoChef
conda create -n NanoChef python=3.9
conda activate NanoChef
pip install -r requirements.txt
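
To quickly check that the core deep-learning dependencies were installed (this assumes torch and transformers are pinned in requirements.txt; adjust the import list if your environment differs):

python -c "import torch, transformers; print(torch.__version__, transformers.__version__)"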

Windows users can install the dependencies using the provided .bat file:

install_package_with_git.bat

MatBERT [2] Setup

We generate reagent vectors using MatBERT, a pretrained materials-science language model.

To use MatBERT, download the following files into a folder:

export MODEL_PATH="Your path"
mkdir $MODEL_PATH/matbert-base-cased $MODEL_PATH/matbert-base-uncased

curl -# -o $MODEL_PATH/matbert-base-cased/config.json https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_cased_30522_wd/config.json
curl -# -o $MODEL_PATH/matbert-base-cased/vocab.txt https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_cased_30522_wd/vocab.txt
curl -# -o $MODEL_PATH/matbert-base-cased/pytorch_model.bin https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_cased_30522_wd/pytorch_model.bin

curl -# -o $MODEL_PATH/matbert-base-uncased/config.json https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_uncased_30522_wd/config.json
curl -# -o $MODEL_PATH/matbert-base-uncased/vocab.txt https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_uncased_30522_wd/vocab.txt
curl -# -o $MODEL_PATH/matbert-base-uncased/pytorch_model.bin https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_uncased_30522_wd/pytorch_model.bin

The following folders will then be created:

NanoChef/
...
├── matbert-base-cased/
├── matbert-base-uncased/
...
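
As a rough sketch of how the downloaded checkpoint can be turned into a reagent vector (the reagent name, mean pooling, and local path here are illustrative assumptions, not necessarily the exact preprocessing NanoChef uses internally):

# Sketch: embed a reagent name with the downloaded MatBERT checkpoint
import torch
from transformers import BertModel, BertTokenizer

MODEL_PATH = "matbert-base-cased"  # folder created by the curl commands above

tokenizer = BertTokenizer.from_pretrained(MODEL_PATH, do_lower_case=False)
model = BertModel.from_pretrained(MODEL_PATH)
model.eval()

inputs = tokenizer("silver nitrate", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the token embeddings into a single reagent vector
reagent_vector = outputs.last_hidden_state.mean(dim=1).squeeze(0)
print(reagent_vector.shape)  # torch.Size([768])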

Olympus [3] Setup

Our virtual experiments are based on Olympus environments, which provide a diverse set of virtual spaces.

Olympus can be installed with pip:

pip install olymp

The package can also be installed via conda:

conda install -c conda-forge olymp

Finally, the package can be built from source:

git clone https://github.com/aspuru-guzik-group/olympus.git
cd olympus
python setup.py develop

The following folders will then be created:

NanoChef/
...
├── case_studies/
├── cifar/
├── docs/
├── examples/
├── my_new_emulator/
├── src/
...
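
As a minimal sketch of how an Olympus benchmark surface (the kind of virtual space NanoChef optimizes over) can be evaluated; the exact Surface arguments may differ between Olympus versions, so treat this as illustrative:

# Sketch: evaluate two benchmark surfaces at a single point
from olympus.surfaces import Surface

dejong = Surface(kind="Dejong", param_dim=2)
ellipsoid = Surface(kind="HyperEllipsoid", param_dim=2)

point = [0.5, 0.5]
print("Dejong:", dejong.run(point))
print("HyperEllipsoid:", ellipsoid.run(point))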

🚀 Quick Start of Virtual Experiments

Generate a Job Script

The following table describes the configuration keys used in the virtual experiment JSON config file (an illustrative example config follows the table):

Key Description
subject Name or label of the virtual experiment run.
description Optional description or notes about the run.
log_level Verbosity of logging (e.g., DEBUG, INFO, WARNING).
model_name Name of the model used (e.g., NN+Gamma).
total_surfaces List of benchmark functions to be optimized (e.g., [["Dejong", "HyperEllipsoid"], ["Dejong", "Denali"]]).
num_variables Number of continuous input variables (dimensionality of the continuous search space).
initial_n_sample Number of initial random samples before active learning begins.
n_points Number of grid points along each variable's range (e.g., n_points=101 gives 100 intervals per variable).
batch_size Number of samples selected in each batch.
ps_dim Dimension of the positional encoding (e.g., ps_dim=4 means each sequence vector is a 4-dimensional vector).
output_dim Output dimension of the prediction (usually 1 for scalar loss).
nn_n_hidden Number of hidden neurons in the neural network.
kappa_list List of exploration-exploitation trade-off parameters (kappa values for UCB, the Upper Confidence Bound acquisition).
seed_num Random seed for reproducibility.
reagent_list List of chemical reagents to be used in the virtual experiment.
rgn_vec_onoff Boolean flag to enable or disable reagent vectors from the pretrained MatBERT model.
n_search_epochs Number of active search (optimization) iterations.
n_train_epochs Number of epochs for training the surrogate model.
lr Learning rate for training the neural network.
patience Number of epochs to wait before early stopping if no improvement.
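
For reference, a hypothetical test.json sketch using these keys might look like the example below; every value is an illustrative placeholder, and exact types or defaults may differ from the shipped config files:

{
  "subject": "virtual_test",
  "description": "demo run on Olympus surfaces",
  "log_level": "INFO",
  "model_name": "NN+Gamma",
  "total_surfaces": [["Dejong", "HyperEllipsoid"], ["Dejong", "Denali"]],
  "num_variables": 2,
  "initial_n_sample": 10,
  "n_points": 101,
  "batch_size": 5,
  "ps_dim": 4,
  "output_dim": 1,
  "nn_n_hidden": 64,
  "kappa_list": [1.0, 2.5, 5.0],
  "seed_num": 42,
  "reagent_list": ["AgNO3", "NaBH4", "PVP"],
  "rgn_vec_onoff": true,
  "n_search_epochs": 50,
  "n_train_epochs": 500,
  "lr": 0.001,
  "patience": 20
}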

Run Examples

  • CPU version
python virtual_experiments.py --path config/20250628/test.json --cuda cpu
  • GPU version
python virtual_experiments.py --path config/20250628/test.json --cuda cuda:0

📁 Project Structure

NanoChef/
├── BaseUtils/
├── case_studies/
├── cifar/
├── config/
├── docs/
├── examples/
├── Log/
├── matbert-base-cased/
├── matbert-base-uncased/
├── my_new_emulator/
├── Sequence/
├── src/
├── install_package_with_git.bat
├── latin_hypercube_sampling_test.py
├── NanoChefModule.py
├── module_node.py
├── requirements.txt
├── virtual_experiments.py
├── virtual_space_image.py
├── virtual_test/
├── visualization_data.py
└── README.md

🔧 Key Modules

  • NanoChefModule.py: AI unit for recipe recommendations in real chemical experiments
  • module_node.py: Module for real chemical experiments, connected with OCTOPUS [4]
  • Sequence: Contains the architecture of NanoChef
  • virtual_experiments.py: Closed-loop virtual experiment simulation
  • virtual_space_image.py: Visualization of latent variable space

📊 Visualization

Visualization of Virtual Spaces

You can generate visualizations of the virtual spaces using:

python virtual_space_image.py

The resulting images of the virtual spaces, together with the Spearman coefficient values for virtual space combinations, can help you organize virtual space combinations for virtual experiments.
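
As an illustrative sketch of the Spearman comparison (not the internals of virtual_space_image.py), the snippet below rank-correlates two simplified stand-in surfaces evaluated on a shared grid:

# Sketch: Spearman rank correlation between two surfaces on a shared grid
import numpy as np
from scipy.stats import spearmanr

grid = np.linspace(-1.0, 1.0, 101)
xx, yy = np.meshgrid(grid, grid)

dejong_like = xx**2 + yy**2            # sphere-like stand-in (Dejong)
ellipsoid_like = xx**2 + 5.0 * yy**2   # axis-weighted stand-in (HyperEllipsoid)

rho, pval = spearmanr(dejong_like.ravel(), ellipsoid_like.ravel())
print(f"Spearman rho = {rho:.3f} (p = {pval:.3g})")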

Visualization of the Performance of Virtual Experiments

You can visualize the outputs of virtual experiments using:

python visualization_data.py
  • def visualization_model_performance
  • def visualization_scatter
  • def create_gif

🚀 Quick Start of Real Chemical Experiments

Activate Module Node of NanoChef

python module_node.py

Activate OCTOPUS [4]

python master_node.py

Login/Submit Job Script via OCTOPUS

qsub {jobscript_dirpath}/{jobscript_name} real

🙋 Author

Developed by Hyuk Jun Yoo at Korea Institute of Science and Technology (KIST)


🙏 Acknowledgments

  • MatBERT for pretrained materials-aware BERT models
  • Olympus for providing the virtual spaces used in the virtual experiments
  • OCTOPUS for orchestrating module nodes as the central management system

Reference

For more details, see the papers below. Please cite us if you use our model in your research work:

[1] Yoo, Hyuk Jun, et al. "NanoChef: AI Framework for Simultaneous Optimization of Synthesis Sequences and Reaction Conditions in Autonomous Laboratories." ChemRxiv (2025).

[2] Trewartha, Amalie, et al. "Quantifying the advantage of domain-specific pre-training on named entity recognition tasks in materials science." Patterns 3.4 (2022).

[3] Häse, Florian, et al. "Olympus: a benchmarking framework for noisy optimization and experiment planning." Machine Learning: Science and Technology 2.3 (2021): 035021.

[4] Yoo, Hyuk Jun, et al. "OCTOPUS: operation control system for task optimization and job parallelization via a user-optimal scheduler." Nature communications 15.1 (2024): 9669.
