NanoChef: AI Framework for Simultaneous Optimization of Synthesis Sequences and Reaction Conditions at Autonomous Laboratories
NanoChef is an AI framework for virtual experimentation and autonomous materials discovery. With support for MatBERT embeddings, neural surrogate modeling, and high-throughput virtual experiments, NanoChef explores complex design spaces by optimizing the synthesis order and the reaction conditions simultaneously. Whether you're optimizing synthesis conditions or testing surrogate models, NanoChef provides a modular and extensible platform for scientific automation.
- Python 3.9+
- See requirements.txt for the full dependency list.
git clone https://github.com/KIST-CSRC/NanoChef.git
cd NanoChef
conda create -n NanoChef python=3.9
conda activate NanoChef
pip install -r requirements.txt
Windows users can install the packages with the provided .bat file:
install_package_with_git.bat
MatBERT Setup
We generate reagent vectors using MatBERT, a pretrained language model for materials science.
To use MatBERT, download these files into a folder:
export MODEL_PATH="Your path"
mkdir -p "$MODEL_PATH/matbert-base-cased" "$MODEL_PATH/matbert-base-uncased"
curl -# -o $MODEL_PATH/matbert-base-cased/config.json https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_cased_30522_wd/config.json
curl -# -o $MODEL_PATH/matbert-base-cased/vocab.txt https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_cased_30522_wd/vocab.txt
curl -# -o $MODEL_PATH/matbert-base-cased/pytorch_model.bin https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_cased_30522_wd/pytorch_model.bin
curl -# -o $MODEL_PATH/matbert-base-uncased/config.json https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_uncased_30522_wd/config.json
curl -# -o $MODEL_PATH/matbert-base-uncased/vocab.txt https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_uncased_30522_wd/vocab.txt
curl -# -o $MODEL_PATH/matbert-base-uncased/pytorch_model.bin https://cedergroup-share.s3-us-west-2.amazonaws.com/public/MatBERT/model_2Mpapers_uncased_30522_wd/pytorch_model.bin
The following folders are then created:
NanoChef/
...
├── matbert-base-cased/
├── matbert-base-uncased/
...
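Once the files are in place, a reagent vector can be read out of the checkpoint. The sketch below is one possible approach, not necessarily how NanoChef does it: it loads the cased model with the Hugging Face `transformers` classes `BertTokenizerFast` and `BertModel`, then averages the token vectors (mean pooling, an assumption) to get one vector per reagent name. The helper names `mean_pool` and `embed_reagents` and the reagent strings in the comment are hypothetical.

```python
import numpy as np


def mean_pool(token_vectors, attention_mask):
    """Average token vectors over non-padding positions (one vector per input)."""
    token_vectors = np.asarray(token_vectors, dtype=float)      # (batch, tokens, hidden)
    mask = np.asarray(attention_mask, dtype=float)[..., None]   # (batch, tokens, 1)
    return (token_vectors * mask).sum(axis=1) / mask.sum(axis=1)


def embed_reagents(names, model_dir):
    """Embed reagent names with a local MatBERT checkpoint.

    Requires torch and transformers; imported here so the pooling helper
    above can be used without them.
    """
    import torch
    from transformers import BertModel, BertTokenizerFast

    # do_lower_case=False matches the cased checkpoint.
    tokenizer = BertTokenizerFast.from_pretrained(model_dir, do_lower_case=False)
    model = BertModel.from_pretrained(model_dir)
    model.eval()

    batch = tokenizer(names, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    return mean_pool(hidden.numpy(), batch["attention_mask"].numpy())


# Example call (hypothetical reagent names; needs the downloaded files):
# vectors = embed_reagents(["AgNO3", "NaBH4"], "matbert-base-cased")
```

Mean pooling is only one of several ways to reduce token vectors to a fixed-length vector; using the `[CLS]` token is a common alternative.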
Olympus Setup
Our virtual experiments are based on Olympus, which provides a large and diverse set of virtual spaces.
Olympus can be installed with pip:
pip install olymp
The package can also be installed via conda:
conda install -c conda-forge olymp
Finally, the package can be built from source:
git clone https://github.com/aspuru-guzik-group/olympus.git
cd olympus
python setup.py develop
The following folders are then created:
NanoChef/
...
├── case_studies/
├── cifar/
├── docs/
├── examples/
├── my_new_emulator/
├── src/
...
The following table describes the configuration keys used in the virtual experiment JSON config file:
Key | Description |
---|---|
subject | Name or label of the virtual experiment run. |
description | Optional description or notes about the run. |
log_level | Verbosity of logging (e.g., DEBUG, INFO, WARNING). |
model_name | Name of the model used (e.g., NN+Gamma). |
total_surfaces | List of benchmark function combinations to be optimized (e.g., [["Dejong", "HyperEllipsoid"], ["Dejong", "Denali"]]). |
num_variables | Number of continuous input variables (dimension of the continuous space). |
initial_n_sample | Number of initial random samples before active learning begins. |
n_points | Number of grid points into which the range of each variable is divided (e.g., n_points=101 gives 100 intervals per variable). |
batch_size | Number of samples selected in each batch. |
ps_dim | Dimension of the positional encoding (e.g., ps_dim=4 means each sequence vector is 4-dimensional). |
output_dim | Output dimension of the prediction (usually 1 for scalar loss). |
nn_n_hidden | Number of hidden neurons in the neural network. |
kappa_list | List of kappa values for the UCB (Upper Confidence Bound) acquisition, which control the exploration-exploitation trade-off. |
seed_num | Random seed for reproducibility. |
reagent_list | List of chemical reagents to be used in the virtual experiment. |
rgn_vec_onoff | Boolean flag to enable or disable reagent vector from MatBERT pretrained model. |
n_search_epochs | Number of active search (optimization) iterations. |
n_train_epochs | Number of epochs for training the surrogate model. |
lr | Learning rate for training the neural network. |
patience | Number of epochs to wait before early stopping if no improvement. |
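To make the keys above concrete, here is a minimal sketch that assembles a config with every documented key, builds the per-variable grid implied by `n_points`, and shows how different `kappa_list` values change a generic UCB ranking. All values are illustrative assumptions except those quoted in the table (`NN+Gamma`, the `total_surfaces` example, `n_points=101`, `ps_dim=4`). The reagent names, the `[0, 1]` variable range, and the `ucb_scores` helper are hypothetical, and the maximizing form of UCB used here may differ from what NanoChef implements.

```python
import json

import numpy as np

# Keys follow the table above; values are illustrative, not a shipped config.
config = {
    "subject": "example_run",
    "description": "illustrative configuration",
    "log_level": "INFO",
    "model_name": "NN+Gamma",
    "total_surfaces": [["Dejong", "HyperEllipsoid"], ["Dejong", "Denali"]],
    "num_variables": 2,
    "initial_n_sample": 10,
    "n_points": 101,
    "batch_size": 4,
    "ps_dim": 4,
    "output_dim": 1,
    "nn_n_hidden": 64,
    "kappa_list": [0.5, 2.0],
    "seed_num": 0,
    "reagent_list": ["AgNO3", "NaBH4"],  # hypothetical reagent names
    "rgn_vec_onoff": True,
    "n_search_epochs": 20,
    "n_train_epochs": 100,
    "lr": 0.001,
    "patience": 10,
}
config_text = json.dumps(config, indent=2)  # text that could be saved as a .json file

# n_points=101 yields 101 grid values, i.e. 100 equal intervals per variable.
# The normalized range [0, 1] is an assumption.
grid = np.linspace(0.0, 1.0, config["n_points"])


def ucb_scores(mean, std, kappa):
    """Generic UCB score: predicted mean plus kappa times predicted uncertainty."""
    return np.asarray(mean) + kappa * np.asarray(std)


# Made-up surrogate predictions for three candidate recipes.
mean = np.array([0.2, 0.5, 0.4])
std = np.array([0.30, 0.05, 0.10])

# Index of the top-ranked candidate for each kappa.
best_per_kappa = {
    kappa: int(np.argmax(ucb_scores(mean, std, kappa)))
    for kappa in config["kappa_list"]
}
```

With these toy numbers, the small kappa favors the candidate with the highest predicted mean, while the large kappa favors the most uncertain candidate, which is the trade-off `kappa_list` is meant to control.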
- cpu version
python virtual_experiments.py --path config/20250628/test.json --cuda cpu
- gpu version
python virtual_experiments.py --path config/20250628/test.json --cuda cuda:0
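The two commands differ only in the device string passed to `--cuda`. As a rough illustration of how a script could consume these two flags, the sketch below parses them with `argparse` and loads the JSON file. This is an assumption about the interface, not the actual code of `virtual_experiments.py`; the `parse_args` and `load_config` helpers and the `cpu` default are hypothetical.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the two flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Run a virtual experiment (sketch)")
    parser.add_argument("--path", required=True, help="path to the JSON config file")
    parser.add_argument(
        "--cuda", default="cpu", help='device string, e.g. "cpu" or "cuda:0"'
    )
    return parser.parse_args(argv)


def load_config(path):
    """Read the JSON config file into a dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Parsing the GPU command line from above, without touching the file system.
args = parse_args(["--path", "config/20250628/test.json", "--cuda", "cuda:0"])
```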
NanoChef/
├── BaseUtils/
├── case_studies/
├── cifar/
├── config/
├── docs/
├── examples/
├── Log/
├── matbert-base-cased/
├── matbert-base-uncased/
├── my_new_emulator/
├── Sequence/
├── src/
├── install_package_with_git.bat
├── latin_hypercube_sampling_test.py
├── NanoChefModule.py
├── module_node.py
├── requirements.txt
├── virtual_experiments.py
├── virtual_space_image.py
├── virtual_test/
├── visualization_data.py
└── README.md
- NanoChefModule.py: AI unit for recipe recommendations in real chemical experiments
- module_node.py: Module for real chemical experiments, connected with OCTOPUS
- Sequence: Contains the architecture of NanoChef
- virtual_experiments.py: Closed-loop virtual experiment simulation
- virtual_space_image.py: Visualization of latent variable space
You can generate images of the virtual spaces using:
python virtual_space_image.py
The resulting images and the Spearman coefficient values computed for each combination of virtual spaces can help you choose which combinations to use in virtual experiments.
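For readers unfamiliar with the Spearman coefficient, the sketch below shows how a rank correlation between two surfaces evaluated on the same grid can be computed. The three one-dimensional surfaces are hypothetical stand-ins, not Olympus surfaces, and the `spearman` helper is a simplified version that assumes no tied values.

```python
import numpy as np


def spearman(a, b):
    """Spearman rank correlation; assumes no ties (ties would need average ranks)."""
    rank_a = np.argsort(np.argsort(a)).astype(float)
    rank_b = np.argsort(np.argsort(b)).astype(float)
    # Pearson correlation of the ranks equals the Spearman coefficient.
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


# Toy surfaces on a shared grid (hypothetical stand-ins for virtual spaces).
x = np.linspace(0.0, 1.0, 101)
surface_a = x ** 2        # increasing in x
surface_b = np.sqrt(x)    # also increasing, with a different shape
surface_c = 1.0 - x       # decreasing in x

rho_ab = spearman(surface_a, surface_b)  # same ordering of grid points
rho_ac = spearman(surface_a, surface_c)  # reversed ordering of grid points
```

A coefficient near 1 means two spaces rank the grid points in almost the same order, and a coefficient near -1 means they rank them in opposite order. For data with ties, `scipy.stats.spearmanr` is a more robust choice.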
You can visualize the outputs of virtual experiments using:
python visualization_data.py
- def visualization_model_performance
- def visualization_scatter
- def create_gif
python module_node.py
Activate OCTOPUS
python master_node.py
qsub {jobscript_dirpath}/{jobscript_name} real
Developed by Hyuk Jun Yoo at Korea Institute of Science and Technology (KIST)
- MatBERT for pretrained materials-aware BERT models
- Olympus for providing virtual spaces of virtual experiments
- OCTOPUS for orchestrating module nodes as the central management system
For more details, see the paper below.
Please cite us if you are using our model in your research work: