ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization

Official implementation of ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization. Please feel free to email us or open an issue on this repository and we will get back to you as soon as possible.

CC BY-NC-SA 4.0

Setup

  1. Create a virtual environment (we use conda)
  2. Activate the virtual environment
  3. Install the dependencies
    conda create --name admire_bayesopt python
    conda activate admire_bayesopt
    pip install -r requirements.txt
    

This implementation is based on the official BoTorch tutorial Multi-fidelity Bayesian optimization with discrete fidelities using KG. We follow its comparison between standard BayesOpt and multi-fidelity BayesOpt (MFBayesOpt).
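
For orientation, below is a minimal sketch of the single-fidelity BayesOpt baseline loop in BoTorch: fit a GP surrogate over mixture weights, maximize Expected Improvement, and evaluate the proposed mixture. It assumes a recent BoTorch release that provides fit_gpytorch_mll; the evaluate callback is hypothetical (it stands in for training a model on the proposed mixture), and the repository's scripts may differ in detail.

    import torch
    from botorch.models import SingleTaskGP
    from botorch.fit import fit_gpytorch_mll
    from botorch.acquisition import ExpectedImprovement
    from botorch.optim import optimize_acqf
    from gpytorch.mlls import ExactMarginalLogLikelihood

    def run_bayesopt(train_X, train_Y, bounds, evaluate, n_iters=10):
        # train_X: (n, d) mixture weights; train_Y: (n, 1) scores (e.g. negative validation loss)
        for _ in range(n_iters):
            gp = SingleTaskGP(train_X, train_Y)
            fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
            acqf = ExpectedImprovement(gp, best_f=train_Y.max())
            cand, _ = optimize_acqf(acqf, bounds=bounds, q=1, num_restarts=10, raw_samples=256)
            y_new = evaluate(cand)  # hypothetical callback: train on the proposed mixture, return a (1, 1) score
            train_X = torch.cat([train_X, cand])
            train_Y = torch.cat([train_Y, y_new])
        return train_X, train_Y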

Data Preparation

We open-source the data mixture dataset admire_ift_runs, and we additionally use the Pile mixture dataset regmix-data from RegMix. We run experiments on different mixtures with Qwen2.5 0.5B / 3B / 7B.
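
Concretely, a data mixture here is a weight vector over training domains that sums to 1, and BayesOpt searches over this simplex. The snippet below is purely illustrative: the domain names and the Dirichlet sampling are assumptions, not the exact schema of either dataset.

    import torch

    # Illustrative only: domain names are placeholders, not the actual dataset schema.
    domains = ["arxiv", "github", "stackexchange", "wikipedia", "books"]
    weights = torch.distributions.Dirichlet(torch.ones(len(domains))).sample()
    mixture = dict(zip(domains, weights.tolist()))
    assert abs(sum(mixture.values()) - 1.0) < 1e-6  # mixture weights sum to 1
    print(mixture)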

Running experiments

Choose the index of the target domain with --idx.
Choose the dataset (admire_ift_runs or pile) with --dataset.
Results are saved in saved_logs.

Training and recommending with BayesOpt on admire_ift_runs.

python bayesopt_admire_ift_runs.py --idx -3  # average of ood+id

Training and recommending with BayesOpt on the Pile

python bayesopt_thepile.py --idx -1  # average

Training and recommending with MFBayesOpt on admire_ift_runs / the Pile

python mfbayesopt_maxvalue.py --dataset admire_ift_runs --idx -3  # average of ood+id
python mfbayesopt_maxvalue.py --dataset pile --idx -1  # average
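
The script name suggests a multi-fidelity max-value entropy acquisition, where the fidelity dimension could, for example, encode model scale (0.5B / 3B / 7B). The sketch below shows how such an acquisition is typically constructed in BoTorch; it assumes a recent BoTorch API (SingleTaskMultiFidelityGP with data_fidelities, qMultiFidelityMaxValueEntropy) and is not the repository's exact code.

    import torch
    from botorch.models.gp_regression_fidelity import SingleTaskMultiFidelityGP
    from botorch.fit import fit_gpytorch_mll
    from botorch.acquisition.max_value_entropy_search import qMultiFidelityMaxValueEntropy
    from botorch.acquisition.utils import project_to_target_fidelity
    from gpytorch.mlls import ExactMarginalLogLikelihood

    def build_mf_acqf(train_X, train_Y, bounds):
        # Assumed encoding: the last input column is the fidelity parameter (e.g. model scale
        # mapped to [0, 1]); the mixture weights occupy the remaining columns.
        d = train_X.shape[-1]
        gp = SingleTaskMultiFidelityGP(train_X, train_Y, data_fidelities=[d - 1])
        fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
        # Discrete candidate set over the search space for the max-value entropy approximation
        candidate_set = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(1000, d)
        # Project candidates to the target (full) fidelity when estimating the maximum value
        project = lambda X: project_to_target_fidelity(X, target_fidelities={d - 1: 1.0})
        return qMultiFidelityMaxValueEntropy(gp, candidate_set=candidate_set, project=project)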

Citation

Please use the following to cite this work:

@misc{chen2025admirebayesoptaccelerateddatamixture,
      title={ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization}, 
      author={Shengzhuang Chen and Xu Ouyang and Michael Arthur Leopold Pearce and Thomas Hartvigsen and Jonathan Richard Schwarz},
      year={2025},
      eprint={2508.11551},
      archivePrefix={arXiv},
      primaryClass={stat.ML},
      url={https://arxiv.org/abs/2508.11551}, 
}

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

CC BY-NC-SA 4.0
