Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning (PDF)
To create a conda environment:
conda create -n rs2_env python=3.9
conda activate rs2_env
To create a virtual environment:
python -m venv rs2_env
source rs2_env/bin/activate
To install requirements:
pip install -r requirements.txt
Note: the above installation has been tested using Python 3.9.16 on Linux machines running Ubuntu 20.04.6 LTS. It’s possible some errors may arise on other platforms. For example, we found it necessary to downgrade urllib3 from version 2.0.2 to 1.26.16 on some Mac machines. See link1 and link2 for more details regarding this issue.
More details about downloading and preparing the datasets can be found here.
Running RS2 without replacement on CIFAR10 using ResNet18 for 200 epochs:
python src/DeepCore/main.py --data_path "data/" --dataset "CIFAR10" --n-class 10 --model "ResNet18" --selection "UniformNoReplacement" --epochs 200 --batch-size 128 --fraction 0.1 --per_epoch "True"
To train on the same static subset every epoch, set the --per_epoch parameter to False.
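The difference between per-epoch resampling and a static subset can be sketched in a few lines of Python. This is an illustrative sketch, not the repository's implementation: the function name `rs2_subsets` and its arguments are hypothetical, and it assumes that "without replacement" means no index is reused until the shuffled dataset has been exhausted, after which the data is reshuffled.

```python
import random


def rs2_subsets(num_samples, fraction, epochs, per_epoch=True, seed=0):
    """Yield one list of training indices per epoch (hypothetical helper)."""
    rng = random.Random(seed)
    subset_size = round(num_samples * fraction)
    pool = []      # shuffled indices not yet used in the current pass
    static = None  # subset reused every epoch when per_epoch is False

    for _ in range(epochs):
        if not per_epoch:
            # Static mode: draw the subset once and reuse it every epoch.
            if static is None:
                static = rng.sample(range(num_samples), subset_size)
            yield static
            continue

        if len(pool) < subset_size:
            # Assumption: when too few unused indices remain, any leftover
            # is dropped and the full dataset is reshuffled.
            pool = list(range(num_samples))
            rng.shuffle(pool)

        # Take the next chunk, so subsets within a pass never overlap.
        subset, pool = pool[:subset_size], pool[subset_size:]
        yield subset


if __name__ == "__main__":
    for epoch, subset in enumerate(rs2_subsets(20, 0.25, 4)):
        print(epoch, sorted(subset))
```

With `per_epoch=True` and a fraction of 0.1, ten consecutive epochs together cover every training index exactly once under this assumption; with `per_epoch=False` the same indices are returned each time.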
We support training on standard datasets by changing the --dataset parameter to one of: CIFAR10, CIFAR100, ImageNet30, ImageNet, TinyImageNet. The --n-class parameter should be set to the number of classes in the chosen dataset.
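As a hedged illustration of keeping --dataset and --n-class in sync, the sketch below assembles the command line from a lookup table. The helper `build_command` and the `NUM_CLASSES` table are hypothetical and not part of the repository; the class counts are the standard ones for these datasets, and the remaining defaults are copied from the CIFAR10 example command, so they may not suit every dataset.

```python
import shlex

# Standard class counts for the supported datasets (assumed, not read
# from the repository).
NUM_CLASSES = {
    "CIFAR10": 10,
    "CIFAR100": 100,
    "ImageNet30": 30,
    "TinyImageNet": 200,
    "ImageNet": 1000,
}


def build_command(dataset, model="ResNet18", selection="UniformNoReplacement",
                  epochs=200, batch_size=128, fraction=0.1, per_epoch=True,
                  data_path="data/"):
    """Return the argument list for main.py (hypothetical helper)."""
    return [
        "python", "src/DeepCore/main.py",
        "--data_path", data_path,
        "--dataset", dataset,
        "--n-class", str(NUM_CLASSES[dataset]),  # kept consistent with dataset
        "--model", model,
        "--selection", selection,
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--fraction", str(fraction),
        "--per_epoch", str(per_epoch),  # the script takes "True" / "False"
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_command("CIFAR100")))
```

Printing the command rather than executing it keeps the sketch side-effect free; it can be passed to `subprocess.run` if desired.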
The --selection parameter defines which method is used to select the subsets. Some of the possible methods are Forgetting, Herding, Craig, and Uniform. For a complete list, see the names of the imported classes here.
The --fraction parameter sets the selection ratio r, i.e., the fraction of the data used per round (for example, 0.1 selects 10% of the training set).
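To make the effect of the selection ratio concrete, the following sketch works through the arithmetic for the example command (r = 0.1, batch size 128, 200 epochs) on the 50,000-image CIFAR10 training set. The function name is hypothetical, and the sketch assumes the last partial batch of each epoch is kept rather than dropped.

```python
import math


def iterations_per_run(num_samples, fraction, batch_size, epochs):
    """Return (subset size, batches per epoch, total iterations)."""
    subset_size = round(num_samples * fraction)
    # Assumption: a final partial batch still counts as one iteration.
    batches_per_epoch = math.ceil(subset_size / batch_size)
    return subset_size, batches_per_epoch, batches_per_epoch * epochs


if __name__ == "__main__":
    print(iterations_per_run(50_000, 0.1, 128, 200))  # (5000, 40, 8000)
```

Under these assumptions each round trains on 5,000 images in 40 batches, compared with 391 batches per epoch on the full training set.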
For details about the hyperparameters and how to change them, take a look at arguments.py.
We provide scripts for running experiments with and without Slurm.
The hyperparameters used to produce the results reported in the paper are set in the corresponding scripts.
Before executing files from scripts/, move them to the main folder:
mv scripts/<script_name> .
Running time-to-accuracy for CIFAR-10:
source run_timeToAcc_cifar10.sh
Running time-to-accuracy for ImageNet-1k:
source run_timeToAcc_imagenet.sh
Running robustness experiments on CIFAR-10 with label noise:
source run_robustness_experiments.sh
Running dataset distillation experiments:
source run_dataset_distillation.sh
Running per-round sampling experiments on CIFAR-10:
source run_perround_experiments.sh
More details about analyzing the output files can be found here.
[Figure panels: CIFAR-10 | ImageNet-1k]
@inproceedings{
okanovic2024repeated,
title = {Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning},
author = {Patrik Okanovic and Roger Waleffe and Vasilis Mageirakos and Konstantinos Nikolakakis and Amin Karbasi and Dionysios Kalogerias and Nezihe Merve G{\"u}rel and Theodoros Rekatsinas},
booktitle = {The Twelfth International Conference on Learning Representations},
year = {2024},
url = {https://openreview.net/forum?id=JnRStoIuTe}
}