Pehlevan-Group/Richness_Sweep

# Richness Sweep

Sweeping over the feature learning strength

## Datasets

We create the MNIST-1M dataset using the script in the `MNIST-1M` folder.

Beyond this, we use the CIFAR-10, CIFAR-5M, and TinyImageNet datasets.

## Models

The key MLP and CNN scripts are found in the `vision_tasks` directory. The `vision_tasks.py` script executes sweeps over various hyperparameters:

- `N`: width, i.e., the number of neurons in each hidden layer
- `L`: depth, i.e., the number of hidden layers
- `B`: batch size for the optimizer
- `s`: noise level on the labels. We mostly don't sweep over this.
- `E`: number of ensembles, i.e., copies of the network with different initialization seeds
- `d`: seed for the batch ordering. We mostly don't vary this.
- `task`: MNIST-1M or CIFAR-5M
- `optimizer`: always SGD for now
- `loss`: mean squared error (`mse`) or cross-entropy (`xent`)
- `gamma0_range`, `eta0_range`: the log10 range. E.g., 5 means the hyperparameter will vary from $10^{-5}$ to $10^{5}$.
- `gamma0_res`, `eta0_res`: the number of trials per decade (factor of 10). E.g., 2 means we run a trial roughly every factor of 3.
- `range_below_max_eta0`: after finding the largest convergent learning rate, how many factors of 10 below it to sweep in `eta0`
- `save_model`: whether or not to save the model's weights
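As a rough illustration of how the `*_range` and `*_res` arguments combine, the sweep values can be generated on a log-spaced grid. This is a minimal sketch, not the repository's actual code; the function name and signature are hypothetical:

```python
import numpy as np

def sweep_grid(log10_range: float, res: int) -> np.ndarray:
    """Hypothetical sketch: log-spaced sweep values from 10^-range to 10^+range,
    with `res` trials per decade (factor of 10)."""
    n_points = int(2 * log10_range * res) + 1  # endpoints inclusive
    return np.logspace(-log10_range, log10_range, n_points)

# With range=1 and res=2, trials land every half decade (a factor of ~3.16):
grid = sweep_grid(1, 2)  # array([0.1, 0.316..., 1.0, 3.16..., 10.0])
```

Under this reading, `gamma0_range=5, gamma0_res=2` would give 21 values of the feature-learning strength spanning $10^{-5}$ to $10^{5}$.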
