Skip to content

LeviViana/torchessian

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Torchessian

This repository aims to provide a tool for analyzing the full Hessian of the loss function for neural networks. For the moment, only a single GPU-mode is enabled, but I have plans to implement a distributed version in the future.

Motivation

I found this article very interesting, and I wanted to reproduce the results very quickly. I've implemented a batch mode spectrum estimation in order to get even faster results.

For instance, when analyzing the impact of having batchnom layers in a ResNet-18 architecture, I found the following spectrum on the test set of CIFAR-10:

Note: both architectures (i.e. with and without batchnorm) were trained until almost reaching a global optimum, i.e. at least 98% of accuracy.

Spectrum of a single batch

alt text

Spectrum of the entire test dataset

alt text

The results are pretty similar, and the conclusion is the same: the batchnorm layers seem to eliminate big positive eigenvalues, which makes the training process easier.

About

Full loss Hessian spectrum approximation tool.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages