Overview

The goal of this project was to implement a Coteaching approach, in which two models are trained together, to address the challenge of noisy labels. Because the two models make different errors, each can select a smaller subset of likely-clean samples for the other to train on, which should make the trained model more robust to label noise.
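
As a rough illustration of the sample exchange, here is a minimal sketch that assumes the small-loss selection described in the paper listed under References. The function names, the `forget_rate` parameter, and the loss values are hypothetical and are not taken from this repository's code.

```python
def small_loss_indices(losses, forget_rate):
    """Return indices of the smallest-loss samples, keeping a
    (1 - forget_rate) fraction of the mini-batch."""
    num_keep = max(1, int(round((1.0 - forget_rate) * len(losses))))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return order[:num_keep]


def coteaching_exchange(losses_a, losses_b, forget_rate):
    """Each model picks its small-loss samples; its peer trains on them."""
    picked_by_a = small_loss_indices(losses_a, forget_rate)
    picked_by_b = small_loss_indices(losses_b, forget_rate)
    # Model A is updated on B's picks, and model B on A's picks.
    return {"update_a_on": picked_by_b, "update_b_on": picked_by_a}


# Hypothetical per-sample losses for one mini-batch of six images.
losses_a = [0.2, 2.9, 0.4, 0.1, 3.5, 0.6]
losses_b = [0.3, 0.5, 2.7, 0.2, 3.1, 0.4]
exchange = coteaching_exchange(losses_a, losses_b, forget_rate=0.5)
print(exchange)
```

Samples with a large loss under one model are treated as likely mislabeled and are withheld from the other model's update for that step.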

The model is a CNN with multiple convolutional layers and batch normalization. It applies leaky ReLU activation, pooling, and dropout to extract features from input images and generate predictions. The architecture was adapted from the research paper listed under References.

Models

Simple ResNet

Attempted to improve performance with label smoothing, dropout, and other regularization techniques. Test accuracy was just under 50%.
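
For readers unfamiliar with label smoothing, the sketch below shows one common formulation, in which a fraction of the probability mass is spread uniformly over all classes. It is written in plain Python for clarity; the function names, the smoothing value of 0.1, and the logits are assumptions for illustration, not the settings used in the notebooks.

```python
import math


def smoothed_targets(label, num_classes, smoothing):
    """Spread `smoothing` uniformly over all classes and put the
    remaining mass on the given label."""
    targets = [smoothing / num_classes] * num_classes
    targets[label] += 1.0 - smoothing
    return targets


def cross_entropy(logits, targets):
    """Cross-entropy between a target distribution and softmax(logits)."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_norm = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return -sum(t * (z - log_norm) for z, t in zip(logits, targets))


# Hypothetical 4-class example; the project itself has 100 classes.
logits = [0.5, 0.1, 2.0, -1.0]
targets = smoothed_targets(label=2, num_classes=4, smoothing=0.1)
loss = cross_entropy(logits, targets)
hard_loss = cross_entropy(logits, smoothed_targets(2, 4, 0.0))
print(targets, loss, hard_loss)
```

Because the target is no longer a hard one-hot vector, the model is penalized for becoming overconfident, which can help when some labels are wrong.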

CNN w/ Coteaching

Unfortunately, this achieved performance similar to the ResNet model. Changes may need to be made to the loss function to account for the added complexity of 100 classes.

*Addendum: to improve validation accuracy, two changes were made. First, data augmentation was mistakenly being applied to the validation set; this is no longer the case. Second, additional data augmentation was added to the training set to help the model generalize and to prevent severe overfitting on the training data. As a result, validation accuracy increased by 12% and testing accuracy increased by about 3-4%.
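
The validation-set fix amounts to making augmentation an explicit, training-only option. The sketch below illustrates that idea in plain Python, with a horizontal flip standing in for the real augmentations; the `ImageDataset` class and its `augment` flag are hypothetical and do not reflect the contents of `datasets.py`.

```python
import random


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror a 2-D image (a list of rows) left to right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


class ImageDataset:
    """Toy dataset that applies augmentation only when `augment` is True."""

    def __init__(self, images, labels, augment=False, seed=0):
        self.images = images
        self.labels = labels
        self.augment = augment
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        if self.augment:  # training split only
            image = random_horizontal_flip(image, self.rng)
        return image, self.labels[index]


image = [[1, 2, 3], [4, 5, 6]]
flipped = [[3, 2, 1], [6, 5, 4]]
train_set = ImageDataset([image], [7], augment=True)
val_set = ImageDataset([image], [7], augment=False)
```

With this layout the validation split always returns the stored images unchanged, so validation accuracy is measured on the same inputs every epoch.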

** Reducing the batch size to just 32 increased the amount of regularization in training. Together with splitting the data randomly by indices, this gave much better validation accuracy. Further hyperparameter tuning would likely increase the testing accuracy of the model (not shown due to training time, but around 69% accuracy).
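
Splitting randomly by indices can be sketched as follows. This is an illustrative version only: the function name, the 20% validation fraction, the seed, and the sample count are assumptions rather than the values used in the notebooks.

```python
import random


def split_indices(num_samples, val_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and validation lists."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    num_val = int(round(num_samples * val_fraction))
    return indices[num_val:], indices[:num_val]


# Hypothetical dataset of 1000 samples.
train_idx, val_idx = split_indices(1000, val_fraction=0.2, seed=0)
print(len(train_idx), len(val_idx))
```

Shuffling before splitting keeps the class and noise distribution of the validation set close to that of the training set, which a split by contiguous ranges does not guarantee.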

Dataset

This model uses a modified version of CIFAR100-NoisyLabel.

Results

Coteaching Model

Thoughts

The SimpleResNet and other methods (not shown) begin to overfit after just 12 epochs. While data augmentation and other methods may help, it's clear that the data is simply too noisy and too complex for those methods alone to reach good accuracy.

I therefore propose Coteaching, a method that trains two different models together to improve the overall confidence of the models.

References

https://github.com/yeachan-kr/pytorch-coteaching/
https://arxiv.org/pdf/1804.06872.pdf

iceorb/Coteaching

Folders and files

Latest commit

History

Repository files navigation

Overview

Models

Dataset

Results

Thoughts

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages