Skip to content

vaden4d/logo-classifier

Repository files navigation

Logo classifier

Almost unsupervised logo classifier

The models discriminate logo into two categories: real logo and no logo (dataset consists of logo proposals from the previous pipelines). Classification was done via supervised and semi-supervised approaches on the weakly generated labels from data. For more details of the preprocessing steps, see Preprocessing.ipynb or Preprocessing.html.

Weak Precision and weak Recall - metrics on the weak labels - are monitored during training of the model, the results reported on the test dataset. For validation - validation Precision and validation Recall - were used manually labeled ~2000 pictures.

To run training with different settings, see arguments in the train.py file. The unzipped dataset should be located in this folder. For example, run the supervised training and validate it on the labeled part.

python train.py --validate --csv datasets/mixed.csv --target_column mixed_label --validation_csv labeled_part.csv

Semi-supervised training pipeline:

python train.py --ssl --validate --csv datasets/mixed.csv --target_column mixed_label --validation_csv labeled_part.csv

See Report.ipynb for more details of the training pipelines.

Total experiments table:

Method weak Precision weak Recall val. Precision val. Recall
Always predict logo ~0.7 1.0 0.3793 1.0
EfficientNet-B1 + Weak Labels 0.8420 0.8803 0.4378 0.7827
Semi-Supervised MixMatch + EfficientNet-B1 + Weak Labels 0.8509 0.7806 0.4500 0.7016
Supervised EfficientNet-B1 + Weak Labels + 1000 Strong Labels 0.7682 0.8093 0.4783 0.7799
Semi-Supervised MixMatch + EfficientNet-B1 + Weak Labels + 1000 Strong Labels 0.7906 0.7023 0.4990 0.7011
Anomaly detection TODO TODO TODO TODO

About

Almost unsupervised logo classifier

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published