
Unfair comparison on CIFAR100-LT (extra/different data usage) #2

@quotation2520


Hi,

I have a serious concern about CIFAR-LT/data/data_processing.py, where you generate the imbalanced dataset.

First, CIFAR100-LT should be generated with the convention you can find here.
That is, the image samples should be collected after we shuffle the whole train images with a fixed seed 0.
Your code, however, does not perform this shuffle, so it selects a different set of images for the training data.
This makes your version of CIFAR100-LT different from the ones used in the rest of the methods in your table.

Secondly, you first take 25 validation samples from each class for meta-learning, and only then construct the imbalanced training data from the remaining images. This is additional data usage.
For example, CIFAR100-LT with ir=100 has the following per-class sample counts:
[500, 477, 455, 434, 415, 396, 378, 361, 344, 328, 314, 299, 286, 273, 260, 248, 237, 226, 216, 206, 197, 188, 179, 171, 163, 156, 149, 142, 135, 129, 123, 118, 112, 107, 102, 98, 93, 89, 85, 81, 77, 74, 70, 67, 64, 61, 58, 56, 53, 51, 48, 46, 44, 42, 40, 38, 36, 35, 33, 32, 30, 29, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 15, 14, 13, 13, 12, 12, 11, 11, 10, 10, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 6, 5, 5, 5, 5]
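For reference, the conventional construction (as used by LDAM-DRW and follow-up work) can be sketched as below. The function name and exact signature are illustrative assumptions, but the two points at issue are encoded directly: the full train set is shuffled once with a fixed seed (0), and each class i keeps n_max * (1/ir)^(i/(n_cls-1)) samples, which reproduces the counts listed above.

```python
import numpy as np

def make_cifar100_lt(labels, ir=100, n_max=500, n_cls=100, seed=0):
    """Sketch of the conventional CIFAR100-LT construction.

    Class i keeps int(n_max * (1/ir) ** (i / (n_cls - 1))) samples,
    drawn after shuffling the whole train set once with a fixed seed.
    """
    labels = np.asarray(labels)
    # Exponential class-size profile: 500 for class 0 down to 5 for class 99 at ir=100.
    counts = [int(n_max * (1.0 / ir) ** (i / (n_cls - 1))) for i in range(n_cls)]
    # Shuffle the full train set with seed 0 before selecting any samples.
    order = np.random.RandomState(seed).permutation(len(labels))
    shuffled = labels[order]
    keep = []
    for cls, n in enumerate(counts):
        pos = np.where(shuffled == cls)[0][:n]  # first n occurrences post-shuffle
        keep.extend(order[pos].tolist())        # map back to original indices
    return counts, keep
```

With ir=100 this yields counts starting at 500, 477, 455, ... and ending at 5, matching the list above; a pipeline that skips the seed-0 shuffle selects a different image subset even with identical counts.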

Training and meta-training should both be conducted on only the images above; you cannot set aside 25 samples per class to build a balanced validation split. (Some classes have fewer than 25 samples.)

Overall, I'm afraid that you might have made an unfair comparison on CIFAR100-LT.
Kindly correct me if I'm wrong; I look forward to your reply.
Thanks.
