Description
Hello, @HobbitLong
I have successfully trained on the CIFAR-10 dataset, and I am now training with these methods on a custom dataset.
For example, when I apply the MoCo method, I change the queue length and other parameters. I find it very hard to judge whether the model has converged.
The contrastive accuracy reaches a very high level (nearly 95%), and the loss is nearly stable. However, under linear evaluation the top-1 accuracy is only 50%. The contrastive accuracy does not seem to have a direct relationship with linear evaluation performance.
For instance, a larger queue sometimes gives a lower contrastive accuracy than a smaller queue, yet the larger queue performs better on linear evaluation.
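To make the queue-length comparison concrete, here is a minimal sketch of what I mean by contrastive accuracy, assuming the usual MoCo-style setup where each query is scored against one positive key and K negatives from the queue, and a query counts as correct when the positive scores highest. The names `contrastive_accuracy` and `simulate`, the Gaussian toy logits, and the `margin` parameter are illustrative assumptions, not code from this repository.

```python
import random


def contrastive_accuracy(pos_logits, neg_logits):
    """Fraction of queries whose positive key outscores every negative.

    pos_logits: list of floats, one similarity per query (query vs. positive key).
    neg_logits: list of lists; neg_logits[i] holds query i's similarities
                to the K keys currently in the queue.
    """
    correct = sum(
        1 for pos, negs in zip(pos_logits, neg_logits) if pos > max(negs)
    )
    return correct / len(pos_logits)


def simulate(queue_len, n_queries=500, margin=2.0, seed=0):
    """Toy model (assumption): negatives ~ N(0, 1), positives ~ N(margin, 1).

    The separation between positives and negatives is fixed by `margin`,
    so only the queue length changes between runs.
    """
    rng = random.Random(seed)
    pos = [rng.gauss(margin, 1.0) for _ in range(n_queries)]
    neg = [
        [rng.gauss(0.0, 1.0) for _ in range(queue_len)]
        for _ in range(n_queries)
    ]
    return contrastive_accuracy(pos, neg)


acc_small = simulate(queue_len=16)
acc_large = simulate(queue_len=1024)
print(f"queue=16:   contrastive accuracy = {acc_small:.3f}")
print(f"queue=1024: contrastive accuracy = {acc_large:.3f}")
```

In this toy model the longer queue gives a lower contrastive accuracy even though the positive/negative separation is identical, because the task becomes a (K+1)-way classification with more distractors. So the metric may not be directly comparable across queue lengths, which could explain part of what I see, but not the gap to linear evaluation.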
Can you explain why the contrastive accuracy is so high while the linear evaluation performance is so low? Or is there anything that could help me solve this problem?