Commit 39ca504

Merge branch 'dev-1.x' into fix_nas

2 parents: 6118cf5 + 90c5435

File tree: 155 files changed (+5265, -1053 lines)

.gitignore

Lines changed: 0 additions & 1 deletion
```diff
@@ -11,7 +11,6 @@ __pycache__/
 .Python
 build/
 develop-eggs/
-dist/
 downloads/
 eggs/
 .eggs/
```

README.md

Lines changed: 14 additions & 0 deletions
```diff
@@ -45,6 +45,20 @@
 
 English | [简体中文](README_zh-CN.md)
 
+<div align="center">
+<a href="https://openmmlab.medium.com/" style="text-decoration:none;">
+<img src="https://user-images.githubusercontent.com/25839884/218352562-cdded397-b0f3-4ca1-b8dd-a60df8dca75b.png" width="3%" alt="" /></a>
+<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
+<a href="https://discord.gg/raweFPmdzG" style="text-decoration:none;">
+<img src="https://user-images.githubusercontent.com/25839884/218347213-c080267f-cbb6-443e-8532-8e1ed9a58ea9.png" width="3%" alt="" /></a>
+<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
+<a href="https://twitter.com/OpenMMLab" style="text-decoration:none;">
+<img src="https://user-images.githubusercontent.com/25839884/218346637-d30c8a0f-3eba-4699-8131-512fb06d46db.png" width="3%" alt="" /></a>
+<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
+<a href="https://www.youtube.com/openmmlab" style="text-decoration:none;">
+<img src="https://user-images.githubusercontent.com/25839884/218346691-ceb2116a-465a-40af-8424-9f30d2348ca9.png" width="3%" alt="" /></a>
+</div>
+
 </div>
 
 ## Introduction
```
Lines changed: 98 additions & 0 deletions
```python
# dataset settings
dataset_type = 'mmcls.ImageNet'

max_search_epochs = 100
# learning rate setting
param_scheduler = [
    # warm up learning rate scheduler
    dict(
        type='LinearLR',
        start_factor=0.5,
        by_epoch=True,
        begin=0,
        end=10,
        convert_to_iter_based=True),
    dict(
        type='CosineAnnealingLR',
        T_max=max_search_epochs,
        eta_min=0.08,
        by_epoch=True,
        begin=10,
        end=max_search_epochs,
        convert_to_iter_based=True),
]

# optimizer setting
paramwise_cfg = dict(norm_decay_mult=0.0, bias_decay_mult=0.0)

optim_wrapper = dict(
    constructor='mmrazor.SeparateOptimWrapperConstructor',
    architecture=dict(
        type='OptimWrapper',
        optimizer=dict(type='SGD', lr=0.5, momentum=0.9, weight_decay=3e-4),
        paramwise_cfg=paramwise_cfg),
    mutator=dict(
        type='OptimWrapper',
        optimizer=dict(type='Adam', lr=0.5, weight_decay=1e-3)))

# data preprocessor
data_preprocessor = dict(
    type='mmcls.ClsDataPreprocessor',
    # RGB format normalization parameters
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    # convert image from BGR to RGB
    to_rgb=True,
)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224),
    dict(type='ColorJitter', brightness=0.2, contrast=0.2, saturation=0.2),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackClsInputs'),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='ResizeEdge', scale=256, edge='short'),
    dict(type='CenterCrop', crop_size=224),
    dict(type='PackClsInputs'),
]

train_dataloader = dict(
    batch_size=64,
    num_workers=4,
    dataset=dict(
        type=dataset_type,
        data_root='data/imagenet',
        ann_file='meta/train.txt',
        data_prefix='train',
        pipeline=train_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=True, _scope_='mmcls'),
    persistent_workers=True,
)

val_dataloader = dict(
    batch_size=64,
    num_workers=4,
    dataset=dict(
        type=dataset_type,
        data_root='data/imagenet',
        ann_file='meta/val.txt',
        data_prefix='val',
        pipeline=test_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=True, _scope_='mmcls'),
    persistent_workers=True,
)
val_evaluator = dict(type='mmcls.Accuracy', topk=(1, 5))

# If you want standard test, please manually configure the test dataset
test_dataloader = val_dataloader
test_evaluator = val_evaluator

evaluation = dict(interval=1, metric='accuracy')

train_cfg = dict(by_epoch=True, max_epochs=max_search_epochs, val_interval=1)
val_cfg = dict()
test_cfg = dict()
custom_hooks = [dict(type='DMCPSubnetHook')]
```
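The `param_scheduler` above combines a 10-epoch linear warmup (starting at half the base LR) with cosine annealing down to `eta_min=0.08` over the remaining search epochs. As a rough sanity check, here is a minimal, self-contained sketch of the per-epoch curve those values imply; mmengine's `convert_to_iter_based=True` interpolates per iteration, so treat this epoch-level version as an approximation:

```python
# Approximate per-epoch LR curve for the schedule above (assumptions:
# base_lr=0.5 from the SGD optimizer, warmup start_factor=0.5 over 10 epochs,
# cosine annealing to eta_min=0.08 by epoch 100). Not mmengine code.
import math

BASE_LR, ETA_MIN = 0.5, 0.08
WARMUP_END, MAX_EPOCHS = 10, 100

def lr_at(epoch: int) -> float:
    if epoch < WARMUP_END:
        # LinearLR(start_factor=0.5): factor ramps 0.5 -> 1.0 over [0, 10)
        return BASE_LR * (0.5 + 0.5 * epoch / WARMUP_END)
    # CosineAnnealingLR: decay from BASE_LR toward ETA_MIN over [10, 100]
    t = (epoch - WARMUP_END) / (MAX_EPOCHS - WARMUP_END)
    return ETA_MIN + (BASE_LR - ETA_MIN) * 0.5 * (1 + math.cos(math.pi * t))

for e in (0, 5, 10, 55, 100):
    print(f'epoch {e:3d}: lr ~ {lr_at(e):.4f}')
```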

configs/distill/mmcls/dist/README.md

Lines changed: 45 additions & 0 deletions
# KD

> [Knowledge Distillation from A Stronger Teacher](https://arxiv.org/abs/2205.10536)

<!-- [ALGORITHM] -->

## Abstract

Unlike existing knowledge distillation methods that focus on baseline settings, where the teacher models and training strategies are not as strong and competitive as state-of-the-art approaches, this paper presents a method dubbed DIST to distill better from a stronger teacher. We empirically find that the discrepancy between the predictions of the student and a stronger teacher tends to be fairly severe. As a result, exactly matching predictions with KL divergence would disturb training and make existing methods perform poorly. In this paper, we show that simply preserving the relations between the predictions of teacher and student suffices, and we propose a correlation-based loss to capture the intrinsic inter-class relations from the teacher explicitly. Besides, considering that different instances have different semantic similarities to each class, we also extend this relational match to the intra-class level. Our method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes, and training strategies, and can consistently achieve state-of-the-art performance on image classification, object detection, and semantic segmentation tasks. Code is available at: this https URL.

## Results and models

### Classification

| Location | Dataset  | Teacher           | Student           |  Acc  | Acc(T) | Acc(S) | Config              | Download                                                         |
| :------: | :------: | :---------------: | :---------------: | :---: | :----: | :----: | :-----------------: | :--------------------------------------------------------------- |
|  logits  | ImageNet | [resnet34][r34_c] | [resnet18][r18_c] | 71.61 | 73.62  | 69.90  | [config][distill_c] | [teacher][r34_pth] \| [model][distill_pth] \| [log][distill_log] |

**Note**

The results of DIST-loss experiments fluctuate across runs. For example, we ran the official DIST code three times and obtained three different results:

| Run | Top-1 |
| --- | ----- |
| 1st | 71.69 |
| 2nd | 71.82 |
| 3rd | 71.90 |

## Citation

```latex
@article{huang2022knowledge,
  title={Knowledge Distillation from A Stronger Teacher},
  author={Huang, Tao and You, Shan and Wang, Fei and Qian, Chen and Xu, Chang},
  journal={arXiv preprint arXiv:2205.10536},
  year={2022}
}
```

[distill_c]: ./dist_logits_resnet34_resnet18_8xb32_in1k.py
[distill_log]: https://download.openmmlab.com/mmrazor/v1/distillation/dist_logits_resnet34_resnet18_8xb32_in1k.json
[distill_pth]: https://download.openmmlab.com/mmrazor/v1/distillation/dist_logits_resnet34_resnet18_8xb32_in1k.pth
[r18_c]: https://github.com/open-mmlab/mmclassification/blob/dev-1.x/configs/resnet/resnet18_8xb32_in1k.py
[r34_c]: https://github.com/open-mmlab/mmclassification/blob/dev-1.x/configs/resnet/resnet34_8xb32_in1k.py
[r34_pth]: https://download.openmmlab.com/mmclassification/v0/resnet/resnet34_8xb32_in1k_20210831-f257d4e6.pth
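To make the abstract's "correlation-based loss" concrete, here is a minimal PyTorch sketch of inter-class and intra-class relational matching between softened predictions. It illustrates the idea only; the function names, defaults, and exact weighting are assumptions, not MMRazor's `DISTLoss` implementation:

```python
import torch
import torch.nn.functional as F

def pearson(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Pearson correlation along the last dim equals cosine similarity
    # of mean-centered vectors.
    a = a - a.mean(dim=-1, keepdim=True)
    b = b - b.mean(dim=-1, keepdim=True)
    return F.cosine_similarity(a, b, dim=-1)

def dist_like_loss(logits_s, logits_t, tau=1.0, inter_w=1.0, intra_w=1.0):
    p_s = F.softmax(logits_s / tau, dim=1)  # (N, C) student predictions
    p_t = F.softmax(logits_t / tau, dim=1)  # (N, C) teacher predictions
    inter = (1 - pearson(p_s, p_t)).mean()          # per sample, across classes
    intra = (1 - pearson(p_s.t(), p_t.t())).mean()  # per class, across the batch
    return inter_w * inter + intra_w * intra
```

Matching correlations rather than exact probabilities leaves the student free to disagree with a much stronger teacher on absolute confidence while still preserving the teacher's relational structure.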
Lines changed: 45 additions & 0 deletions
```python
_base_ = [
    'mmcls::_base_/datasets/imagenet_bs32.py',
    'mmcls::_base_/schedules/imagenet_bs256.py',
    'mmcls::_base_/default_runtime.py'
]

teacher_ckpt = 'https://download.openmmlab.com/mmclassification/v0/resnet/resnet34_8xb32_in1k_20210831-f257d4e6.pth'  # noqa: E501

model = dict(
    _scope_='mmrazor',
    type='SingleTeacherDistill',
    data_preprocessor=dict(
        type='ImgDataPreprocessor',
        # RGB format normalization parameters
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        # convert image from BGR to RGB
        bgr_to_rgb=True),
    architecture=dict(
        cfg_path='mmcls::resnet/resnet18_8xb32_in1k.py', pretrained=False),
    teacher=dict(
        cfg_path='mmcls::resnet/resnet34_8xb32_in1k.py', pretrained=False),
    teacher_ckpt=teacher_ckpt,
    distiller=dict(
        type='ConfigurableDistiller',
        student_recorders=dict(
            fc=dict(type='ModuleOutputs', source='head.fc')),
        teacher_recorders=dict(
            fc=dict(type='ModuleOutputs', source='head.fc')),
        distill_losses=dict(
            loss_kl=dict(
                type='DISTLoss',
                inter_loss_weight=1.0,
                intra_loss_weight=1.0,
                tau=1,
                loss_weight=2,
            )),
        loss_forward_mappings=dict(
            loss_kl=dict(
                logits_S=dict(from_student=True, recorder='fc'),
                logits_T=dict(from_student=False, recorder='fc')))))

val_cfg = dict(_delete_=True, type='mmrazor.SingleTeacherDistillValLoop')

optim_wrapper = dict(optimizer=dict(nesterov=True))
```
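The `student_recorders`/`teacher_recorders` entries above capture the output of each model's `head.fc` module so that `loss_forward_mappings` can route them into `DISTLoss` as `logits_S`/`logits_T`. Conceptually this is a forward hook on a named submodule; the following bare-bones sketch shows the mechanism under that assumption (MMRazor's actual recorder classes additionally handle scoping and lifecycles):

```python
import torch.nn as nn

def record_module_output(model: nn.Module, source: str, store: dict, key: str):
    """Stash the forward output of the submodule named `source` into store[key]."""
    module = dict(model.named_modules())[source]

    def hook(_module, _inputs, output):
        store[key] = output

    return module.register_forward_hook(hook)

# Usage sketch (student and images are placeholders):
#   records = {}
#   handle = record_module_output(student, 'head.fc', records, 'fc')
#   student(images)
#   logits_s = records['fc']  # consumed by the distill loss as logits_S
```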
Lines changed: 134 additions & 0 deletions
```yaml
backbone.base_embed_dims:
  chosen: 64
backbone.blocks.0.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.0.middle_channels:
  chosen: 3.5
backbone.blocks.0.mutable_mlp_ratios:
  chosen: 3.5
backbone.blocks.0.mutable_q_embed_dims:
  chosen: 10
backbone.blocks.1.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.1.middle_channels:
  chosen: 3.5
backbone.blocks.1.mutable_mlp_ratios:
  chosen: 3.5
backbone.blocks.1.mutable_q_embed_dims:
  chosen: 64
backbone.blocks.10.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.10.middle_channels:
  chosen: 4.0
backbone.blocks.10.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.10.mutable_q_embed_dims:
  chosen: 64
backbone.blocks.11.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.11.middle_channels:
  chosen: 576
backbone.blocks.11.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.11.mutable_q_embed_dims:
  chosen: 10
backbone.blocks.12.attn.mutable_attrs.num_heads:
  chosen: 9
backbone.blocks.12.middle_channels:
  chosen: 4.0
backbone.blocks.12.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.12.mutable_q_embed_dims:
  chosen: 9
backbone.blocks.13.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.13.middle_channels:
  chosen: 4.0
backbone.blocks.13.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.13.mutable_q_embed_dims:
  chosen: 10
backbone.blocks.14.attn.mutable_attrs.num_heads:
  chosen: 8
backbone.blocks.14.middle_channels:
  chosen: 576
backbone.blocks.14.mutable_mlp_ratios:
  chosen: 3.5
backbone.blocks.14.mutable_q_embed_dims:
  chosen: 8
backbone.blocks.15.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.15.middle_channels:
  chosen: 3.0
backbone.blocks.15.mutable_mlp_ratios:
  chosen: 3.0
backbone.blocks.15.mutable_q_embed_dims:
  chosen: 10
backbone.blocks.2.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.2.middle_channels:
  chosen: 576
backbone.blocks.2.mutable_mlp_ratios:
  chosen: 3.5
backbone.blocks.2.mutable_q_embed_dims:
  chosen: 10
backbone.blocks.3.attn.mutable_attrs.num_heads:
  chosen: 8
backbone.blocks.3.middle_channels:
  chosen: 4.0
backbone.blocks.3.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.3.mutable_q_embed_dims:
  chosen: 8
backbone.blocks.4.attn.mutable_attrs.num_heads:
  chosen: 10
backbone.blocks.4.middle_channels:
  chosen: 576
backbone.blocks.4.mutable_mlp_ratios:
  chosen: 3.0
backbone.blocks.4.mutable_q_embed_dims:
  chosen: 10
backbone.blocks.5.attn.mutable_attrs.num_heads:
  chosen: 9
backbone.blocks.5.middle_channels:
  chosen: 3.0
backbone.blocks.5.mutable_mlp_ratios:
  chosen: 3.0
backbone.blocks.5.mutable_q_embed_dims:
  chosen: 9
backbone.blocks.6.attn.mutable_attrs.num_heads:
  chosen: 8
backbone.blocks.6.middle_channels:
  chosen: 576
backbone.blocks.6.mutable_mlp_ratios:
  chosen: 3.5
backbone.blocks.6.mutable_q_embed_dims:
  chosen: 8
backbone.blocks.7.attn.mutable_attrs.num_heads:
  chosen: 8
backbone.blocks.7.middle_channels:
  chosen: 3.5
backbone.blocks.7.mutable_mlp_ratios:
  chosen: 3.5
backbone.blocks.7.mutable_q_embed_dims:
  chosen: 8
backbone.blocks.8.attn.mutable_attrs.num_heads:
  chosen: 9
backbone.blocks.8.middle_channels:
  chosen: 576
backbone.blocks.8.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.8.mutable_q_embed_dims:
  chosen: 9
backbone.blocks.9.attn.mutable_attrs.num_heads:
  chosen: 8
backbone.blocks.9.middle_channels:
  chosen: 576
backbone.blocks.9.mutable_mlp_ratios:
  chosen: 4.0
backbone.blocks.9.mutable_q_embed_dims:
  chosen: 8
backbone.mutable_depth:
  chosen: 14
backbone.mutable_embed_dims:
  chosen: 576
```
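Each dotted key above names one mutable in the supernet, and `chosen` pins its searched value (depth 14, embedding dim 576, plus per-block head counts and MLP ratios). A small sketch for inspecting such a fixed-subnet file with plain PyYAML; MMRazor itself consumes it through a config's `fix_subnet` field rather than this way:

```python
import yaml

with open('AUTOFORMER_SUBNET_B.yaml') as f:  # path is an assumption
    subnet = yaml.safe_load(f)

depth = subnet['backbone.mutable_depth']['chosen']       # 14
embed = subnet['backbone.mutable_embed_dims']['chosen']  # 576
# keys look like 'backbone.blocks.<i>.attn.mutable_attrs.num_heads'
heads = {k.split('.')[2]: v['chosen']
         for k, v in subnet.items() if k.endswith('num_heads')}
print(f'depth={depth}, embed_dims={embed}')
print('heads per block:', heads)
```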

configs/nas/mmcls/autoformer/README.md

Lines changed: 6 additions & 5 deletions
````diff
@@ -44,15 +44,16 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh \
 ```bash
 CUDA_VISIBLE_DEVICES=0 PORT=29500 ./tools/dist_test.sh \
   configs/nas/mmcls/autoformer/autoformer_subnet_8xb128_in1k.py \
-  $STEP2_CKPT 1 --work-dir $WORK_DIR \
-  --cfg-options algorithm.mutable_cfg=$STEP2_SUBNET_YAML
+  none 1 --work-dir $WORK_DIR \
+  --cfg-options model.init_cfg.checkpoint=$STEP1_CKPT model.init_weight_from_supernet=True
+
 ```
 
 ## Results and models
 
-| Dataset  | Supernet | Subnet | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download | Remarks |
-| :------: | :------: | :----: | :-------: | :------: | :-------: | :-------: | :----: | :------- | :-----: |
-| ImageNet | vit      | [mutable](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmrazor/v0.1/nas/spos/spos_shufflenetv2_subnet_8xb128_in1k/spos_shufflenetv2_subnet_8xb128_in1k_flops_0.33M_acc_73.87_20211222-454627be_mutable_cfg.yaml?versionId=CAEQHxiBgICw5b6I7xciIGY5MjVmNWFhY2U5MjQzN2M4NDViYzI2YWRmYWE1YzQx) | 52.472 | 10.2 | 82.48 | 95.99 | [config](./autoformer_supernet_32xb256_in1k.py) | [model](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmrazor/x.pth) \| [log](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmrazor/v0.1/nas/spos/x.log.json) | MMRazor searched |
+| Dataset  | Supernet | Subnet | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download | Remarks |
+| :------: | :------: | :----: | :-------: | :------: | :-------: | :-------: | :----: | :------- | :-----: |
+| ImageNet | vit      | [mutable](./configs/nas/mmcls/autoformer/AUTOFORMER_SUBNET_B.yaml) | 54.319 | 10.57 | 82.47 | 95.99 | [config](./autoformer_supernet_32xb256_in1k.py) | [model](https://download.openmmlab.com/mmrazor/v1/autoformer/autoformer_supernet_32xb256_in1k_20220919_110144-c658ce8f.pth) \| [log](https://download.openmmlab.com/mmrazor/v1/autoformer/autoformer_supernet_32xb256_in1k_20220919_110144-c658ce8f.json) | MMRazor searched |
 
 **Note**:
 
````
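The updated test command no longer takes a searched-subnet checkpoint; it passes `none` and overrides the subnet config so that weights are initialized from the step-1 supernet checkpoint. A sketch of what those `--cfg-options` overrides amount to once mmengine parses and merges them (the checkpoint path is a placeholder):

```python
from mmengine.config import Config

cfg = Config.fromfile(
    'configs/nas/mmcls/autoformer/autoformer_subnet_8xb128_in1k.py')
cfg.merge_from_dict({
    'model.init_cfg.checkpoint': 'work_dirs/supernet/step1.pth',  # $STEP1_CKPT
    'model.init_weight_from_supernet': True,
})
```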

Lines changed: 17 additions & 0 deletions
```python
_base_ = 'autoformer_supernet_32xb256_in1k.py'

model = dict(
    _scope_='mmrazor',
    type='sub_model',
    cfg=_base_.supernet,
    # NOTE: You can replace the yaml with the mutable_cfg searched by yourself
    fix_subnet='configs/nas/mmcls/autoformer/AUTOFORMER_SUBNET_B.yaml',
    # You can also load the checkpoint of supernet instead of the specific
    # subnet by modifying the `checkpoint`(path) in the following `init_cfg`
    # with `init_weight_from_supernet = True`.
    init_weight_from_supernet=False,
    init_cfg=dict(
        type='Pretrained',
        checkpoint=  # noqa: E251
        'https://download.openmmlab.com/mmrazor/v1/autoformer/autoformer_supernet_32xb256_in1k_20220919_110144-c658ce8f.pth',  # noqa: E501
        prefix='architecture.'))
```
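As the inline comments note, the same subnet config can instead be initialized directly from a supernet checkpoint. A sketch of that variant as a config-fragment override, with the checkpoint path as a placeholder:

```python
# Variant of the config above: slice subnet weights out of a supernet
# checkpoint at init time instead of loading a subnet checkpoint.
model = dict(
    init_weight_from_supernet=True,
    init_cfg=dict(
        type='Pretrained',
        checkpoint='path/to/supernet_checkpoint.pth',  # placeholder
        prefix='architecture.'))
```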
