
Commit 04d3293

ChongWei905 authored
docs: add requirements and graph compile time to readmes (#813)
* docs: add requirements and renew forms for readmes
* fix: change ops.pad input format which changed in mindspore in Nov 2022
* docs: add requirements and renew forms for example models (ssd and deeplabv3)
* docs: fix readme bugs

Co-authored-by: ChongWei905 <weichong4@huawei.com>
1 parent e2b34de

58 files changed (+1530 additions, −1548 deletions)


README.md

Lines changed: 7 additions & 7 deletions
@@ -29,13 +29,13 @@ MindCV is an open-source toolbox for computer vision research and development ba

 The following is the corresponding `mindcv` versions and supported `mindspore` versions.

-| mindcv | mindspore |
-|:------:|:----------:|
-| main | master |
-| v0.4.0 | 2.3.0 |
-| 0.3.0 | 2.2.10 |
-| 0.2 | 2.0 |
-| 0.1 | 1.8 |
+| mindcv | mindspore |
+| :----: | :---------: |
+| main | master |
+| v0.4.0 | 2.3.0/2.3.1 |
+| 0.3.0 | 2.2.10 |
+| 0.2 | 2.0 |
+| 0.1 | 1.8 |


 ### Major Features
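
As a quick aside on the updated compatibility table: the pairing can be checked at runtime. This is a minimal sketch, assuming both packages expose `__version__` (mindspore does; it is assumed here for mindcv), with the `main`/`master` row omitted since those are branches rather than released versions:

```python
# Minimal sketch: verify the installed mindcv/mindspore pair against the table above.
import mindcv
import mindspore

# Compatibility rows copied from the table in this commit.
COMPATIBLE = {
    "0.4.0": {"2.3.0", "2.3.1"},  # row widened to 2.3.0/2.3.1 by this commit
    "0.3.0": {"2.2.10"},
    "0.2": {"2.0"},
    "0.1": {"1.8"},
}

if mindspore.__version__ not in COMPATIBLE.get(mindcv.__version__, set()):
    print(f"mindcv {mindcv.__version__} is not validated against "
          f"mindspore {mindspore.__version__}")
```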

benchmark_results.md

Lines changed: 102 additions & 101 deletions
Large diffs are not rendered by default.

configs/README.md

Lines changed: 14 additions & 14 deletions
@@ -31,24 +31,24 @@ Please follow the outline structure and **table format** shown in [densenet/READ

 #### Table Format

-<div align="center">

-| model | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe | download |
-| ----------- | --------- | --------- | ---------- | ---------- | ----- | ------- | --------- | --------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
-| densenet121 | 75.67 | 92.77 | 8.06 | 32 | 8 | 47,34 | O2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/densenet/densenet_121_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/densenet/densenet121-bf4ab27f-910v2.ckpt) |

-</div>
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
+| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | --------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
+| densenet121 | 8.06 | 8 | 32 | 224x224 | O2 | 300s | 47,34 | 5446.81 | 75.67 | 92.77 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/densenet/densenet_121_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/densenet/densenet121-bf4ab27f-910v2.ckpt) |
+
+

 Illustration:
-- Model: model name in lower case with _ seperator.
-- Top-1 and Top-5: Accuracy reported on the validatoin set of ImageNet-1K. Keep 2 digits after the decimal point.
-- Params (M): # of model parameters in millions (10^6). Keep **2 digits** after the decimal point
-- Batch Size: Training batch size
-- Cards: # of cards
-- Ms/step: Time used on training per step in ms
-- Jit_level: Jit level of mindspore context, which contains 3 levels: O0/O1/O2
-- Recipe: Training recipe/configuration linked to a yaml config file.
-- Download: url of the pretrained model weights
+- model name: model name in lower case with _ seperator.
+- top-1 and top-5: Accuracy reported on the validatoin set of ImageNet-1K. Keep 2 digits after the decimal point.
+- params(M): # of model parameters in millions (10^6). Keep **2 digits** after the decimal point
+- batch size: Training batch size
+- cards: # of cards
+- ms/step: Time used on training per step in ms
+- jit level: Jit level of mindspore context, which contains 3 levels: O0/O1/O2
+- recipe: Training recipe/configuration linked to a yaml config file.
+- weight: url of the pretrained model weights

 ### Model Checkpoint Format
 The checkpoint (i.e., model weight) name should follow this format: **{model_name}_{specification}-{sha256sum}.ckpt**, e.g., `poolformer_s12-5be5c4e4.ckpt`.
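
A note on the checkpoint format above: the `{sha256sum}` part can be generated mechanically. The sketch below assumes it is the first 8 hex digits of the checkpoint file's SHA-256 digest, which matches the length of the `5be5c4e4` suffix in the example name:

```python
# Sketch: build a checkpoint name in the {model_name}_{specification}-{sha256sum}.ckpt
# format. Assumption: the hash part is the first 8 hex digits of the file's SHA-256.
import hashlib
from pathlib import Path

def checkpoint_name(ckpt_path: str, model_name: str, specification: str) -> str:
    digest = hashlib.sha256(Path(ckpt_path).read_bytes()).hexdigest()
    return f"{model_name}_{specification}-{digest[:8]}.ckpt"

# e.g. checkpoint_name("out.ckpt", "poolformer", "s12") -> "poolformer_s12-5be5c4e4.ckpt"
```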

configs/bit/README.md

Lines changed: 25 additions & 24 deletions
@@ -2,6 +2,7 @@

 > [Big Transfer (BiT): General Visual Representation Learning](https://arxiv.org/abs/1912.11370)

+
 ## Introduction

 Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision.
@@ -12,30 +13,10 @@ is required. 3) Long pre-training time: Pretraining on a larger dataset requires
 BiT use GroupNorm combined with Weight Standardisation instead of BatchNorm. Since BatchNorm performs worse when the number of images on each accelerator is
 too low. 5) With BiT fine-tuning, good performance can be achieved even if there are only a few examples of each type on natural images.[[1, 2](#References)]

-
-## Results
-
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-*coming soon*
-
-- ascend 910 with graph mode
-
-
-<div align="center">
-
-
-| model | top-1 (%) | top-5 (%) | params(M) | batch size | cards | ms/step | jit_level | recipe | download |
-| ------------ | --------- | --------- | --------- | ---------- | ----- |---------| --------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
-| bit_resnet50 | 76.81 | 93.17 | 25.55 | 32 | 8 | 74.52 | O2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/bit/bit_resnet50_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/bit/BiT_resnet50-1e4795a4.ckpt) |
-
-
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.
+## Requirements
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |

 ## Quick Start

@@ -82,6 +63,26 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/bit/bit_resnet50_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```

+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+
+*coming soon*
+
+Experiments are tested on ascend 910 with mindspore 2.3.1 graph mode.
+
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
+| ------------ | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
+| bit_resnet50 | 25.55 | 8 | 32 | 224x224 | O2 | 146s | 74.52 | 3413.33 | 76.81 | 93.17 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/bit/bit_resnet50_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/bit/BiT_resnet50-1e4795a4.ckpt) |
+
+
+
+### Notes
+- top-1 and top-5: Accuracy reported on the validation set of ImageNet-1K.
+
 ## References

 <!--- Guideline: Citation format should follow GB/T 7714. -->
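
The new `img/s` column in these tables is, to a close approximation, derivable from the batch size, card count, and per-step time. A small sanity-check sketch, using the bit_resnet50 row above:

```python
# Sketch: img/s is approximately batch_size * cards / (ms/step / 1000).
# The reported values are measured, so the derived figure only roughly matches.
def imgs_per_second(batch_size: int, cards: int, ms_per_step: float) -> float:
    return batch_size * cards * 1000.0 / ms_per_step

print(f"{imgs_per_second(32, 8, 74.52):.2f}")  # ~3434 vs the reported 3413.33
```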

configs/cmt/README.md

Lines changed: 22 additions & 22 deletions
@@ -2,37 +2,20 @@

 > [CMT: Convolutional Neural Networks Meet Vision Transformers](https://arxiv.org/abs/2107.06263)

+
 ## Introduction

 CMT is a method to make full use of the advantages of CNN and transformers so that the model could capture long-range
 dependencies and extract local information. In addition, to reduce computation cost, this method use lightweight MHSA(multi-head self-attention)
 and depthwise convolution and pointwise convolution like MobileNet. By combing these parts, CMT could get a SOTA performance
 on ImageNet-1K dataset.

-
-## Results
-
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-*coming soon*
-
-- ascend 910 with graph mode
-
-<div align="center">
-
-
-| model | top-1 (%) | top-5 (%) | params(M) | batch size | cards | ms/step | jit_level | recipe | download |
-| --------- | --------- | --------- | --------- | ---------- | ----- |---------| --------- | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
-| cmt_small | 83.24 | 96.41 | 26.09 | 128 | 8 | 500.64 | O2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/cmt/cmt_small_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/cmt/cmt_small-6858ee22.ckpt) |
+## Requirements
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |


-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.
-
 ## Quick Start

 ### Preparation
@@ -78,6 +61,23 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/cmt/cmt_small_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```

+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+
+*coming soon*
+
+Experiments are tested on ascend 910 with mindspore 2.3.1 graph mode.
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
+| ---------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
+| cmt_small | 26.09 | 8 | 128 | 224x224 | O2 | 1268s | 500.64 | 2048.01 | 83.24 | 96.41 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/cmt/cmt_small_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/cmt/cmt_small-6858ee22.ckpt) |
+
+### Notes
+- top-1 and top-5: Accuracy reported on the validation set of ImageNet-1K.
+
 ## References

 <!--- Guideline: Citation format should follow GB/T 7714. -->
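
The new `graph compile` column (1268s for cmt_small above) records the one-off cost of graph construction and compilation. Below is a sketch of one plausible way to observe it, not necessarily how the table was produced: time the first forward step in graph mode against a steady-state step. The model name and input shape are taken from the cmt_small row above; it assumes `cmt_small` is registered in mindcv's model factory.

```python
# Sketch: estimate graph-compile overhead as first-call latency vs steady-state latency.
import time

import numpy as np
import mindspore as ms
from mindcv.models import create_model

ms.set_context(mode=ms.GRAPH_MODE)
net = create_model("cmt_small")  # model name taken from the table above
x = ms.Tensor(np.random.rand(128, 3, 224, 224), ms.float32)

t0 = time.time(); net(x); t1 = time.time()  # first call triggers graph compilation
net(x); t2 = time.time()                    # later calls reuse the compiled graph
print(f"compile + first step: {t1 - t0:.1f}s, steady-state step: {t2 - t1:.3f}s")
```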

configs/coat/README.md

Lines changed: 28 additions & 21 deletions
@@ -6,28 +6,11 @@

 Co-Scale Conv-Attentional Image Transformer (CoaT) is a Transformer-based image classifier equipped with co-scale and conv-attentional mechanisms. First, the co-scale mechanism maintains the integrity of Transformers' encoder branches at individual scales, while allowing representations learned at different scales to effectively communicate with each other. Second, the conv-attentional mechanism is designed by realizing a relative position embedding formulation in the factorized attention module with an efficient convolution-like implementation. CoaT empowers image Transformers with enriched multi-scale and contextual modeling capabilities.

-## Results
+## Requirements
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |

-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-*coming soon*
-
-
-- ascend 910 with graph mode
-
-<div align="center">
-
-
-| model | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe | Weight |
-| --------- | --------- | --------- | ---------- | ---------- | ----- |---------| --------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
-| coat_tiny | 79.67 | 94.88 | 5.50 | 32 | 8 | 254.95 | O2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/coat/coat_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/coat/coat_tiny-071cb792.ckpt) |
-
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.


 ## Quick Start
@@ -74,6 +57,30 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/coat/coat_lite_tiny_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```

+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+
+*coming soon*
+
+
+Experiments are tested on ascend 910 with mindspore 2.3.1 graph mode.
+
+
+
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
+| ---------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
+| coat_tiny | 5.50 | 8 | 32 | 224x224 | O2 | 543s | 254.95 | 1003.92 | 79.67 | 94.88 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/coat/coat_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/coat/coat_tiny-071cb792.ckpt) |
+
+
+
+### Notes
+- top-1 and top-5: Accuracy reported on the validation set of ImageNet-1K.
+
+
 ## References

 [1] Han D, Yun S, Heo B, et al. Rethinking channel dimensions for efficient model design[C]//Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition. 2021: 732-741.

configs/convit/README.md

Lines changed: 25 additions & 28 deletions
@@ -1,6 +1,7 @@
 # ConViT
 > [ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases](https://arxiv.org/abs/2103.10697)

+
 ## Introduction

 ConViT combines the strengths of convolutional architectures and Vision Transformers (ViTs).
@@ -19,36 +20,12 @@ while offering a much improved sample efficiency.[[1](#references)]
 <em>Figure 1. Architecture of ConViT [<a href="#references">1</a>] </em>
 </p>

-
-## Results
-
-Our reproduced model performance on ImageNet-1K is reported as follows.
-
-- ascend 910* with graph mode
-
-
-<div align="center">
-
-
-| model | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe | download |
-| ----------- | --------- | --------- | ---------- | ---------- | ----- | ------- | --------- | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
-| convit_tiny | 73.79 | 91.70 | 5.71 | 256 | 8 | 226.51 | O2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/convit/convit_tiny-1961717e-910v2.ckpt) |
-
-</div>
-
-- ascend 910 with graph mode
-
-<div align="center">
+## Requirements
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+| :-------: | :-----------: | :---------: | :-----------------: |
+| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |


-| model | top-1 (%) | top-5 (%) | params (M) | batch size | cards | ms/step | jit_level | recipe | download |
-| ----------- | --------- | --------- | ---------- | ---------- | ----- | ------- | --------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
-| convit_tiny | 73.66 | 91.72 | 5.71 | 256 | 8 | 231.62 | O2 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/convit/convit_tiny-e31023f2.ckpt) |
-
-</div>
-
-#### Notes
-- Top-1 and Top-5: Accuracy reported on the validation set of ImageNet-1K.

 ## Quick Start

@@ -93,6 +70,26 @@ To validate the accuracy of the trained model, you can use `validate.py` and par
 python validate.py -c configs/convit/convit_tiny_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
 ```

+## Performance
+
+Our reproduced model performance on ImageNet-1K is reported as follows.
+
+Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode.
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
+| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
+| convit_tiny | 5.71 | 8 | 256 | 224x224 | O2 | 153s | 226.51 | 9022.03 | 73.79 | 91.70 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download-mindspore.osinfra.cn/toolkits/mindcv/convit/convit_tiny-1961717e-910v2.ckpt) |
+
+Experiments are tested on ascend 910 with mindspore 2.3.1 graph mode.
+
+| model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
+| ----------- | --------- | ----- | ---------- | ---------- | --------- | ------------- | ------- | ------- | -------- | -------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
+| convit_tiny | 5.71 | 8 | 256 | 224x224 | O2 | 133s | 231.62 | 8827.59 | 73.66 | 91.72 | [yaml](https://github.com/mindspore-lab/mindcv/blob/main/configs/convit/convit_tiny_ascend.yaml) | [weights](https://download.mindspore.cn/toolkits/mindcv/convit/convit_tiny-e31023f2.ckpt) |
+
+
+### Notes
+- top-1 and top-5: Accuracy reported on the validation set of ImageNet-1K.
+
 ## References

 <!--- Guideline: Citation format should follow GB/T 7714. -->
