> Notes:
>
> - If you [use MX Engine for Inference](#21-inference-with-mx-engine), Python 3.9 is required.
> - If `scikit_image` cannot be imported, set the environment variable `$LD_PRELOAD` as in the sketch below (see [here](https://github.com/opencv/opencv/issues/14884) for background). Change `path/to` to your own directory.
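A minimal sketch of such a command, assuming the import failure comes from the `libgomp` library bundled under `scikit_image.libs`; the library file name below is illustrative and differs between installations:

```bash
# Preload the libgomp shipped with scikit-image before importing it.
# The exact file name under scikit_image.libs varies; adjust it and `path/to` to your environment.
export LD_PRELOAD=path/to/scikit_image.libs/libgomp-d22c30c5.so.1.0.0:$LD_PRELOAD
```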
> More information about DBNet++ is coming soon. The only difference between _DBNet_ and _DBNet++_ is the _Adaptive Scale Fusion_ module, which is controlled by the `use_asf` parameter of the `neck` module in the yaml config file, as sketched below.
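For illustration, a sketch of the relevant part of the `neck` configuration; the neck name `DBFPN` and the surrounding keys are assumptions based on the provided DBNet configs rather than values taken from this excerpt, so check your config file for the exact keys:

```yaml
model:
  ...
  neck:
    name: DBFPN      # assumed neck used by the DBNet configs in this repo
    use_asf: True    # True enables Adaptive Scale Fusion (DBNet++); False keeps plain DBNet
  ...
```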
#### Notes
- Context: Training context denoted as {device}x{pieces}-{MS mode}, where the MindSpore mode can be G (graph mode) or F (pynative mode with ms function). For example, D910x8-G denotes training on 8 pieces of Ascend 910 NPU using graph mode.
- Note that the training time of DBNet is highly affected by data processing and varies on different machines.
- Context: Training context denoted as {device}x{pieces}-{MS version}-{MS mode}, where the MindSpore mode can be G (graph mode) or F (pynative mode with ms function). For example, D910x8-MS1.8-G denotes training on 8 pieces of Ascend 910 NPU using graph mode with MindSpore version 1.8.
- To reproduce the results in other contexts, please ensure the global batch size is the same.
- Both VGG and ResNet models are trained from scratch without any pre-training.
- The above models are trained on the MJSynth (MJ) and SynthText (ST) datasets. For more details on the data, please refer to the [Dataset Preparation](#312-dataset-preparation) section.
- **Evaluations are performed individually on each benchmark dataset, and Avg Accuracy is the average of accuracies across all sub-datasets.**
- For the PaddleOCR version of CRNN, the performance is reported for the trained model provided in their [GitHub repository](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_crnn_en.md).
#### 3.1.2 Dataset Preparation
Please download the LMDB dataset for training and evaluation from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (ref: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here)). There are several zip files:
- `data_lmdb_release.zip` contains the **entire** datasets, including `training.zip`, `validation.zip` and `evaluation.zip`.
- `validation.zip` is the union dataset for validation.
- `evaluation.zip` contains several benchmarking datasets.
Unzip the data; after preparation, the data structure should look like this:
```text
.
├── training
│   ├── MJ
│   │   ├── data.mdb
│   │   ├── lock.mdb
...
```
#### 3.1.3 Check YAML Config Files
Please check the following important args: `system.distribute`, `system.val_while_train`, `common.batch_size`, `train.ckpt_save_dir`, `train.dataset.dataset_root`, `train.dataset.data_dir`, `train.dataset.label_file`, `eval.ckpt_load_path`, `eval.dataset.dataset_root`, `eval.dataset.data_dir`, `eval.dataset.label_file`, `eval.loader.batch_size`. Explanations of these important args:
```yaml
system:
  ...
common:
  ...
  batch_size: &batch_size 64  # Batch size for training
...
train:
  ckpt_save_dir: './tmp_rec'  # The training result (including checkpoints, per-epoch performance and curves) saving directory
  dataset_sink_mode: False
  dataset:
    type: LMDBDataset
    ...
    # label_file: # Path of training label file, concatenated with `dataset_root` to be the complete path of training label file, not required when using LMDBDataset
  ...
```
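After adjusting the config, training is typically launched with the training script. The script path, `--config` flag, launcher, and config file below are assumptions for illustration rather than values taken from this excerpt, so substitute the ones from your checkout:

```bash
# Single-device training (illustrative config path)
python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml

# Distributed training on 8 devices (assumes OpenMPI is available and system.distribute is True)
mpirun --allow-run-as-root -n 8 python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
```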
The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory specified by `ckpt_save_dir`.
To evaluate the accuracy of the trained model, you can use `eval.py`. Please set the checkpoint path in the `ckpt_load_path` arg in the `eval` section of the yaml config file, set `system.distribute` to `False`, and then run the evaluation script.
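For example (a sketch only; the script path, `--config` flag, and config file are assumptions and should be replaced with the ones matching your own setup):

```bash
# Evaluate the checkpoint set in eval.ckpt_load_path (illustrative config path)
python tools/eval.py --config configs/rec/crnn/crnn_resnet34.yaml
```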
0 commit comments