docs/pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md
+15 -8 (15 additions & 8 deletions)
@@ -7,7 +7,7 @@ comments: true
7
7
## 1. Introduction to General Table Recognition Pipeline
8
8
Table recognition is a technology that automatically identifies and extracts table content and structure from documents or images. It is widely used in data entry, information retrieval, and document analysis. By using computer vision and machine learning algorithms, table recognition can convert complex table information into editable formats, facilitating further processing and analysis of data.
9
9
10
-
The General Table Recognition Pipeline is designed to solve table recognition tasks by identifying tables in images and outputting them in HTML format. This pipeline integrates the well-known SLANet and SLANet_plus table recognition models. Based on this pipeline, precise predictions of tables can be achieved, covering a wide range of applications in general, manufacturing, finance, transportation, and other fields. The pipeline also provides flexible service deployment options, supporting various hardware and programming languages for integration. Moreover, it offers custom development capabilities, allowing you to train and optimize models on your own dataset, which can then be seamlessly integrated.
10
+
The General Table Recognition Pipeline is designed to solve table recognition tasks by identifying tables in images and outputting them in HTML format. This pipeline integrates the well-known SLANet and SLANet_plus table structure recognition models. Based on this pipeline, precise predictions of tables can be achieved, covering a wide range of applications in general, manufacturing, finance, transportation, and other fields. The pipeline also provides flexible service deployment options, supporting various hardware and programming languages for integration. Moreover, it offers custom development capabilities, allowing you to train and optimize models on your own dataset, which can then be seamlessly integrated.
<b>The General Table Recognition Pipeline includes essential modules for table structure recognition, text detection, and text recognition, as well as optional modules for layout area detection, document image orientation classification, and text image correction.</b>
@@ -16,7 +16,7 @@ The General Table Recognition Pipeline is designed to solve table recognition ta
@@ -868,6 +868,13 @@ In the above Python script, the following steps are executed:
868
868
</td>
869
869
<td><code>None</code></td>
870
870
</tr>
871
+
<td><code>use_table_cells_ocr_results</code></td>
872
+
<td>Whether to enable Table-Cells-OCR mode. When disabled, the global OCR result is used to fill the HTML table; when enabled, OCR is run cell by cell and the results are used to fill the HTML table (this increases processing time). The two modes perform differently in different scenarios; choose according to your situation.</td>
873
+
<td><code>bool|False</code></td>
874
+
<td>
875
+
<ul>
876
+
<li><b>bool</b>:<code>True</code> or <code>False</code>
877
+
<td><code>False</code></td>
871
878
</table>
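The difference between the two modes described for `use_table_cells_ocr_results` can be pictured with a small sketch. This is not PaddleX's actual implementation: the function name `fill_cells_from_global_ocr`, the box layout, and the center-in-cell matching rule are all assumptions made for illustration. It shows only the "global OCR" idea, where text lines detected once on the whole image are assigned to cells afterwards.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def fill_cells_from_global_ocr(
    cell_boxes: List[Box],
    ocr_results: List[Tuple[Box, str]],
) -> List[str]:
    """Assign each globally detected text line to the first cell containing its center."""
    cell_texts: List[List[str]] = [[] for _ in cell_boxes]
    for (x0, y0, x1, y1), text in ocr_results:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        for i, (cx0, cy0, cx1, cy1) in enumerate(cell_boxes):
            if cx0 <= cx <= cx1 and cy0 <= cy <= cy1:
                cell_texts[i].append(text)
                break  # a text line is assigned to at most one cell
    return [" ".join(parts) for parts in cell_texts]


# Two side-by-side cells and two text lines found by one OCR pass over the image.
cells = [(0, 0, 100, 30), (100, 0, 200, 30)]
global_ocr = [((5, 5, 60, 25), "Name"), ((110, 5, 180, 25), "Price")]
filled = fill_cells_from_global_ocr(cells, global_ocr)
```

In a scheme like this, a text line that straddles a cell border lands in only one cell, which is one plausible reason the two modes behave differently across scenarios. Cell-by-cell OCR avoids the matching step by recognizing each cell crop separately, at the cost of one recognition pass per cell.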
872
879
873
880
(3) Process the prediction results. Each sample's prediction result is represented as a corresponding Result object, and supports operations such as printing, saving as an image, saving as an `xlsx` file, saving as an `HTML` file, and saving as a `json` file.
@@ -1390,12 +1397,12 @@ SubModules:
1390
1397
LayoutDetection:
1391
1398
module_name: layout_detection
1392
1399
model_name: PicoDet_layout_1x_table
1393
-
model_dir: null #替换为微调后的版面区域检测模型权重路径
1400
+
model_dir: null #Replace with fine-tuned model weight paths
1394
1401
1395
1402
TableStructureRecognition:
1396
1403
module_name: table_structure_recognition
1397
1404
model_name: SLANet_plus
1398
-
model_dir: null #替换为微调后的表格结构识别模型权重路径
1405
+
model_dir: null #Replace with fine-tuned model weight paths
1399
1406
1400
1407
SubPipelines:
1401
1408
DocPreprocessor:
@@ -1406,7 +1413,7 @@ SubPipelines:
1406
1413
DocOrientationClassify:
1407
1414
module_name: doc_text_orientation
1408
1415
model_name: PP-LCNet_x1_0_doc_ori
1409
-
model_dir: null #替换为微调后的文档图像方向分类模型权重路径
1416
+
model_dir: null #Replace with fine-tuned model weight paths
1410
1417
1411
1418
DocUnwarping:
1412
1419
module_name: image_unwarping
@@ -1422,16 +1429,16 @@ SubPipelines:
1422
1429
TextDetection:
1423
1430
module_name: text_detection
1424
1431
model_name: PP-OCRv4_server_det
1425
-
model_dir: null #替换为微调后的文本检测模型权重路径
1432
+
model_dir: null #Replace with fine-tuned model weight paths
1426
1433
limit_side_len: 960
1427
1434
limit_type: max
1428
1435
thresh: 0.3
1429
-
box_thresh: 0.6
1436
+
box_thresh: 0.4
1430
1437
unclip_ratio: 2.0
1431
1438
TextRecognition:
1432
1439
module_name: text_recognition
1433
1440
model_name: PP-OCRv4_server_rec
1434
-
model_dir: null #替换为微调后文本识别的模型权重路径
1441
+
model_dir: null #Replace with fine-tuned model weight paths
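The change of `box_thresh` from 0.6 to 0.4 in `TextDetection` lowers the confidence a candidate text box needs in order to be kept. The sketch below illustrates only that filtering effect; the helper `filter_boxes`, the candidate names, and the scores are invented for the example, and the exact comparison used inside the detector is an assumption.

```python
from typing import List, Tuple


def filter_boxes(candidates: List[Tuple[str, float]], box_thresh: float) -> List[str]:
    """Keep candidate text boxes whose confidence score is at least box_thresh."""
    return [name for name, score in candidates if score >= box_thresh]


# Made-up candidates: (label, detection score)
candidates = [("header", 0.92), ("faint_cell", 0.48), ("noise", 0.21)]

kept_old = filter_boxes(candidates, box_thresh=0.6)  # previous config value
kept_new = filter_boxes(candidates, box_thresh=0.4)  # new config value
```

With the lower threshold, the faint cell text survives while the low-score noise is still dropped. The trade-off is that a lower threshold can also admit more false positives on noisy images.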
docs/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.en.md
+45 -6 (45 additions & 6 deletions)
@@ -7,9 +7,7 @@ comments: true
7
7
## 1. Introduction to General Table Recognition v2 Pipeline
8
8
Table recognition is a technology that automatically identifies and extracts table content and its structure from documents or images. It is widely used in data entry, information retrieval, and document analysis. By using computer vision and machine learning algorithms, table recognition can convert complex table information into an editable format, making it easier for users to further process and analyze data.
9
9
10
-
The General Table Recognition v2 Pipeline(PP-TableMagic) is designed to solve table recognition tasks by identifying tables in images and outputting them in HTML format. Unlike the General Table Recognition Pipeline, this pipeline introduces two additional modules: table classification and table cell detection, which are linked with the table structure recognition module to complete the table recognition task. This pipeline can achieve accurate table predictions and is applicable in various fields such as general, manufacturing, finance, and transportation. It also provides flexible service deployment options, supporting multiple programming languages on various hardware. Additionally, it offers custom development capabilities, allowing you to train and fine-tune models on your own dataset, with seamless integration of the trained models.
11
-
12
-
<b>❗ The General Table Recognition v2 Pipeline is still being optimized and the final version will be released in the next version of PaddleX. In order to maintain the stability of use, you can use the General Table Recognition Pipeline for table processing first, and we will release a notice when the final version of v2 is open-sourced, so please stay tuned!</b>
10
+
The General Table Recognition v2 Pipeline (PP-TableMagic) is designed to solve table recognition tasks by identifying tables in images and outputting them in HTML format. Unlike the General Table Recognition Pipeline, this pipeline introduces two additional modules: table classification and table cell detection, which are linked with the table structure recognition module to complete the table recognition task. This pipeline can achieve accurate table predictions and is applicable in various fields such as general, manufacturing, finance, and transportation. It also provides flexible service deployment options, supporting multiple programming languages on various hardware. Additionally, it offers custom development capabilities, allowing you to train and fine-tune models on your own dataset, with seamless integration of the trained models. <b>In addition, the General Table Recognition v2 Pipeline supports end-to-end table structure recognition models (e.g., SLANet and SLANet_plus) and independent configuration of table recognition for wired and wireless tables, allowing developers to freely select and combine the best table recognition solutions.</b>
@@ -894,14 +892,55 @@ In the above Python script, the following steps are executed:
894
892
<td><code>None</code></td>
895
893
</tr>
896
894
<td><code>use_table_cells_ocr_results</code></td>
897
-
<td>Whether to enable Table-Cells-OCR mode, when not enabled, use global OCR result to fill to html table, when enabled, do OCR cell by cell and fill to html table. Both of them perform differently in different scenarios, please choose according to the actual situation.</td>
895
+
<td>Whether to enable Table-Cells-OCR mode. When disabled, the global OCR result is used to fill the HTML table; when enabled, OCR is run cell by cell and the results are used to fill the HTML table (this increases processing time). The two modes perform differently in different scenarios; choose according to your situation.</td>
896
+
<td><code>bool|False</code></td>
897
+
<td>
898
+
<ul>
899
+
<li><b>bool</b>:<code>True</code> or <code>False</code>
<td>Whether to enable the wired table end-to-end prediction mode. When disabled, the prediction results of the table cells detection model are used to fill the HTML table; when enabled, the cell predictions of the end-to-end table structure recognition model are used instead. The two modes perform differently in different scenarios; choose according to your situation.</td>
904
+
<td><code>bool|False</code></td>
905
+
<td>
906
+
<ul>
907
+
<li><b>bool</b>:<code>True</code> or <code>False</code>
<td>Whether to enable the wireless table end-to-end prediction mode. When disabled, the prediction results of the table cells detection model are used to fill the HTML table; when enabled, the cell predictions of the end-to-end table structure recognition model are used instead. The two modes perform differently in different scenarios; choose according to your situation.</td>
898
912
<td><code>bool|False</code></td>
899
913
<td>
900
914
<ul>
901
915
<li><b>bool</b>:<code>True</code> or <code>False</code>
902
916
<td><code>False</code></td>
903
917
</table>
904
918
919
+
<b>To use an end-to-end table structure recognition model, replace the corresponding table structure recognition model with the end-to-end model in the pipeline config file, then load the modified config file and set the corresponding `predict()` parameter.</b> For example, to use SLANet_plus for end-to-end recognition of wireless tables, set `model_name` to SLANet_plus under `WirelessTableStructureRecognition` in the config file (as shown below) and pass `use_e2e_wireless_table_rec_model=True` at prediction time. Nothing else needs to change. In this mode the wireless table cells detection model is not used; SLANet_plus performs end-to-end table recognition directly.
920
+
921
+
```yaml
922
+
SubModules:
923
+
WiredTableStructureRecognition:
924
+
module_name: table_structure_recognition
925
+
model_name: SLANeXt_wired
926
+
model_dir: null
927
+
928
+
WirelessTableStructureRecognition:
929
+
module_name: table_structure_recognition
930
+
model_name: SLANet_plus # Replace with the end-to-end table structure recognition model
931
+
model_dir: null
932
+
933
+
WiredTableCellsDetection:
934
+
module_name: table_cells_detection
935
+
model_name: RT-DETR-L_wired_table_cell_det
936
+
model_dir: null
937
+
938
+
WirelessTableCellsDetection:
939
+
module_name: table_cells_detection
940
+
model_name: RT-DETR-L_wireless_table_cell_det
941
+
model_dir: null
942
+
```
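The config edit above can also be expressed programmatically. The following is a minimal sketch that works on a plain dictionary mirroring the YAML structure. The helper names `use_e2e_model` and `e2e_predict_kwargs` are hypothetical, and the wired flag name produced by `e2e_predict_kwargs("wired")` is assumed by analogy with the wireless one.

```python
import copy
from typing import Any, Dict


def use_e2e_model(config: Dict[str, Any], table_kind: str, model_name: str) -> Dict[str, Any]:
    """Return a copy of config with the structure model for table_kind replaced."""
    if table_kind not in ("wired", "wireless"):
        raise ValueError(f"table_kind must be 'wired' or 'wireless', got {table_kind!r}")
    key = f"{table_kind.capitalize()}TableStructureRecognition"
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    updated["SubModules"][key]["model_name"] = model_name
    return updated


def e2e_predict_kwargs(table_kind: str) -> Dict[str, bool]:
    """Build the predict() flag that enables end-to-end mode for table_kind.

    The wireless flag name comes from the text above; the wired one is assumed
    to follow the same pattern.
    """
    return {f"use_e2e_{table_kind}_table_rec_model": True}


# A dictionary mirroring part of the YAML config; the wireless model name here
# is only a placeholder for whatever the config file currently contains.
config = {
    "SubModules": {
        "WiredTableStructureRecognition": {
            "module_name": "table_structure_recognition",
            "model_name": "SLANeXt_wired",
            "model_dir": None,
        },
        "WirelessTableStructureRecognition": {
            "module_name": "table_structure_recognition",
            "model_name": "placeholder_wireless_model",
            "model_dir": None,
        },
    }
}

e2e_config = use_e2e_model(config, "wireless", "SLANet_plus")
predict_kwargs = e2e_predict_kwargs("wireless")
```

With PaddleX installed, one would typically write the modified config back to a YAML file, load it as the pipeline config, and pass `**predict_kwargs` to `predict()`; those steps are omitted here so that the sketch runs without external dependencies.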
943
+
905
944
(3) Process the prediction results, where each sample's prediction result is represented as a corresponding Result object, and supports operations such as printing, saving as an image, saving as an `xlsx` file, saving as an `HTML` file, and saving as a `json` file: