Commit 75a9260

Update table_rec_v2 interface (#3608)
* Update table_rec_v2 interface
* Update
1 parent f67e703 commit 75a9260

10 files changed: +173 -57 lines changed

docs/pipeline_usage/tutorials/ocr_pipelines/table_recognition.en.md

Lines changed: 7 additions & 1 deletion
@@ -870,7 +870,7 @@ In the above Python script, the following steps are executed:
 </tr>
 <td><code>use_table_cells_ocr_results</code></td>
 <td>Whether to enable Table-Cells-OCR mode. When disabled, the global OCR result is used to fill the HTML table; when enabled, OCR is run cell by cell and the results are filled into the HTML table (this increases processing time). The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>
@@ -1249,6 +1249,12 @@ Below are the API references for basic serving deployment and multi-language ser
 <td>Please refer to the description of the <code>text_rec_score_thresh</code> parameter of the pipeline object's <code>predict</code> method.</td>
 <td>No</td>
 </tr>
+<tr>
+<td><code>useTableCellsOcrResults</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_table_cells_ocr_results</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
 </tbody>
 </table>

docs/pipeline_usage/tutorials/ocr_pipelines/table_recognition.md

Lines changed: 7 additions & 1 deletion
@@ -815,7 +815,7 @@ for res in output:
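 </tr>
 <td><code>use_table_cells_ocr_results</code></td>
 <td>Whether to enable Table-Cells-OCR mode. When disabled, the global OCR result is used to fill the HTML table; when enabled, OCR is run cell by cell and the results are filled into the HTML table (this increases processing time). The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>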
@@ -1194,6 +1194,12 @@ for res in output:
 <td>Please refer to the description of the <code>text_rec_score_thresh</code> parameter of the pipeline object's <code>predict</code> method.</td>
 <td>No</td>
 </tr>
+<tr>
+<td><code>useTableCellsOcrResults</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_table_cells_ocr_results</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
 </tbody>
 </table>
 <ul>

docs/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.en.md

Lines changed: 21 additions & 3 deletions
@@ -893,23 +893,23 @@ In the above Python script, the following steps are executed:
 </tr>
 <td><code>use_table_cells_ocr_results</code></td>
 <td>Whether to enable Table-Cells-OCR mode. When disabled, the global OCR result is used to fill the HTML table; when enabled, OCR is run cell by cell and the results are filled into the HTML table (this increases processing time). The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>
 <td><code>False</code></td>
 </tr>
 <td><code>use_e2e_wired_table_rec_model</code></td>
 <td>Whether to enable end-to-end prediction mode for wired tables. When disabled, the predictions of the table cell detection model are used to fill the HTML table; when enabled, the cell predictions of the end-to-end table structure recognition model are used instead. The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>
 <td><code>False</code></td>
 </tr>
 <td><code>use_e2e_wireless_table_rec_model</code></td>
 <td>Whether to enable end-to-end prediction mode for wireless tables. When disabled, the predictions of the table cell detection model are used to fill the HTML table; when enabled, the cell predictions of the end-to-end table structure recognition model are used instead. The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>
@@ -1322,6 +1322,24 @@ Below are the API references for basic serving deployment and multi-language ser
 <td>Please refer to the description of the <code>text_rec_score_thresh</code> parameter of the pipeline object's <code>predict</code> method.</td>
 <td>No</td>
 </tr>
+<tr>
+<td><code>useTableCellsOcrResults</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_table_cells_ocr_results</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
+<tr>
+<td><code>useE2eWiredTableRecModel</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_e2e_wired_table_rec_model</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
+<tr>
+<td><code>useE2eWirelessTableRecModel</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_e2e_wireless_table_rec_model</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
 </tbody>
 </table>
 <p>Each element in <code>tableRecResults</code> is an <code>object</code> with the following properties:</p>
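The serving fields added in this file mirror the `predict` parameters, with the snake_case names rendered in lowerCamelCase. A minimal sketch of that naming correspondence follows. The helper `to_camel_case` is hypothetical (it is not part of PaddleX) and only illustrates how the three documented pairs line up.

```python
def to_camel_case(snake_name: str) -> str:
    """Convert a snake_case name to lowerCamelCase (hypothetical helper)."""
    head, *tail = snake_name.split("_")
    return head + "".join(part.capitalize() for part in tail)


# The three predict parameters documented in this commit.
PREDICT_PARAMS = [
    "use_table_cells_ocr_results",
    "use_e2e_wired_table_rec_model",
    "use_e2e_wireless_table_rec_model",
]

# Fragment of a request body; every flag is set to False here for illustration.
request_fields = {to_camel_case(name): False for name in PREDICT_PARAMS}
print(sorted(request_fields))
```

All three fields are marked as not required in the table, so a client could presumably omit any of them from the request body.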

docs/pipeline_usage/tutorials/ocr_pipelines/table_recognition_v2.md

Lines changed: 21 additions & 3 deletions
@@ -896,23 +896,23 @@ for res in output:
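 </tr>
 <td><code>use_table_cells_ocr_results</code></td>
 <td>Whether to enable Table-Cells-OCR mode. When disabled, the global OCR result is used to fill the HTML table; when enabled, OCR is run cell by cell and the results are filled into the HTML table (this increases processing time). The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>
 <td><code>False</code></td>
 </tr>
 <td><code>use_e2e_wired_table_rec_model</code></td>
 <td>Whether to enable end-to-end prediction mode for wired tables. When disabled, the predictions of the table cell detection model are used to fill the HTML table; when enabled, the cell predictions of the end-to-end table structure recognition model are used instead. The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>
 <td><code>False</code></td>
 </tr>
 <td><code>use_e2e_wireless_table_rec_model</code></td>
 <td>Whether to enable end-to-end prediction mode for wireless tables. When disabled, the predictions of the table cell detection model are used to fill the HTML table; when enabled, the cell predictions of the end-to-end table structure recognition model are used instead. The two modes perform differently in different scenarios, so choose according to your situation.</td>
-<td><code>bool|False</code></td>
+<td><code>bool</code></td>
 <td>
 <ul>
 <li><b>bool</b>:<code>True</code> or <code>False</code>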
@@ -1326,6 +1326,24 @@ for res in output:
 <td>Please refer to the description of the <code>text_rec_score_thresh</code> parameter of the pipeline object's <code>predict</code> method.</td>
 <td>No</td>
 </tr>
+<tr>
+<td><code>useTableCellsOcrResults</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_table_cells_ocr_results</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
+<tr>
+<td><code>useE2eWiredTableRecModel</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_e2e_wired_table_rec_model</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
+<tr>
+<td><code>useE2eWirelessTableRecModel</code></td>
+<td><code>boolean</code></td>
+<td>Please refer to the description of the <code>use_e2e_wireless_table_rec_model</code> parameter of the pipeline object's <code>predict</code> method.</td>
+<td>No</td>
+</tr>
 </tbody>
 </table>
 <p>Each element in <code>tableRecResults</code> is an <code>object</code> with the following properties:</p>

paddlex/inference/pipelines/table_recognition/pipeline.py

Lines changed: 12 additions & 6 deletions
@@ -217,7 +217,7 @@ def predict_doc_preprocessor_res(
             doc_preprocessor_res = {}
             doc_preprocessor_image = image_array
         return doc_preprocessor_res, doc_preprocessor_image
-
+
     def split_ocr_bboxes_by_table_cells(self, ori_img, cells_bboxes):
         """
         Splits OCR bounding boxes by table cells and retrieves text.
@@ -241,7 +241,7 @@ def split_ocr_bboxes_by_table_cells(self, ori_img, cells_bboxes):
             # Perform OCR on the defined region of the image and get the recognized text.
             rec_te = next(self.general_ocr_pipeline(ori_img[y1:y2, x1:x2, :]))
             # Concatenate the texts and append them to the texts_list.
-            texts_list.append(''.join(rec_te["rec_texts"]))
+            texts_list.append("".join(rec_te["rec_texts"]))
         # Return the list of recognized texts from each cell.
         return texts_list

@@ -270,9 +270,15 @@ def predict_single_table_recognition_res(
         """
         table_structure_pred = next(self.table_structure_model(image_array))
         if use_table_cells_ocr_results == True:
-            table_cells_result = list(map(lambda arr: arr.tolist(), table_structure_pred['bbox']))
-            table_cells_result = [[rect[0], rect[1], rect[4], rect[5]] for rect in table_cells_result]
-            cells_texts_list = self.split_ocr_bboxes_by_table_cells(image_array, table_cells_result)
+            table_cells_result = list(
+                map(lambda arr: arr.tolist(), table_structure_pred["bbox"])
+            )
+            table_cells_result = [
+                [rect[0], rect[1], rect[4], rect[5]] for rect in table_cells_result
+            ]
+            cells_texts_list = self.split_ocr_bboxes_by_table_cells(
+                image_array, table_cells_result
+            )
         else:
             cells_texts_list = []
         single_table_recognition_res = get_table_recognition_res(
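The branch above keeps elements 0, 1, 4 and 5 of each cell's coordinate list before cropping and running OCR per cell. A minimal sketch of that reduction follows, assuming each cell is an 8-value polygon `[x1, y1, x2, y2, x3, y3, x4, y4]` whose points run clockwise from the top-left corner. The diff does not state the layout, so this ordering is an assumption. The function names are hypothetical, and plain nested lists stand in for the model's arrays.

```python
from typing import List, Sequence


def polygon_to_rect(polygon: Sequence[float]) -> List[float]:
    """Keep the 1st and 3rd points of an 8-value polygon as [x1, y1, x2, y2].

    Assumes clockwise point order starting at the top-left corner.
    """
    return [polygon[0], polygon[1], polygon[4], polygon[5]]


def crop(image: List[List[int]], rect: Sequence[float]) -> List[List[int]]:
    """Crop a row-major nested-list image, analogous to ori_img[y1:y2, x1:x2]."""
    x1, y1, x2, y2 = (int(v) for v in rect)
    return [row[x1:x2] for row in image[y1:y2]]


# A 4x6 toy "image" whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
# Assumed order: top-left, top-right, bottom-right, bottom-left.
cell_polygon = [1, 1, 4, 1, 4, 3, 1, 3]

rect = polygon_to_rect(cell_polygon)
print(rect)               # [1, 1, 4, 3]
print(crop(image, rect))  # [[11, 12, 13], [21, 22, 23]]
```

If the polygon points came in a different order, indices 4 and 5 would no longer be the bottom-right corner and the crop would be wrong, which is why the ordering assumption matters.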
@@ -309,7 +315,7 @@ def predict(
         text_det_box_thresh: Optional[float] = None,
         text_det_unclip_ratio: Optional[float] = None,
         text_rec_score_thresh: Optional[float] = None,
-        use_table_cells_ocr_results: Optional[bool] = False,
+        use_table_cells_ocr_results: bool = False,
         cell_sort_by_y_projection: Optional[bool] = None,
         **kwargs,
     ) -> TableRecognitionResult:
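The signature change from `Optional[bool] = False` to `bool = False` narrows the annotation: `None` is no longer an advertised value for the flag. A small sketch with two stand-in functions shows the difference as seen through `typing` introspection. `predict_old` and `predict_new` are hypothetical and are not the pipeline's methods. Python does not enforce annotations at runtime, so the change mainly affects type checkers and documentation.

```python
from typing import Optional, get_args, get_type_hints


def predict_old(use_table_cells_ocr_results: Optional[bool] = False) -> bool:
    # Stand-in for the old signature: the annotation admits None.
    return bool(use_table_cells_ocr_results)


def predict_new(use_table_cells_ocr_results: bool = False) -> bool:
    # Stand-in for the new signature: only True or False is advertised.
    return use_table_cells_ocr_results


old_hint = get_type_hints(predict_old)["use_table_cells_ocr_results"]
new_hint = get_type_hints(predict_new)["use_table_cells_ocr_results"]

print(type(None) in get_args(old_hint))  # True
print(new_hint is bool)                  # True
```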
