How to improve results of KIE SER and RE model #13580

piarosebelledelapaz · 2024-08-02T10:33:37Z

piarosebelledelapaz
Aug 2, 2024

Hello, I have been experimenting with PaddleOCR for months now and have fine-tuned the detection and recognition models on my custom dataset. However, the SER and RE models are not performing well. What could be done to improve their performance? I have 130 samples to train these two models, and I trained them for 100 epochs.

I want to link each vaccine name to its corresponding vaccine dates. Detection and text recognition already perform well at finding the entities I want in the image; below is a sample result:

My sample annotation for SER and RE:

Here are sample results from the SER and RE models, which are not that good:

Also, the transcription in infer.txt contains Chinese characters rather than Latin letters, even though I use the inference model exported from my fine-tuned recognition model, with --vis_font_path="./doc/fonts/german.ttf" and rec_char_dict_path="./ppocr/utils/dict/latin_dict.txt" as part of the parameters.

piarosebelledelapaz · 2024-08-02T10:36:25Z

piarosebelledelapaz
Aug 2, 2024
Author

What do you think could be the problem here, and how can I improve model accuracy?

Any suggestions/advice would be very helpful! Thank you in advance.

0 replies

SWHL · 2024-08-04T02:41:13Z

SWHL
Aug 4, 2024
Maintainer

The most direct way is to increase the size of the training dataset.
You are using too little data.

7 replies

piarosebelledelapaz Aug 4, 2024
Author

Okay, thank you for the response.

Just one last question: do you know if the RE model can extract links for multiple categories, or would that be too complex for the model?

Example links:
vaccine name -> vaccine date
vaccine date -> vaccine provider

So technically these form transitive linking pairs. Is this possible?
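
For illustration only: the sketch below is not PaddleOCR code, and the ids and helper name are hypothetical. It assumes the model only ever predicts pairwise links, and shows how such pairs could be chained afterwards to get from a vaccine name to its provider.

```python
from collections import defaultdict


def follow_links(pairs, start):
    """Return every entity id reachable from `start` via directed (head, tail) links."""
    graph = defaultdict(list)
    for head, tail in pairs:
        graph[head].append(tail)

    reached, stack = [], [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in reached:
                reached.append(nxt)
                stack.append(nxt)
    return reached


# Hypothetical ids: 1 = vaccine name, 2 = vaccine date, 3 = vaccine provider
pairs = [(1, 2), (2, 3)]
chain = follow_links(pairs, 1)
print(chain)  # [2, 3]
```

Under this assumption the "transitive" part is plain post-processing, so the open question is only whether the model can learn both pair types at once.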

SWHL Aug 4, 2024
Maintainer

To be honest, I haven't really trained the RE model here, because I think its generalization is rather poor.
I recommend reading the relevant papers before starting your work, to see whether anyone has tried something similar.

If conditions permit, you can also try combining OCR with an LLM and extracting the desired relationships with prompts.

I hope it can help you.
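
As a rough sketch of the prompt-based idea (an assumption about how one might do it, not a tested recipe): OCR results can be serialized into a text prompt that asks an LLM for entity pairs. The function name, prompt wording, and relation names are hypothetical, and the call to an actual LLM is deliberately left out.

```python
def build_relation_prompt(ocr_items, relations):
    """Build a plain-text prompt asking an LLM to pair OCR'd text boxes."""
    lines = [
        "The following text boxes were read from a scanned document.",
        "Each entry has an id, the recognized text, and the box's top-left corner.",
        "",
    ]
    for item in ocr_items:
        x, y = item["points"][0]  # first point of the box polygon
        lines.append(f'{item["id"]}: "{item["transcription"]}" at ({x}, {y})')
    lines.append("")
    lines.append("Return a JSON list of [head_id, tail_id] pairs for these relations:")
    for head, tail in relations:
        lines.append(f"- {head} -> {tail}")
    return "\n".join(lines)


# Items shaped like OCR output: id, transcription, polygon points
ocr_items = [
    {"id": 1, "transcription": "11.10.96", "points": [[723, 355], [903, 346]]},
    {"id": 2, "transcription": "Poloral Berna trivalent", "points": [[406, 383], [672, 363]]},
]
prompt = build_relation_prompt(ocr_items, [("vaccine name", "vaccine date")])
print(prompt)
```

Including the box coordinates gives the model some layout information, which matters when names and dates are matched by row.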

Related docs:

docs

piarosebelledelapaz Aug 4, 2024
Author

Thank you for your honest feedback and suggestions.

Another issue I mentioned above: the transcription in infer.txt contains Chinese characters rather than Latin letters, even though I use the inference model exported from my fine-tuned recognition model, which was trained on Latin characters using the latin_dict.txt provided by PaddleOCR, with --vis_font_path="./doc/fonts/german.ttf" and rec_char_dict_path="./ppocr/utils/dict/latin_dict.txt" as part of the parameters.

Do you know why this happens?

SWHL Aug 4, 2024
Maintainer

I guess something must be missing; you can debug it in more detail.

piarosebelledelapaz Aug 4, 2024
Author

hmm okie will do! thx again for your quick responses

I will not close this discussion yet, in case others want to share advice or suggestions on fine-tuning the SER and RE models on a custom dataset.

UserWangZz · 2024-08-05T02:36:47Z

UserWangZz
Aug 5, 2024
Collaborator

First of all, PPOCRLabel cannot label RE tasks. Please refer to the documentation for how RE tasks are labeled.
From the annotation file you provided, it seems that there are only annotations for the SER task, which would also explain why the experiment did not perform well on the RE task.
Therefore, you need to find a way to add the linking field of the associated entities to the annotation file. For the specific format, please refer to the documentation.

1 reply

piarosebelledelapaz Aug 5, 2024
Author

Thanks for your response. I labeled the RE and SER dataset myself according to the documentation. I assigned ID values to the entities, along with the corresponding linking values. An example of my annotation is below:

Impf_page_2.jpeg [{"transcription": "11.10.96", "points": [[723, 355], [903, 346], [905, 395], [725, 404]], "label": "vaccine_date", "id": 1, "linking": [[2, 1]]}, {"transcription": "Poloral Berna trivalent", "points": [[406, 383], [672, 363], [680, 465], [415, 483]], "label": "vaccine_name", "id": 2, "linking": [[2, 1],[2, 3],[2, 4]]}, {"transcription": "15.1.91", "points": [[724, 403], [913, 386], [919, 445], [729, 463]], "label": "vaccine_date", "id": 3, "linking": [[2, 3]]}, {"transcription": "27.3.91", "points": [[721, 443], [896, 418], [903, 468], [728, 493]], "label": "vaccine_date", "id": 4, "linking": [[2, 4]]}]

The same annotation file was used to train both the SER and RE models, and the results from training are the samples I provided above.
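
A quick consistency check on the `id` and `linking` fields can rule out labeling mistakes before retraining. This is a minimal sketch with a hypothetical helper name. It assumes each pair should reference existing ids, include the entity it is listed on, and appear on both linked entities. The last rule mirrors the sample above and is not a confirmed PaddleOCR requirement.

```python
import json


def check_linking(entities):
    """Return a list of human-readable problems in one image's entity list."""
    problems = []
    ids = [e["id"] for e in entities]
    if len(ids) != len(set(ids)):
        problems.append("duplicate ids")
    by_id = {e["id"]: e for e in entities}

    for e in entities:
        for pair in e.get("linking", []):
            head, tail = pair
            if head not in by_id or tail not in by_id:
                problems.append(f"id {e['id']}: pair {pair} references an unknown id")
                continue
            if e["id"] not in pair:
                problems.append(f"id {e['id']}: pair {pair} does not include the entity itself")
            # The entity on the other end should list the same pair (assumed rule).
            other = tail if head == e["id"] else head
            if pair not in by_id[other].get("linking", []):
                problems.append(f"id {e['id']}: pair {pair} is missing on id {other}")
    return problems


# JSON part of the label line above, with the box points omitted for brevity
sample = json.loads(
    '[{"transcription": "11.10.96", "label": "vaccine_date", "id": 1, "linking": [[2, 1]]},'
    ' {"transcription": "Poloral Berna trivalent", "label": "vaccine_name", "id": 2,'
    '  "linking": [[2, 1], [2, 3], [2, 4]]},'
    ' {"transcription": "15.1.91", "label": "vaccine_date", "id": 3, "linking": [[2, 3]]},'
    ' {"transcription": "27.3.91", "label": "vaccine_date", "id": 4, "linking": [[2, 4]]}]'
)
problems = check_linking(sample)
print(problems)  # [] for this sample
```

An empty list means the sample is internally consistent under these assumed rules. It says nothing about whether the amount of data is sufficient.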

piarosebelledelapaz · 2024-09-04T09:15:58Z

piarosebelledelapaz
Sep 4, 2024
Author

I still do not understand why the SER model outputs Chinese characters even though I have already set rec_model_dir to the model that was fine-tuned for German (using PaddleOCR's latin_dict.txt). The weights for the detection model work, though. When I add the variables kie_det_model_dir and kie_rec_model_dir to the config file, the output is just the image itself with no predictions at all.

I tried performing inference and prediction:

python tools/infer_kie_token_ser.py -c ./configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh_udml.yml -o Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_xfund_zh_udml/best_accuracy Global.infer_img="./dataset/v5_dataset/detection/v5_img_det_eval/Impfdossier_geschwaerzt_(61)_page_1.jpeg" Global.kie_det_model_dir=./inference/det/det_resnet50_v6_1000_infer Global.kie_rec_model_dir=./inference/rec/latin_ppocrv3_rec_1000_infer

python ./ppstructure/kie/predict_kie_token_ser.py --kie_algorithm=LayoutXLM --ser_model_dir=./inference/ser/ser_vi_layoutxlm_xfund_zh_udml_exp1_bestacc_infer/Teacher --ser_dict_path=./dataset/kie_2classes/class_list_re.txt --det_model_dir=./inference/det/det_resnet50_v6_1000_infer --rec_model_dir=./inference/rec/latin_ppocrv3_rec_1000_infer --vis_font_path="./doc/fonts/german.ttf" --use_gpu=True --rec_char_dict_path="./ppocr/utils/dict/latin_dict.txt" --image_dir="./dataset/v5_dataset/detection/v5_img_det_eval/Impfdossier_geschwaerzt_(61)_page_1.jpeg"

Also, the results of inference and prediction differ. Is this normal?

Any ideas why? @SWHL @UserWangZz

1 reply

UserWangZz Sep 5, 2024
Collaborator

First of all, we do not recommend continuing to use this method for KIE tasks. It is an outdated algorithm; the more mainstream approach today is to use multimodal large models to extract key information.
Regarding your question, my best guess is that you need to check whether the detection and recognition models are loaded correctly, and debug to verify the model output. Finally, regarding the different results from inference and prediction, can you provide a simple image example?
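
One way to start that debugging, sketched under assumptions: compare the transcriptions against the recognition dictionary. A recognizer should in principle only emit characters from its dictionary, so characters outside latin_dict.txt would suggest a different recognition model was loaded. The helper names are hypothetical, and the character set below is a small stand-in for the real dictionary file.

```python
def contains_cjk(text):
    """True if any character lies in the main CJK Unified Ideographs blocks."""
    return any(
        "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf" for ch in text
    )


def out_of_dict(text, charset):
    """Sorted non-space characters in `text` that are absent from `charset`."""
    return sorted({ch for ch in text if ch not in charset and not ch.isspace()})


# Stand-in for the characters listed in latin_dict.txt (assumption, not the real file)
charset = set("0123456789.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

for transcription in ["15.1.91", "Poloral Berna trivalent", "15.1.91 疫"]:
    print(transcription, contains_cjk(transcription), out_of_dict(transcription, charset))
```

Running the same check on the output of both scripts would also show whether inference and prediction disagree at the recognition step or only later in the SER model.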

aymennasri · 2025-07-27T18:49:31Z

aymennasri
Jul 27, 2025

Did you find a good solution for your problem? I'm facing the same thing @piarosebelledelapaz

0 replies
