How to improve results of KIE SER and RE model #13580
Replies: 5 comments 9 replies
-
What do you think could be the problem here, and how can I improve model accuracy? Any suggestions or advice would be very helpful. Thank you in advance.
-
The most direct way is to increase the size of the training dataset.
-
I still do not understand why the SER model outputs Chinese characters even though I have already set rec_model_dir to the model that I fine-tuned for German (using PaddleOCR's latin_dict.txt). The weights for the detection model work, though. When I add the variables kie_det_model_dir and kie_rec_model_dir to the config file, the output is just the input image with no predictions at all.

I tried both inference and prediction.

Inference:
python tools/infer_kie_token_ser.py -c ./configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh_udml.yml -o Architecture.Backbone.checkpoints=./output/ser_vi_layoutxlm_xfund_zh_udml/best_accuracy Global.infer_img="./dataset/v5_dataset/detection/v5_img_det_eval/Impfdossier_geschwaerzt_(61)_page_1.jpeg" Global.kie_det_model_dir=./inference/det/det_resnet50_v6_1000_infer Global.kie_rec_model_dir=./inference/rec/latin_ppocrv3_rec_1000_infer

Prediction:
python ./ppstructure/kie/predict_kie_token_ser.py --kie_algorithm=LayoutXLM --ser_model_dir=./inference/ser/ser_vi_layoutxlm_xfund_zh_udml_exp1_bestacc_infer/Teacher --ser_dict_path=./dataset/kie_2classes/class_list_re.txt --det_model_dir=./inference/det/det_resnet50_v6_1000_infer --rec_model_dir=./inference/rec/latin_ppocrv3_rec_1000_infer --vis_font_path="./doc/fonts/german.ttf" --use_gpu=True --rec_char_dict_path="./ppocr/utils/dict/latin_dict.txt" --image_dir="./dataset/v5_dataset/detection/v5_img_det_eval/Impfdossier_geschwaerzt_(61)_page_1.jpeg"

Also, the results of inference and prediction differ. Is this normal? Any ideas why? @SWHL @UserWangZz
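One way to narrow down the Chinese-character symptom is to check the recognized text against the dictionary that was supposed to produce it. The sketch below is only a diagnostic aid, not part of PaddleOCR: the helper names (`load_char_dict`, `is_cjk`, `audit_transcriptions`) are hypothetical, and it assumes the dictionary file holds one character per line and that the transcriptions have already been extracted from the output as plain strings.

```python
def load_char_dict(path):
    """Read a character dictionary, assuming one character per line."""
    with open(path, encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f if line.rstrip("\n")}


def is_cjk(ch):
    """True for characters in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def audit_transcriptions(transcriptions, charset):
    """List transcriptions containing characters that are not in `charset`.

    Whitespace is ignored, because a dictionary file may not list the space
    character explicitly.
    """
    report = []
    for text in transcriptions:
        unknown = sorted({c for c in text if c not in charset and not c.isspace()})
        if unknown:
            report.append({
                "text": text,
                "unknown": unknown,
                "cjk": [c for c in unknown if is_cjk(c)],
            })
    return report


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of a Latin dictionary file.
    charset = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.-äöüÄÖÜß")
    # Hypothetical transcriptions; in practice these would come from the model output.
    samples = ["Impfung 12.03.2021", "疫苗 12"]
    for entry in audit_transcriptions(samples, charset):
        print(entry)
```

If the audit reports CJK characters, the text cannot have been decoded with the Latin dictionary, which suggests that the recognizer used in that run was loaded with a different dictionary or model than the one passed on the command line.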
-
Did you find a good solution for your problem? I'm facing the same issue. @piarosebelledelapaz
-
Hello, I have been experimenting with PaddleOCR for months now, and I have fine-tuned the detection and recognition models on my custom dataset. However, the SER and RE models are not performing well. I would like to ask what could be done to improve the performance of the two. I have 130 samples to train these two models, and I trained them for 100 epochs.
I want to link vaccine names to their corresponding vaccine dates. The detection and text recognition models already perform well at finding the entities I want in the image. Below is a sample result:
My sample annotation for SER and RE:
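With only 130 samples, a few inconsistent links in the annotations could plausibly hurt RE noticeably, so it may be worth ruling that out first. The following is a minimal sketch of such a check, assuming XFUND-style entities where each entry has an `id`, a `label`, and a `linking` list of `[id, id]` pairs. The function name `check_links` and the default label pair `("question", "answer")` are assumptions for illustration; the actual label names should come from the class list used for training.

```python
def check_links(entities, expected=("question", "answer")):
    """Return a list of human-readable problems found in the linking fields.

    `entities` is the annotation list for one image. `expected` is the
    (source label, target label) pair every link is assumed to follow.
    """
    by_id = {e["id"]: e for e in entities}
    problems = []
    for e in entities:
        for pair in e.get("linking", []):
            if len(pair) != 2:
                problems.append(f"entity {e['id']}: malformed link {pair}")
                continue
            src, dst = pair
            if src not in by_id or dst not in by_id:
                problems.append(f"entity {e['id']}: link {pair} points to a missing id")
                continue
            labels = (by_id[src]["label"], by_id[dst]["label"])
            if labels != tuple(expected):
                problems.append(f"entity {e['id']}: link {pair} has labels {labels}")
    return problems


if __name__ == "__main__":
    # Hypothetical annotation: a vaccine name linked to its date.
    sample = [
        {"id": 1, "label": "question", "transcription": "Tetanus", "linking": [[1, 2]]},
        {"id": 2, "label": "answer", "transcription": "12.03.2021", "linking": [[1, 2]]},
    ]
    print(check_links(sample))
```

Running this over every annotated image and fixing whatever it reports costs little compared with labeling more data, although it does not replace a larger training set.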

Here are sample results from the SER and RE models, which are not that good:
The transcription in infer.txt also contains Chinese characters instead of Latin letters, even though I pass the inference model exported from my fine-tuned recognition model, together with --vis_font_path="./doc/fonts/german.ttf" and rec_char_dict_path="./ppocr/utils/dict/latin_dict.txt", as parameters.
