Thoughts on C++ version of PaddleOCR #9336
Closed
timminator
started this conversation in
Ideas
Replies: 2 comments
-
Nice work! I guess if the development of PaddleOCR is in Python (is it?) and there's so little speed difference, staying with the Python version seems good to me.
-
Okay.
-
@niksedk: Your message last week with the link to a C++ version of PaddleOCR resulted in me spending the whole week on this.
I ran into a few issues, but I think I have it all worked out now.
The standard version, when compiled from source, does not log anything until all images have been processed, and only then prints the results.
I did not like this, so I modified it to print each result right away. The output also looked quite different from that of the Python version, so I modified that as well to be pretty close to the Python version's output.
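To illustrate the difference between printing everything at the end and printing each result as soon as it is ready, here is a minimal Python sketch of the streaming pattern. It is not the code from the fork: `recognize` is a hypothetical stand-in for whatever performs OCR on a single image, and the tab-separated output format is an assumption for illustration only.

```python
import sys
from typing import Callable, Iterable, Iterator, TextIO, Tuple


def stream_results(
    image_paths: Iterable[str],
    recognize: Callable[[str], str],  # hypothetical per-image OCR function
) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) one image at a time instead of collecting a full list."""
    for path in image_paths:
        yield path, recognize(path)


def print_streaming(
    image_paths: Iterable[str],
    recognize: Callable[[str], str],
    out: TextIO = sys.stdout,
) -> int:
    """Write each result as soon as it is available and return the number written."""
    count = 0
    for path, text in stream_results(image_paths, recognize):
        out.write(f"{path}\t{text}\n")
        out.flush()  # make the line visible to a calling process immediately
        count += 1
    return count
```

The explicit `flush()` matters when another program reads the output through a pipe, since buffered output would otherwise still arrive in one block at the end.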
I've run some tests on my main PC with 250 subtitles again. Performance-wise, it is not much better than the standalone I provided before.
(Note: the batch-mode fix for the Python version was applied manually.)
But the download size and unpacked size of the CPU version, for example, are much smaller (53MB vs 181MB packed, 314MB vs 896MB unpacked). For the GPU version there is not much of a size difference.
The C++ version has quite a few quirks. For example, it has no language flag, so you need to specify the dictionary as well. And to achieve good performance on the CPU I had to run it with the flag "enable_mkldnn"...
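Since the C++ version needs an explicit dictionary instead of a language flag, a caller would have to translate a language choice into arguments itself. The following is a rough sketch of such a translation, not a tested integration. Only `enable_mkldnn` is taken from this post; the other flag names (`image_dir`, `rec_model_dir`, `rec_char_dict_path`), the dictionary paths, and the function name are assumptions that should be checked against the actual build.

```python
from typing import Dict, List

# Assumed mapping from a language code to a dictionary file.
# The paths are illustrative and must match the files shipped with your build.
LANG_TO_DICT: Dict[str, str] = {
    "en": "ppocr/utils/en_dict.txt",
    "ch": "ppocr/utils/ppocr_keys_v1.txt",
}


def build_cpp_args(
    lang: str,
    image_dir: str,
    rec_model_dir: str,
    use_cpu: bool = True,
) -> List[str]:
    """Build an argument list for the C++ binary from a Python-style language code."""
    if lang not in LANG_TO_DICT:
        raise ValueError(f"no dictionary configured for language {lang!r}")
    args = [
        f"--image_dir={image_dir}",                    # assumed flag name
        f"--rec_model_dir={rec_model_dir}",            # assumed flag name
        f"--rec_char_dict_path={LANG_TO_DICT[lang]}",  # assumed flag name
    ]
    if use_cpu:
        # The post reports needing this flag for good CPU performance.
        args.append("--enable_mkldnn=true")
    return args
```

The resulting list could be passed to `subprocess.run` together with the path to the compiled executable.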
I've also noticed very slight differences between the results of the Python and the C++ version. I opened an issue for that here, but I have not received a response so far.
I've also tried the C++ version with Subtitle Edit (that's where I noticed the errors in the error logging), and I think it would work without too many changes. Only keeping compatibility with the Python version would not be so easy, as the two use different command-line arguments.
First of all, I wanted to let you know that yes, there is also a working C++ version. Secondly, I wanted to ask what you think of this.
We could switch to using this version exclusively, or we could support it alongside the Python one and call it PaddleOCR C++...
I'm just asking for your thoughts on this.
You can check out the C++ version in my fork of PaddleOCR here.