External OCR Setup Help
GSM has its own OCR now, and a lot of care has gone into making it the best offering possible. But maybe you still want to use another system's OCR.
You still can, but you will have to deal with a whole host of issues: flooding the texthooker, poorly trimmed audio, inaccurate screenshot timing, etc.
Here are a few examples. It is unlikely that I will provide more support than this.
For LunaTranslator, all that's needed is to turn on its webserver (see more about config further down).
GSM relies on text events to locate voice lines, so it helps if each text event is closely aligned with the beginning of the voice line. That is usually the case with tools that hook into the game and pipe text to the clipboard or a websocket, like Agent or Textractor.
However, this is not the case with some OCR systems, such as Kamui. Their timing is less consistent than a texthooker's because the OCR has to be triggered to read the current frame before the text is sent to the clipboard. Fortunately, there are some settings in GSM that can help mitigate this issue.
Note: Make sure you have Clipboard set to ON in GSM settings and "Copy To Clipboard" enabled in Kamui.


This setting allows you to add extra space at the beginning of the audio to account for any delay in receiving the text event. You almost always want this to be negative. If the voice in your cards always starts in the middle of the line, the delay is larger than the offset covers, so make the offset more negative. Also, try to scan as soon as the text appears if you can.
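
To make the arithmetic concrete, here is a minimal sketch of how a negative beginning offset shifts the start of the clip. It is not GSM's actual code: the function name `audio_start` and its parameters are hypothetical, and it assumes times are measured in seconds from the start of the recording.

```python
def audio_start(text_event_time: float, beginning_offset: float) -> float:
    """Return where the clip should start, in seconds from the start of the recording.

    Hypothetical helper for illustration only, not part of GSM.
    """
    # A negative offset moves the start earlier than the text event,
    # which compensates for the time the OCR took to deliver the text.
    # Clamp at 0 so the clip never starts before the recording does.
    return max(0.0, text_event_time + beginning_offset)


# The OCR text arrived at 12.0 s, but the voice line began about 1.5 s earlier.
print(audio_start(12.0, -1.5))  # 10.5
```

If the card audio still starts mid-line with this value, the OCR delay is longer than 1.5 seconds in this example, and a more negative offset would be the next thing to try.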

This setting allows the result from Voice Activation to trim the beginning of the audio. It is off by default, but could help in the case where you need a large beginning offset.
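
The idea can be sketched as follows, assuming voice activity detection returns speech segments as `(start, end)` pairs in seconds relative to the start of the clip. The function `trim_beginning` and this segment format are assumptions made for illustration, not GSM's internals.

```python
from typing import List, Tuple


def trim_beginning(clip_start: float,
                   speech_segments: List[Tuple[float, float]]) -> float:
    """Return a new clip start that skips the leading non-speech audio.

    Hypothetical helper for illustration only, not part of GSM.
    """
    # With no detected speech there is nothing to anchor on, so keep the clip as is.
    if not speech_segments:
        return clip_start
    # Move the start forward to where the earliest speech segment begins.
    first_speech = min(start for start, _ in speech_segments)
    return clip_start + first_speech


# A large negative offset left 0.75 s of silence before the voice line.
print(trim_beginning(10.0, [(0.75, 2.5), (3.0, 4.0)]))  # 10.75
```

This is why the option pairs well with a large beginning offset: the offset guarantees the start of the line is captured, and the trim removes whatever extra lead-in that leaves behind.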

This is a workflow that can help with fine-tuning the resulting audio. I recommend using ocenaudio for this. I have a video explaining and demoing this feature here.
