A tool for recognizing text in a video file and storing it in a SQLite3 database.
- Grab a binary from the releases page, or build it yourself.
- Download the Tesseract OCR data
- Clone the repository
- Make a build directory and cd into it
mkdir -p build/release && cd build/release
- Install the dependencies
conan install ../.. --build=missing -s build_type=Release
Note that building on Apple platforms with VisionKit support requires a generator other than "Unix Makefiles".
You can select a generator by passing the -c tools.cmake.cmaketoolchain:generator=<generator> argument, for example:
conan install ../.. --build=missing -s build_type=Release -c tools.cmake.cmaketoolchain:generator=Ninja
- Build the project
conan build ../..
You will find the ocr-suite binary in the build directory.
Navigate into your build folder:
cd build/release
Run conan with a temporary lock file:
conan build -s:a compiler.cppstd=17 --build=missing -s build_type=Release --lockfile-partial --lockfile-out=tmp.lock .
Then compare the tmp.lock file with the conan.lock file in the project root.
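If you prefer a summary over a raw text diff, the comparison can be sketched in Python. This is only an illustration, not part of the project: it assumes both files are Conan 2 JSON lockfiles with top-level "requires", "build_requires" and "python_requires" lists, and the helper names load_lock and lock_diff are made up for this example.

```python
import json
from pathlib import Path

# Assumption: these are the top-level list sections of a Conan 2 lockfile.
SECTIONS = ("requires", "build_requires", "python_requires")


def load_lock(path):
    """Read a lockfile from disk and parse it as JSON."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def lock_diff(old, new):
    """Return the entries added and removed per section.

    Sections that are identical in both lockfiles are omitted,
    so an empty dict means the two lockfiles pin the same references.
    """
    changes = {}
    for section in SECTIONS:
        before = set(old.get(section, []))
        after = set(new.get(section, []))
        if before != after:
            changes[section] = {
                "removed": sorted(before - after),
                "added": sorted(after - before),
            }
    return changes
```

From the build folder you would call something like lock_diff(load_lock("../../conan.lock"), load_lock("tmp.lock")) and inspect the result before replacing conan.lock.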
NOTE: On Windows, building ffmpeg from source sometimes fails in the CI because of an MSYS2 version mismatch. I'm too lazy to investigate this any further, so at the moment I'm just uploading ffmpeg binaries from my dev machine.
You can use the ocs-watcher helper script to run the OCR Suite on all video files in a directory.
It automatically detects when a new file is added and runs the OCR Suite on it. It also re-runs the
OCR Suite periodically on all files in the directory, which keeps the database up to date for a video file
that is still being recorded.
Fill out your configuration file and run the watcher (see tools/ocs-watcher/tests/dummy-config.toml for an example):
uv run ocs-watcher -c config.toml
NOTE: On macOS, when using the VisionKit OCR provider, there is no point in spawning multiple threads: VisionKit processes all requests from all threads sequentially anyway.
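To make the watcher behaviour described above more concrete (process new files as they appear, and periodically re-process everything), here is a minimal polling sketch in Python. It is not the actual ocs-watcher implementation: the extension list, the function names and the intervals are assumptions for illustration, and the process callback stands in for whatever launches the OCR Suite on one file.

```python
import time
from pathlib import Path

# Assumption: which suffixes count as video files; the real tool may differ.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov"}


def find_videos(directory):
    """Return the set of video files directly inside the directory."""
    return {
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    }


def poll_once(directory, seen, process, rescan_all=False):
    """Process files not seen before, or every file when rescan_all is set.

    Returns the current set of video files, to be passed as `seen` next time.
    """
    current = find_videos(directory)
    targets = current if rescan_all else current - seen
    for path in sorted(targets):
        process(path)
    return current


def watch(directory, process, poll_interval=5.0, rescan_interval=300.0):
    """Poll forever: pick up new files quickly, re-run on all files periodically."""
    seen = set()
    last_rescan = time.monotonic()
    while True:
        now = time.monotonic()
        rescan = now - last_rescan >= rescan_interval
        seen = poll_once(directory, seen, process, rescan_all=rescan)
        if rescan:
            last_rescan = now
        time.sleep(poll_interval)
```

The periodic full pass is what covers a file that is still being recorded: it is not new on later polls, but its contents keep growing, so it needs to be processed again.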