A Docker image built through Github Actions with Git commit version tag
I found that Marker's Docker image is difficult to find. The code on GitHub does not provide a pre-built Docker image.
After reviewing the following items
This project will use GitHub Actions and Docker Hub to build and publish images, aiming to keep the process as clean as possible without custom configuration files.
The images of this project will be published to Docker Hub under the repository xiaoyao9184/marker.
Since this project references the Marker project via a submodule, it cannot monitor push events on the Marker project, and therefore cannot automatically create an image for every commit. A good solution is to manually trigger the action and tag it with the commit id. For more details, see this article set-dynamic-parameters-github-workflows-en.
The default image name format is ${DOCKERHUB_USERNAME}/marker
.
The tag uses the input parameter commit_id
,
which can be either a branch name or a commit id,
when manually triggering the docker-image-tag-commit job.
if the job is triggered by a submodule update push,
the default branch name master
will be used instead of the commit_id
parameter.
This job will also use the shortened commit id as the tag.
If the job docker-image-tag-version is triggered with the marker_version
parameter set to the PyPI Marker version number,
the Marker package published on PyPI will be installed for the build,
and marker_version
will be used as the tag.
Currently, only the linux/amd64
platform is supported.
The models of this project will be synced to HuggingFace under the collection xiaoyao9184/surya-and-marker.
The Docker image does not include model files. When running, the required models will be automatically downloaded.
If you need to run offline, you must pre-download the model files and enable offline mode. See cache/README.md for detailed instructions.
By default, the Docker container runs the Streamlit App, which comes from the original project.
However, this project also provides a Gradio App, a functional reimplementation of the Streamlit version. The Gradio App supports both a UI and API interface, and can even serve as an MCP server, so it is recommended as the preferred option.
The source code for the Gradio App is located in the gradio directory of this project. A demo of this project is also available and auto-synced on Hugging Face Spaces: xiaoyao9184/marker
To run the Gradio App, you can do so by modifying the Docker command. see the up.gradio
sub-directory in the docker directory for details.
You can fork this project and build your own image. You will need to provide the following variables: DOCKERHUB_USERNAME
, DOCKERHUB_TOKEN
, HF_USERNAME
, HF_TOKEN
.
See this for more details.