A Python SDK for the Datalab API - a document intelligence platform powered by marker and surya.
See the full documentation at https://documentation.datalab.to.
pip install datalab-python-sdk
Get your API key from https://www.datalab.to/app/keys:
export DATALAB_API_KEY="your_api_key_here"
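A script that depends on this variable may want to fail early with a clear message when it is missing. The sketch below is one way to do that; `resolve_api_key` is a hypothetical helper and not part of the SDK. It prefers an explicitly supplied key and otherwise falls back to `DATALAB_API_KEY`.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    environ: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper (not part of the SDK).

    Returns the explicit key if given, otherwise the value of
    DATALAB_API_KEY from the environment. Raises if neither is set.
    """
    key = (explicit or environ.get("DATALAB_API_KEY", "")).strip()
    if not key:
        raise RuntimeError(
            "No API key found: set DATALAB_API_KEY or pass a key explicitly."
        )
    return key
```

The returned value could then be handed to the client through the `api_key` argument, for example `DatalabClient(api_key=resolve_api_key())`.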
from datalab_sdk import DatalabClient
client = DatalabClient()  # reads DATALAB_API_KEY from the environment, or pass api_key="your_api_key_here"
# Convert PDF to markdown
result = client.convert("document.pdf")
print(result.markdown)
# OCR a document
ocr_result = client.ocr("document.pdf")
print(ocr_result.pages)  # OCR results for each page
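The same `convert` call can be applied to a whole folder. The following is a minimal sketch, not an SDK feature: `convert_directory` is a hypothetical helper that assumes only what the example above shows, namely that `client.convert(path)` returns an object with a `markdown` string attribute.

```python
from pathlib import Path
from typing import List


def convert_directory(client, src_dir: str, out_dir: str) -> List[Path]:
    """Hypothetical helper: convert every PDF in src_dir to a .md file in out_dir.

    `client` is any object with a convert(path) method whose result has a
    `markdown` attribute, such as the DatalabClient shown above.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    # Sort for a deterministic processing order.
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        result = client.convert(str(pdf))
        target = out / (pdf.stem + ".md")
        target.write_text(result.markdown, encoding="utf-8")
        written.append(target)
    return written
```

With a real client this might be called as `convert_directory(DatalabClient(), "pdfs", "markdown")`, where both directory names are placeholders. Each file is converted in turn, so one failing document stops the loop; wrapping the `convert` call in a `try`/`except` is a reasonable extension if partial results are acceptable.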
The SDK includes a command-line interface:
# Convert document to markdown
datalab convert document.pdf
# OCR with JSON output
datalab ocr document.pdf --output-format json
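The CLI can also be driven from a Python script. This sketch builds the same command lines shown above and runs them with `subprocess`; `build_datalab_command` and `run_datalab` are hypothetical helper names, and only the `convert` and `ocr` subcommands and the `--output-format` flag on `ocr` are taken from the examples above. Whether other subcommands accept that flag is not assumed here.

```python
import shutil
import subprocess
from typing import List, Optional


def build_datalab_command(subcommand: str, path: str,
                          output_format: Optional[str] = None) -> List[str]:
    """Hypothetical helper: assemble a `datalab` CLI invocation as an argv list."""
    cmd = ["datalab", subcommand, path]
    if output_format is not None:
        cmd += ["--output-format", output_format]
    return cmd


def run_datalab(cmd: List[str]) -> int:
    """Run the command and return its exit code; fail clearly if the CLI is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(
            "'datalab' not found on PATH; install it with: pip install datalab-python-sdk"
        )
    return subprocess.run(cmd, check=False).returncode
```

For example, `run_datalab(build_datalab_command("ocr", "document.pdf", output_format="json"))` corresponds to the second command above. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.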
MIT License