The official Python client for the Mathpix API. Process PDFs and images, and convert math/text content with the Mathpix API.
pip install mpxpy
You'll need a Mathpix API app_id and app_key to use this client. You can get both from the Mathpix Console.
Set your credentials by either:
- Using environment variables
- Passing them directly when initializing the client
MathpixClient will prioritize auth configs in the following order:
- Passed through arguments
- The ~/.mpx/config file
- ENV vars located in .env
- ENV vars located in local.env
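The lookup order above can be pictured as a first-match search across sources. The sketch below is not mpxpy's actual implementation: resolve_credentials and the sample dictionaries are hypothetical, each source is modelled as a plain dictionary, and it assumes both values are taken from the same source.

```python
from typing import Dict, Mapping, Optional, Sequence

REQUIRED_KEYS = ("MATHPIX_APP_ID", "MATHPIX_APP_KEY")


def resolve_credentials(sources: Sequence[Mapping[str, str]]) -> Optional[Dict[str, str]]:
    """Return credentials from the first source that defines both required keys.

    `sources` is ordered from highest to lowest priority, for example:
    explicit arguments, ~/.mpx/config, .env, local.env.
    """
    for source in sources:
        if all(source.get(key) for key in REQUIRED_KEYS):
            return {key: source[key] for key in REQUIRED_KEYS}
    return None  # no source provided a complete pair


# Hypothetical contents of each source, highest priority first.
arguments: Dict[str, str] = {}  # nothing passed to the constructor
config_file = {"MATHPIX_APP_ID": "id-from-config", "MATHPIX_APP_KEY": "key-from-config"}
dot_env = {"MATHPIX_APP_ID": "id-from-dotenv", "MATHPIX_APP_KEY": "key-from-dotenv"}

chosen = resolve_credentials([arguments, config_file, dot_env])
print(chosen["MATHPIX_APP_ID"])  # id-from-config
```

Because the arguments source is empty here, the search falls through to the config file and never reaches the .env values.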
Create a config file at ~/.mpx/config or add ENV variables to .env or local.env files:
MATHPIX_APP_ID=your-app-id
MATHPIX_APP_KEY=your-app-key
MATHPIX_URL=https://api.mathpix.com # optional, defaults to this value
Then initialize the client:
from mpxpy.mathpix_client import MathpixClient
# Will use ~/.mpx/config or environment variables
client = MathpixClient()
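Because MathpixClient() with no arguments depends on credentials being discoverable, a quick pre-flight check can make a missing variable easier to spot. This sketch only inspects a mapping such as os.environ; missing_credentials is a hypothetical helper, not part of mpxpy, and it does not look at ~/.mpx/config.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("MATHPIX_APP_ID", "MATHPIX_APP_KEY")


def missing_credentials(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Check the real process environment.
missing = missing_credentials(os.environ)
if missing:
    print("Not set in the environment:", ", ".join(missing))
else:
    print("Both Mathpix variables are set in the environment.")
```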
You can also pass in your App ID and App Key when initializing the client:
from mpxpy.mathpix_client import MathpixClient
client = MathpixClient(
    app_id="your-app-id",
    app_key="your-app-key"
    # Optional "api_url" argument sets the base URL. This can be useful for development with on-premise deployments
)
from mpxpy.mathpix_client import MathpixClient
client = MathpixClient(
    app_id="your-app-id",
    app_key="your-app-key"
)
# Process a PDF file with multiple conversion formats and options
pdf = client.pdf_new(
    url="http://cs229.stanford.edu/notes2020spring/cs229-notes1.pdf",
    convert_to_docx=True,
    convert_to_md=True,
)
# Wait for processing to complete. Optional timeout argument is 60 seconds by default.
pdf.wait_until_complete(timeout=30)
# Get the Markdown outputs
md_output_path = pdf.to_md_file(path='output/sample.md')
md_text = pdf.to_md_text() # is type str
print(md_text)
# Get the DOCX outputs
docx_output_path = pdf.to_docx_file(path='output/sample.docx')
docx_bytes = pdf.to_docx_bytes() # is type bytes
# Get the JSON outputs
lines_json_output_path = pdf.to_lines_json_file(path='output/sample.lines.json')
lines_json = pdf.to_lines_json() # parses JSON into type Dict
- auth: An Auth instance with Mathpix credentials.
- pdf_id: The unique identifier for this PDF.
- file_path: Path to a local PDF file.
- url: URL of a remote PDF file.
- convert_to_docx: Optional boolean to automatically convert your result to docx.
- convert_to_md: Optional boolean to automatically convert your result to md.
- convert_to_mmd: Optional boolean to automatically convert your result to mmd.
- convert_to_tex_zip: Optional boolean to automatically convert your result to tex.zip.
- convert_to_html: Optional boolean to automatically convert your result to html.
- convert_to_pdf: Optional boolean to automatically convert your result to pdf.

- wait_until_complete: Wait for the PDF processing and optional conversions to complete.
- pdf_status: Get the current status of the PDF processing.
- pdf_conversion_status: Get the current status of the PDF conversions.
- to_docx_file: Save the processed PDF result to a DOCX file at a local path.
- to_docx_bytes: Get the processed PDF result as DOCX bytes.
- to_md_file: Save the processed PDF result to a Markdown file at a local path.
- to_md_text: Get the processed PDF result as a Markdown string.
- to_mmd_file: Save the processed PDF result to a Mathpix Markdown file at a local path.
- to_mmd_text: Get the processed PDF result as a Mathpix Markdown string.
- to_tex_zip_file: Save the processed PDF result to a tex.zip file at a local path.
- to_tex_zip_bytes: Get the processed PDF result in tex.zip format as bytes.
- to_html_file: Save the processed PDF result to an HTML file at a local path.
- to_html_bytes: Get the processed PDF result in HTML format as bytes.
- to_pdf_file: Save the processed PDF result to a PDF file at a local path.
- to_pdf_bytes: Get the processed PDF result in PDF format as bytes.
- to_lines_json_file: Save the processed PDF line-by-line result to a JSON file at a local path.
- to_lines_json: Get the processed PDF result in JSON format.
- to_lines_mmd_json_file: Save the processed PDF line-by-line result, including Mathpix Markdown, to a JSON file at a local path.
- to_lines_mmd_json: Get the processed PDF result in JSON format with text in Mathpix Markdown.
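The file-saving methods above follow a regular to_<format>_file naming pattern, so a caller can pick one by format name. The following is a small sketch of that idea rather than anything mpxpy provides: save_as and FakePdf are hypothetical (the stand-in lets the example run without credentials), and the assumption that each method accepts a path keyword and returns the saved path is inferred from the usage example earlier.

```python
from typing import Any

# File-saving method names, copied from the method list above.
FILE_METHODS = {
    "docx": "to_docx_file",
    "md": "to_md_file",
    "mmd": "to_mmd_file",
    "tex.zip": "to_tex_zip_file",
    "html": "to_html_file",
    "pdf": "to_pdf_file",
    "lines.json": "to_lines_json_file",
    "lines.mmd.json": "to_lines_mmd_json_file",
}


def save_as(pdf: Any, fmt: str, path: str) -> Any:
    """Call the file-saving method that matches `fmt` on a Pdf-like object."""
    if fmt not in FILE_METHODS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    method = getattr(pdf, FILE_METHODS[fmt])
    return method(path=path)


class FakePdf:
    """Hypothetical stand-in that returns the path instead of writing a file."""

    def to_md_file(self, path: str) -> str:
        return path


print(save_as(FakePdf(), "md", "output/sample.md"))  # output/sample.md
```

The format keys in FILE_METHODS are labels chosen for this sketch; only the method names come from the list above.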
from mpxpy.mathpix_client import MathpixClient
client = MathpixClient(
    app_id="your-app-id",
    app_key="your-app-key"
)
# Process an image file
image = client.image_new(
    url="https://mathpix-ocr-examples.s3.amazonaws.com/cases_hw.jpg"
)
# Get the Mathpix Markdown (MMD) representation
mmd = image.mmd()
print(mmd)
# Get line-by-line OCR data
lines = image.lines_json()
print(lines)
- auth: An Auth instance with Mathpix credentials.
- file_path: Path to a local image file, if using a local file.
- url: URL of a remote image, if using a remote file.

- lines_json: Get line-by-line OCR data for the image.
- mmd: Get the Mathpix Markdown (MMD) representation of the image.
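To keep the OCR results from the two methods above, they can be written to disk with the standard library. This is a minimal sketch, assuming mmd() returns a string and lines_json() returns JSON-serialisable data (the return types are not stated here); save_image_results and FakeImage are hypothetical names, and the stand-in class avoids calling the API.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Tuple


def save_image_results(image: Any, out_dir: str, stem: str) -> Tuple[Path, Path]:
    """Write an Image-like object's MMD and line data to <stem>.mmd and <stem>.lines.json."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    mmd_path = target / f"{stem}.mmd"
    lines_path = target / f"{stem}.lines.json"
    mmd_path.write_text(image.mmd(), encoding="utf-8")
    lines_path.write_text(json.dumps(image.lines_json(), indent=2), encoding="utf-8")
    return mmd_path, lines_path


class FakeImage:
    """Hypothetical stand-in so the sketch runs without calling the API."""

    def mmd(self) -> str:
        return "\\( x^2 \\)"

    def lines_json(self) -> list:
        return [{"text": "\\( x^2 \\)"}]


with tempfile.TemporaryDirectory() as tmp:
    mmd_file, lines_file = save_image_results(FakeImage(), tmp, "cases_hw")
    print(mmd_file.name, lines_file.name)  # cases_hw.mmd cases_hw.lines.json
```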
from mpxpy.mathpix_client import MathpixClient
client = MathpixClient(
    app_id="your-app-id",
    app_key="your-app-key"
)
# Similar to Pdf, the Conversion class takes a separate argument for each conversion format
conversion = client.conversion_new(
    mmd="\\frac{1}{2}",
    convert_to_docx=True,
    convert_to_md=True,
)
# Wait for conversion to complete
conversion.wait_until_complete(timeout=30)
# Get the Markdown outputs
md_output_path = conversion.to_md_file(path='output/sample.md')
md_text = conversion.to_md_text() # is of type str
# Get the DOCX outputs
docx_output_path = conversion.to_docx_file(path='output/sample.docx')
docx_bytes = conversion.to_docx_bytes() # is of type bytes
- auth: An Auth instance with Mathpix credentials.
- conversion_id: The unique identifier for this conversion.
- convert_to_docx: Optional boolean to automatically convert your result to docx.
- convert_to_md: Optional boolean to automatically convert your result to md.
- convert_to_tex_zip: Optional boolean to automatically convert your result to tex.zip.
- convert_to_html: Optional boolean to automatically convert your result to html.
- convert_to_pdf: Optional boolean to automatically convert your result to pdf.
- convert_to_latex_pdf: Optional boolean to automatically convert your result to pdf containing LaTeX.

- wait_until_complete: Wait for the conversion to complete.
- conversion_status: Get the current status of the conversion.
- to_docx_file: Save the processed conversion result to a DOCX file at a local path.
- to_docx_bytes: Get the processed conversion result as DOCX bytes.
- to_md_file: Save the processed conversion result to a Markdown file at a local path.
- to_md_text: Get the processed conversion result as a Markdown string.
- to_mmd_file: Save the processed conversion result to a Mathpix Markdown file at a local path.
- to_mmd_text: Get the processed conversion result as a Mathpix Markdown string.
- to_tex_zip_file: Save the processed conversion result to a tex.zip file at a local path.
- to_tex_zip_bytes: Get the processed conversion result in tex.zip format as bytes.
- to_html_file: Save the processed conversion result to an HTML file at a local path.
- to_html_bytes: Get the processed conversion result in HTML format as bytes.
- to_pdf_file: Save the processed conversion result to a PDF file at a local path.
- to_pdf_bytes: Get the processed conversion result in PDF format as bytes.
- to_latex_pdf_file: Save the processed conversion result to a PDF file containing LaTeX at a local path.
- to_latex_pdf_bytes: Get the processed conversion result in PDF format as bytes (with LaTeX).
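When the set of output formats is decided at runtime, the convert_to_* options listed above can be built from a list of names. A small sketch under stated assumptions: conversion_flags and the short format labels used as dictionary keys are hypothetical; only the convert_to_* option names come from the list above.

```python
from typing import Dict, Iterable

# Conversion options named in the list above, keyed by a short label.
CONVERSION_FLAGS = {
    "docx": "convert_to_docx",
    "md": "convert_to_md",
    "tex.zip": "convert_to_tex_zip",
    "html": "convert_to_html",
    "pdf": "convert_to_pdf",
    "latex_pdf": "convert_to_latex_pdf",
}


def conversion_flags(formats: Iterable[str]) -> Dict[str, bool]:
    """Translate format labels into keyword arguments set to True."""
    requested = list(formats)
    unknown = [fmt for fmt in requested if fmt not in CONVERSION_FLAGS]
    if unknown:
        raise ValueError(f"Unknown formats: {unknown}")
    return {CONVERSION_FLAGS[fmt]: True for fmt in requested}


flags = conversion_flags(["docx", "md"])
print(flags)  # {'convert_to_docx': True, 'convert_to_md': True}
```

The resulting dictionary could then be unpacked into the call, for example client.conversion_new(mmd="\\frac{1}{2}", **flags).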
The client provides detailed error information in the following classes:
- MathpixClientError
- AuthenticationError
- ValidationError
- FilesystemError
- ConversionIncompleteError
from mpxpy.mathpix_client import MathpixClient
from mpxpy.errors import MathpixClientError, ConversionIncompleteError
client = MathpixClient(app_id="your-app-id", app_key="your-app-key")
pdf = None
try:
    pdf = client.pdf_new(file_path="example.pdf", convert_to_docx=True)
except FileNotFoundError as e:
    print(f"File not found: {e}")
except MathpixClientError as e:
    print(f"File upload error: {e}")
# Only try to save the result if the upload succeeded
if pdf is not None:
    try:
        pdf.to_docx_file(path='output/path/example.docx')
    except ConversionIncompleteError:
        print('Conversions are not complete')
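One way to react to an incomplete conversion is to retry the export a few times before giving up. The sketch below illustrates that pattern and is not part of mpxpy: export_with_retry, NotReadyYet, and flaky_export are hypothetical, and NotReadyYet stands in for ConversionIncompleteError so the example runs without the library.

```python
import time
from typing import Any, Callable, Type


def export_with_retry(
    export: Callable[[], Any],
    incomplete_error: Type[Exception],
    attempts: int = 3,
    delay_seconds: float = 2.0,
) -> Any:
    """Call `export`, retrying while it raises `incomplete_error`."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return export()
        except incomplete_error:
            if attempt == attempts:
                raise  # give up and surface the last error
            time.sleep(delay_seconds)


class NotReadyYet(Exception):
    """Hypothetical stand-in for ConversionIncompleteError."""


calls = {"count": 0}


def flaky_export() -> str:
    # Fails twice, then succeeds, to imitate a conversion that finishes later.
    calls["count"] += 1
    if calls["count"] < 3:
        raise NotReadyYet()
    return "output/example.docx"


print(export_with_retry(flaky_export, NotReadyYet, delay_seconds=0))  # output/example.docx
```

With mpxpy this might be called as export_with_retry(lambda: pdf.to_docx_file(path='output/example.docx'), ConversionIncompleteError), although calling wait_until_complete first is usually the simpler option.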
# Clone the repository
git clone git@github.com:Mathpix/mpxpy.git
cd mpxpy
# Install in development mode
pip install -e .
# Or install using the requirements.txt file
pip install -r requirements.txt
# Install test dependencies
pip install -e ".[dev]"
# Or install using the requirements.txt file
pip install -r requirements.txt
# Run tests
pytest