
Files API Quickstart

Submit a real batch in five minutes. By the end of this guide you'll have submitted a single document, polled its status, downloaded its converted result, and submitted a two-item batch — using only your API key and curl (or Python / Node).

Prerequisites

  • An API key. Grab one from the Mathpix Console and export it as APP_KEY in your shell.
  • For s3://, gs://, or Azure Blob URLs, a registered data source for the bucket. Public https:// URLs work without any setup.
export APP_KEY="your-app-key"

1. Submit a single document

Use POST /files/v1/uri to submit one document by URL.

curl -X POST https://api.mathpix.com/files/v1/uri \
-H "app_key: $APP_KEY" \
-H 'Content-Type: application/json' \
--data '{
"source_uri": "https://cdn.mathpix.com/examples/cs229-notes1.pdf",
"conversion_formats": { "docx": true, "md": true }
}'
Response
{ "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d" }

Keep the returned file_id — it's how you'll check status and download results.
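If you'd rather script this step in Python, the same request can be sketched with only the standard library. The helper names below (build_submit_request, submit_document) are illustrative and not part of any Mathpix SDK; the endpoint, headers, and body fields are taken from the curl call above.

```python
import json
import urllib.request

API_BASE = "https://api.mathpix.com/files/v1"


def build_submit_request(app_key, source_uri, conversion_formats):
    """Build (but do not send) the POST /files/v1/uri request."""
    body = json.dumps({
        "source_uri": source_uri,
        "conversion_formats": conversion_formats,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/uri",
        data=body,
        headers={"app_key": app_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit_document(app_key, source_uri, conversion_formats):
    """Send the request and return the file_id from the JSON response."""
    req = build_submit_request(app_key, source_uri, conversion_formats)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["file_id"]
```

Calling submit_document(os.environ["APP_KEY"], "https://cdn.mathpix.com/examples/cs229-notes1.pdf", {"docx": True, "md": True}) should return a file_id like the one in the response above. Splitting request construction from sending makes it possible to inspect the request without touching the network.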

2. Check status

Poll GET /files/v1/{file_id} until status == "completed" (or "error"). The command below assumes you have exported the file_id from Step 1 as FILE_ID in your shell.

curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/$FILE_ID"

A typical document moves through these states:

// Just submitted
{ "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d", "status": "pending", "percent_done": 0 }
// Pages extracted, OCR in progress
{ "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d", "status": "split", "percent_done": 60.0 }
// Done: outputs available
{ "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d", "status": "completed", "percent_done": 100.0 }

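A polling loop for the status check might look like the sketch below. The 5-second interval and 120-poll cap are arbitrary assumptions, not documented limits, and the helper names are illustrative. Only "completed" and "error" are treated as terminal, as described in this step.

```python
import json
import time
import urllib.request

API_BASE = "https://api.mathpix.com/files/v1"
TERMINAL_STATES = {"completed", "error"}


def fetch_status(app_key, file_id):
    """GET /files/v1/{file_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/{file_id}", headers={"app_key": app_key}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_file(fetch, interval=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch() until it reports a terminal status or polls run out.

    interval and max_polls are illustrative defaults, not API limits.
    """
    for attempt in range(max_polls):
        info = fetch()
        if info.get("status") in TERMINAL_STATES:
            return info
        if attempt < max_polls - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"no terminal status after {max_polls} polls")
```

In practice you would pass something like lambda: fetch_status(app_key, file_id) as fetch. Taking fetch and sleep as parameters keeps the loop independent of the network, so it can be exercised with canned responses.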
3. Download the result

Once status is completed, request results by extension via GET /files/v1/{file_id}.{ext}.

# Mathpix Markdown — always available
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/$FILE_ID.mmd" \
-o result.mmd

# DOCX — only if you asked for it on submission
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/$FILE_ID.docx" \
-o result.docx
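The same downloads can be sketched in Python. The helper names are hypothetical; the URL pattern is the GET /files/v1/{file_id}.{ext} form described above, and the response is streamed to disk rather than read into memory.

```python
import shutil
import urllib.request

API_BASE = "https://api.mathpix.com/files/v1"


def result_url(file_id, ext):
    """Return the download URL for one output format, e.g. 'mmd' or 'docx'."""
    return f"{API_BASE}/{file_id}.{ext.lstrip('.')}"


def download_result(app_key, file_id, ext, dest_path):
    """Stream one converted output to dest_path and return that path."""
    req = urllib.request.Request(
        result_url(file_id, ext), headers={"app_key": app_key}
    )
    with urllib.request.urlopen(req, timeout=60) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

For example, download_result(app_key, file_id, "mmd", "result.mmd") mirrors the first curl command in this step.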

4. Submit many at once

For batches, use POST /files/v1/jobs instead, which accepts up to 200,000 files in a single request. Pass an array of source URIs plus job-wide conversion and OCR options, which apply to every file. Each file can carry an optional custom_id for your own correlation. job_id is optional, and the server generates one if you omit it. However, custom_id is a per-job identifier, so you must supply your own job_id whenever you use custom_id, as the example below does.

curl -X POST https://api.mathpix.com/files/v1/jobs \
-H "app_key: $APP_KEY" \
-H 'Content-Type: application/json' \
--data '{
"job_id": "quickstart-batch",
"files": [
{ "source_uri": "https://cdn.mathpix.com/examples/cs229-notes1.pdf", "custom_id": "cs229" },
{ "source_uri": "https://example.com/manual.pdf", "custom_id": "manual" }
],
"conversion_formats": { "docx": true, "md": true }
}'
Response
{
"job_id": "quickstart-batch",
"file_count": 2
}
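When the file list comes from your own data, it can help to assemble the request body in code and check the rules from this step before sending it. The sketch below is one way to do that. The function name and the (source_uri, custom_id) pair format are assumptions made for illustration; the 200,000-file limit and the rule that custom_id needs a caller-supplied job_id come from the text above.

```python
MAX_FILES_PER_JOB = 200_000  # limit stated in this guide


def build_job_payload(files, conversion_formats, job_id=None):
    """Assemble the JSON body for POST /files/v1/jobs.

    files is a list of (source_uri, custom_id) pairs; custom_id may be None.
    """
    if len(files) > MAX_FILES_PER_JOB:
        raise ValueError(f"at most {MAX_FILES_PER_JOB} files per job")
    entries = []
    for source_uri, custom_id in files:
        entry = {"source_uri": source_uri}
        if custom_id is not None:
            if job_id is None:
                # custom_id is per-job, so the caller must name the job
                raise ValueError("custom_id requires a caller-supplied job_id")
            entry["custom_id"] = custom_id
        entries.append(entry)
    payload = {"files": entries, "conversion_formats": conversion_formats}
    if job_id is not None:
        payload["job_id"] = job_id
    return payload
```

The returned dictionary can be serialized with json.dumps and posted with the same headers as the curl example.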

5. Track the job

Poll GET /files/v1/jobs/{job_id} for status and counters, then list per-file results via GET /files/v1/jobs/{job_id}/files, optionally filtered with a status= query parameter. The commands below assume JOB_ID holds the job_id from Step 4.

# Job status + counters
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/jobs/$JOB_ID"

# Per-file listing — paginate with paging_state from each response
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/jobs/$JOB_ID/files"

# Only the errored files
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/jobs/$JOB_ID/files?status=error"
Example job-status response
{
"job_id": "quickstart-batch",
"status": "completed",
"file_count": 2,
"files_completed": 2,
"files_errored": 0,
"created_at": "2026-05-28T12:00:00Z",
"modified_at": "2026-05-28T12:04:11Z"
}

Once the job is completed, download per-file outputs the same way as Step 3 — GET /files/v1/{file_id}.{ext} for each file_id returned by the listing.
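Walking the per-file listing page by page could be sketched as follows. The response shape is an assumption: this guide only says to paginate with paging_state from each response, so the "files" key and passing paging_state back as a query parameter are guesses to check against the API reference. The helper names are illustrative.

```python
from urllib.parse import urlencode

API_BASE = "https://api.mathpix.com/files/v1"


def listing_url(job_id, status=None, paging_state=None):
    """Build the per-file listing URL with optional query parameters."""
    params = {}
    if status:
        params["status"] = status
    if paging_state:
        # Assumed: the token is sent back as a query parameter of the same name
        params["paging_state"] = paging_state
    query = f"?{urlencode(params)}" if params else ""
    return f"{API_BASE}/jobs/{job_id}/files{query}"


def iter_job_files(fetch_page):
    """Yield file entries from every page of the listing.

    fetch_page(paging_state) must return one decoded page as a dict.
    Assumed shape: {"files": [...], "paging_state": "<token or absent>"}.
    """
    paging_state = None
    while True:
        page = fetch_page(paging_state)
        yield from page.get("files", [])
        paging_state = page.get("paging_state")
        if not paging_state:
            break  # no token means this was the last page
```

A fetch_page implementation would request listing_url(job_id, paging_state=state) with the app_key header and decode the JSON body. Each yielded entry's file_id can then be used for the Step 3 downloads.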

Where to go next