Async Document Lifecycle
Once a file has been submitted via POST /files/v1/uri, POST /files/v1/jobs, or POST /files/v1 (direct multipart upload), use these endpoints to poll its status, download its converted results, or delete it.
| Endpoint | Description |
|---|---|
| GET /files/v1/{file_id} | Poll processing status |
| GET /files/v1/{file_id}.{ext} | Download a converted result in the requested format |
| DELETE /files/v1/{file_id} | Permanently remove a file and its results |
GET /files/v1/{file_id}
GET api.mathpix.com/files/v1/{file_id}
Returns the file's status and processing progress. Poll until status == "completed" (or "error").
Example
cURL:
curl -H 'app_key: APP_KEY' \
  https://api.mathpix.com/files/v1/b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d
Python:
import requests

file_id = "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d"  # id returned by the submit call
r = requests.get(
    f"https://api.mathpix.com/files/v1/{file_id}",
    headers={"app_key": "APP_KEY"},
)
body = r.json()
print(body["status"], body.get("percent_done"))
// Response 200
{
  "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d",
  "status": "split",            // "pending" | "split" | "completed" | "error"
  "filename": "input.pdf",
  "custom_id": "contract-001",  // echoed back when supplied at submit; null otherwise
  "num_pages": 50,
  "num_pages_completed": 25,
  "percent_done": 50.0,
  "format_primary": "mmd",
  "formats": {                  // per-requested-format conversion status
    "md": "completed",
    "docx": "processing"
  }
}
Status values
| Status | Meaning |
|---|---|
| pending | File registered, queued for processing. |
| split | Pages extracted, OCR and conversion in progress (poll percent_done). |
| completed | All processing finished; results are available via download. |
| error | Processing failed; see the error fields on the response (below). |
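The polling rule above (keep requesting the status until it is completed or error) can be sketched as a small loop. This is a minimal illustration, not an official client: `wait_until_terminal`, its `interval` and `max_wait` defaults, and the injected `fetch_status` callable are all assumptions made for the example.

```python
import time
from typing import Callable

# The two terminal values from the status table; "pending" and "split" mean keep polling.
TERMINAL_STATUSES = {"completed", "error"}


def wait_until_terminal(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,    # assumed polling interval; tune to your workload
    max_wait: float = 600.0,  # assumed overall deadline in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until status is "completed" or "error"; return that body."""
    waited = 0.0
    while True:
        body = fetch_status()
        status = body.get("status")
        if status in TERMINAL_STATUSES:
            return body
        if waited >= max_wait:
            raise TimeoutError(f"file still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

In real use, `fetch_status` could wrap the `requests.get(...).json()` call from the Python example above. Injecting it keeps the loop independent of any particular HTTP library.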
Per-format conversion status
format_primary is always mmd. The formats map carries one entry per format you requested via conversion_formats, each with its own conversion status (pending / processing / completed / error). Conversions complete independently of — and can lag behind — the top-level status: a file can be completed overall while an individual format is still processing. Poll formats.{ext} before downloading that extension; a download of a format that isn't yet completed returns 404 format_not_ready.
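Because each requested format finishes on its own schedule, it can help to inspect the formats map before attempting a download. The helpers below are a hedged sketch: `formats_by_state` and `is_downloadable` are hypothetical names, and they only consider formats that appear in the formats map (they make no assumption about how the primary mmd output is reported there).

```python
from collections import defaultdict


def formats_by_state(body: dict) -> dict:
    """Group requested formats by conversion status, e.g. {"completed": ["md"]}."""
    grouped = defaultdict(list)
    for ext, state in body.get("formats", {}).items():
        grouped[state].append(ext)
    return {state: sorted(exts) for state, exts in grouped.items()}


def is_downloadable(body: dict, ext: str) -> bool:
    """True only when formats[ext] is "completed"; formats absent from the map count as not ready."""
    return body.get("formats", {}).get(ext) == "completed"
```

Applied to the sample response earlier in this section, `is_downloadable(body, "md")` would be true and `is_downloadable(body, "docx")` false, since docx is still processing.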
Error fields
When status is "error", the response carries the same error + error_info object used by Files API request errors, instead of progress:
// Response 200 (status == "error")
{
  "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d",
  "status": "error",
  "error": "data_source_access_denied",   // machine-readable code
  "error_info": {
    "id": "data_source_access_denied",    // duplicates `error` (v3-parser compatibility)
    "message": "Access denied to source"  // human-readable detail
  },
  "filename": "input.pdf",
  "num_pages": 0,
  "percent_done": 0.0
}
error is a stable, machine-readable code; see the error reference for the full list. Because remote-source fetching happens asynchronously, source problems surface here (not on the original submit call) — common values include data_source_access_denied (the bucket's grant isn't set up), data_source_not_found (no data source registered for the bucket), and content_too_large.
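Since failures arrive as a 200 response with status set to error, a caller has to check the body rather than the HTTP status. One way to surface them, sketched under assumptions: `FileProcessingError`, `raise_if_failed`, and the `SOURCE_ERROR_HINTS` table are hypothetical helpers, and the hints simply restate the descriptions given above.

```python
class FileProcessingError(Exception):
    """Raised when a status response reports status == "error" (hypothetical helper)."""

    def __init__(self, file_id, code, message):
        super().__init__(f"{file_id}: {code}: {message}")
        self.file_id = file_id
        self.code = code
        self.message = message


# Hints paraphrased from this section; not an exhaustive list of error codes.
SOURCE_ERROR_HINTS = {
    "data_source_access_denied": "the bucket's grant isn't set up",
    "data_source_not_found": "no data source is registered for the bucket",
}


def raise_if_failed(body: dict) -> dict:
    """Return body unchanged unless status is "error"; then raise FileProcessingError."""
    if body.get("status") != "error":
        return body
    info = body.get("error_info") or {}
    # `error` and `error_info.id` carry the same code; prefer the top-level field.
    code = body.get("error") or info.get("id") or "unknown"
    message = info.get("message") or SOURCE_ERROR_HINTS.get(code, "")
    raise FileProcessingError(body.get("file_id"), code, message)
```

Branching on `exc.code` rather than on the message text follows the note that error is the stable, machine-readable field.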
GET /files/v1/{file_id}.{ext}
GET api.mathpix.com/files/v1/{file_id}.{ext}
Download a converted result. The MMD format is always produced; other formats are produced only when requested via conversion_formats on the original submission.
Supported extensions
mmd, md, md.zip, mmd.zip, docx, pptx, html, html.zip, tex.zip, latex.pdf, pdf, lines.json, lines.mmd.json.
See Supported formats for the full list with descriptions.
Example
cURL:
curl -H 'app_key: APP_KEY' \
  -o output.docx \
  https://api.mathpix.com/files/v1/b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d.docx
Python:
import requests

file_id = "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d"  # id returned by the submit call
r = requests.get(
    f"https://api.mathpix.com/files/v1/{file_id}.docx",
    headers={"app_key": "APP_KEY"},
)
if r.status_code == 200:
    # Write the binary payload as-is.
    with open("output.docx", "wb") as f:
        f.write(r.content)
elif r.status_code == 404 and r.json().get("error") == "format_not_ready":
    print("Format still converting; retry later.")
Response headers
Content-Type: <MIME type for the requested extension>
Content-Disposition: attachment; filename="<basename>.<ext>"
Errors
| Code | HTTP | When it fires |
|---|---|---|
| format_not_ready | 404 | Format is still converting (formats.{ext} is pending or processing). Retry after a short delay. |
| unsupported_format | 415 | Extension wasn't requested via conversion_formats on the original submission, or isn't a supported output format. |
| not_found | 404 | file_id doesn't exist (or was deleted). |
lines.json and lines.mmd.json are available once the primary mmd format completes.
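The error table suggests a simple client policy: retry only on 404 format_not_ready and treat every other failure as final. The sketch below illustrates that policy under assumptions: `fetch_result` and `error_code` are hypothetical helpers, the `retries` and `delay` defaults are arbitrary, and `http` is any object with a requests-style `get(url, headers=...)` method (for example the `requests` module or a `requests.Session`).

```python
import time


def error_code(response):
    """Return the `error` field of a JSON error body, or None if the body isn't JSON."""
    try:
        payload = response.json()
    except ValueError:
        return None
    return payload.get("error") if isinstance(payload, dict) else None


def fetch_result(http, file_id, ext, app_key, retries=5, delay=3.0, sleep=time.sleep):
    """Download {file_id}.{ext}, retrying while the API answers 404 format_not_ready."""
    url = f"https://api.mathpix.com/files/v1/{file_id}.{ext}"
    for attempt in range(retries + 1):
        r = http.get(url, headers={"app_key": app_key})
        if r.status_code == 200:
            return r.content  # raw bytes; the caller decides where to write them
        code = error_code(r)
        retryable = r.status_code == 404 and code == "format_not_ready"
        if not retryable or attempt == retries:
            # unsupported_format, not_found, or retries exhausted: give up.
            raise RuntimeError(f"download failed: HTTP {r.status_code} ({code})")
        sleep(delay)
```

Distinguishing the two 404 cases by the error code matters here: format_not_ready is worth retrying, while not_found is not.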
DELETE /files/v1/{file_id}
DELETE api.mathpix.com/files/v1/{file_id}
Permanently remove a file and its results from Mathpix-owned storage. Files are auto-deleted on a per-artifact schedule (source and page images after 30 days, text outputs after 90 days; see Data retention). Call this endpoint to remove a file sooner.
Example
curl -X DELETE -H 'app_key: APP_KEY' \
  https://api.mathpix.com/files/v1/b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d
// Response 200
{
  "file_id": "b1c9c3a8-55e4-4a09-b7d0-218ba5de4c4d",
  "status": "deleted"
}
Behavior
- Only terminal files can be deleted. A file still being processed (pending/split) cannot be deleted; DELETE returns 409 conflict. Wait for completed or error, then delete.
- Idempotent. Calling DELETE on an already-deleted file returns the same 200 / status: deleted body, not 404.
- Mathpix-owned storage only. Results delivered to a customer-owned bucket via destination_uri are not affected; those live under your bucket's own lifecycle policy. Mathpix never deletes from customer-owned buckets.
- Billing counters preserved. Per-month page and file counts that drive billing are never decremented. Deleting a file does not credit your account.
- Job counters preserved. A file's job remains intact; file_count/files_completed/files_errored on the parent job are not adjusted.
Errors
| Code | HTTP | When it fires |
|---|---|---|
| not_found | 404 | file_id doesn't resolve to any row (and has never existed). |
| conflict | 409 | file_id exists but is still processing (pending / split); not yet deletable. |
| forbidden | 403 | file_id exists but is owned by a different group. |
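The status codes above map naturally onto distinct outcomes in client code. The following is a rough sketch rather than a reference implementation: `delete_file` is a hypothetical helper, the choice of Python exception types is arbitrary, and `http` stands for any object with a requests-style `delete(url, headers=...)` method.

```python
def delete_file(http, file_id, app_key):
    """DELETE a file; return True if deleted, False if it is still processing (409)."""
    r = http.delete(
        f"https://api.mathpix.com/files/v1/{file_id}",
        headers={"app_key": app_key},
    )
    if r.status_code == 200:
        return True   # also returned for an already-deleted file (idempotent)
    if r.status_code == 409:
        return False  # pending/split: wait for completed or error, then retry
    if r.status_code == 404:
        raise LookupError(f"{file_id}: not_found")
    if r.status_code == 403:
        raise PermissionError(f"{file_id}: forbidden (owned by a different group)")
    raise RuntimeError(f"{file_id}: unexpected HTTP {r.status_code}")
```

Returning False on 409 instead of raising lets a caller poll the status endpoint until the file is terminal and then call `delete_file` again.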
See also
- POST /files/v1/uri: submit a single file.
- POST /files/v1/jobs: submit a batch.
- Data retention: automatic cleanup schedule per artifact type.
- Supported formats: full list of conversion outputs.