
Migrate from SCS to Files API

Files API provides the same processing and output options as SCS, but you submit jobs yourself via API instead of through Mathpix support.

Prerequisites

  • A Mathpix app key. All requests authenticate with an app_key header (see Authentication). If you don't have one yet, sign up at the Mathpix Console and copy a key from the API keys page. SCS customers migrating over will typically be setting this up for the first time.
  • One-time data source setup. Required to enable private URI inputs and automatic result uploads straight to your cloud storage. Data sources are registered per group (your organization), not per app key — so every app key in your group shares the same registered sources, and you only set each bucket up once.

What's changing

SCS is operated manually by Mathpix: you send a list of source and destination paths and enable storage access, and a Mathpix engineer runs the processing job and uploads your results. The Files API exposes the same machinery as a public API:

Aspect | SCS | Files API
Who runs the job | Mathpix engineer (via internal CLI) | Customer (via HTTP request)
Onboarding | Email support, exchange S3 credentials, scheduled by engineer | Self-serve: register a data source once, then submit any time
Concurrent jobs | Per engineer availability | Submit on-demand, multiple jobs in flight
Pricing | Custom per contract | Public tiered pricing
Model | Same Mathpix OCR model | Same Mathpix OCR model

What you keep doing

  • Same processing model. OCR quality, layout extraction, equation/figure crops, and Mathpix Markdown output are unchanged from SCS.
  • Same output files. For each input file you get an .mmd, a .lines.json, and a .lines.mmd.json, plus optional MMD conversions and a local images/ folder of cropped images and equations, referenced by relative path from the MMD. The output is drop-in compatible with whatever consumes your SCS output today.
  • Same conversion formats — request docx, tex.zip, html, etc. via conversion_formats on the submission, just like before.
  • Same delivery model. Results land in your own bucket via destination_uri; Mathpix never holds long-term copies of your outputs.

What changes

  • Self-serve data sources. No more emailing credentials. Grant Mathpix access to your bucket via IAM role (AWS), AD app (Azure), or service-account impersonation (GCS) — once. See the Data Sources API.
  • Tiered pricing. Public, marginal-cost tiers applied per calendar month. See pricing.
  • Idempotency via custom_id. Supply a per-file custom_id on submission; resubmitting the same (job_id, custom_id) returns the original file_id instead of creating a duplicate. Safe to retry on network blips and timeouts.
  • Crop delivery is opt-in via image_output_mode. This setting controls where the cropped images (equations, figures, tables) go. To match the classic SCS output shape, set "image_output_mode": "local": the worker writes a loose images/ folder of crops into your destination_uri bucket, the MMD references them by relative path (images/<id>.jpg), and Mathpix keeps no long-term copy. Omit it (the default) and crops are hosted on Mathpix's CDN and referenced by absolute URL, with nothing written to your bucket.
    • Zipped and rich formats are self-contained either way. Per-format .zip variants embed their crops in the archive, and rich conversions (docx, pdf, pptx, …) embed the real crop images inside the file regardless of image_output_mode. local mode only matters when you want the standalone images/ folder delivered alongside non-zipped outputs like plain .mmd or .md.
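
The idempotency rule above only helps if a retry sends exactly the same job_id and custom_id values as the first attempt. Here is a minimal Python sketch of that pattern. It assumes you already have a function that performs the POST; the names submit_with_retry and send, and the backoff values, are illustrative and not part of the Files API.

```python
import time


def submit_with_retry(send, payload, attempts=3, backoff_seconds=1.0):
    """Call send(payload) until it succeeds, reusing the identical payload.

    `send` is a caller-supplied (hypothetical) function that POSTs the
    payload to /files/v1/jobs and returns the parsed response. It is
    expected to raise TimeoutError or ConnectionError on network failures.
    Because job_id and every custom_id are unchanged between attempts,
    a retry should map back to the original files instead of duplicating them.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return send(payload)
        except (TimeoutError, ConnectionError):
            if attempt == attempts:
                raise  # out of attempts: surface the last network error
            time.sleep(backoff_seconds * attempt)  # simple linear backoff
```

Only network-level errors are retried here. An HTTP error response (for example a rejected payload) is left to propagate, since resending the same body would not fix it.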

A worked migration

A migrated workload typically walks your storage source to enumerate the input files, then submits them as a single job, giving each file its own destination_uri under your output prefix. destination_uri is per file (not job-wide), so a per-document subfolder keeps each document's results and crops together, which is the same layout classic SCS produced. The job_id you supply becomes the handle you'll use for status reads:

curl -X POST https://api.mathpix.com/files/v1/jobs \
  -H "app_key: $APP_KEY" \
  -H 'Content-Type: application/json' \
  --data '{
    "job_id": "2026-05-contracts",
    "image_output_mode": "local",
    "conversion_formats": { "docx": true, "md": true },
    "files": [
      { "source_uri": "s3://acme-source/contracts/2026-05/contract-001.pdf", "custom_id": "contract-001", "destination_uri": "s3://acme-output/processed/2026-05/contract-001/" },
      { "source_uri": "s3://acme-source/contracts/2026-05/contract-002.pdf", "custom_id": "contract-002", "destination_uri": "s3://acme-output/processed/2026-05/contract-002/" }
    ]
  }'
Response
{
  "job_id": "2026-05-contracts",
  "file_count": 2
}
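
For more than a handful of files, the request body is easier to generate than to write by hand. The sketch below builds the same payload as the curl example from a list of source URIs. The function name build_job_payload and the choice to use the file stem as both custom_id and destination subfolder are assumptions made for illustration; enumerating your bucket is left to your storage SDK.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_job_payload(job_id, source_uris, output_prefix):
    """Build the JSON body for POST /files/v1/jobs (illustrative helper)."""
    prefix = output_prefix.rstrip("/")
    files, seen = [], set()
    for uri in source_uris:
        # Assumption: the file stem (e.g. "contract-001") is a usable custom_id.
        custom_id = PurePosixPath(urlparse(uri).path).stem
        if custom_id in seen:
            # Two files with the same custom_id in one job would collide.
            raise ValueError(f"duplicate custom_id {custom_id!r} in job {job_id!r}")
        seen.add(custom_id)
        files.append({
            "source_uri": uri,
            "custom_id": custom_id,
            # One subfolder per document keeps its results and crops together.
            "destination_uri": f"{prefix}/{custom_id}/",
        })
    return {
        "job_id": job_id,
        "image_output_mode": "local",
        "conversion_formats": {"docx": True, "md": True},
        "files": files,
    }


payload = build_job_payload(
    "2026-05-contracts",
    [
        "s3://acme-source/contracts/2026-05/contract-001.pdf",
        "s3://acme-source/contracts/2026-05/contract-002.pdf",
    ],
    "s3://acme-output/processed/2026-05/",
)
print(json.dumps(payload, indent=2))
```

The duplicate check matters because two inputs with the same stem under different prefixes would otherwise share a custom_id and a destination folder.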

Track the job using the same id you supplied as job_id:

# Status + counters
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/jobs/2026-05-contracts"

# Errored files only
curl -H "app_key: $APP_KEY" \
"https://api.mathpix.com/files/v1/jobs/2026-05-contracts/files?status=error"

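The same two reads can be prepared from Python with the standard library. This sketch only constructs the requests, using the endpoints and app_key header from the curl commands above; the helper names are hypothetical, and the response fields are not described in this guide, so parsing is left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://api.mathpix.com/files/v1"


def job_status_request(job_id, app_key):
    """GET request for a job's status and counters."""
    url = f"{BASE_URL}/jobs/{quote(job_id, safe='')}"
    return Request(url, headers={"app_key": app_key})


def job_files_request(job_id, app_key, status=None):
    """GET request for a job's files, optionally filtered by status."""
    url = f"{BASE_URL}/jobs/{quote(job_id, safe='')}/files"
    if status is not None:
        url += "?" + urlencode({"status": status})
    return Request(url, headers={"app_key": app_key})


errored = job_files_request("2026-05-contracts", "YOUR_APP_KEY", status="error")
print(errored.get_method(), errored.full_url)
```

To send a request, pass it to urllib.request.urlopen and decode the body with json.load. Quoting the job_id keeps the URL valid if your ids contain characters such as spaces or slashes.
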
Results land under each file's destination_uri, for example s3://acme-output/processed/2026-05/contract-001/<file_id>.mmd (plus images/, .docx, etc. per the formats you requested), which is the same layout SCS produced. The "image_output_mode": "local" option is what writes cropped images into an images/ folder in your bucket with relative references; omit it and crops stay on Mathpix's CDN instead.
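
With local mode, a quick post-job check is to confirm that every crop the MMD references was actually delivered. The following is a rough sketch, assuming crops appear in the MMD text as relative paths of the form images/<id>.<ext>; the regular expression and the function name missing_crops are illustrative, and listing the delivered object keys is left to your storage SDK.

```python
import re

# Matches relative crop paths such as images/abc123.jpg. The lookbehind
# rejects absolute URLs that merely contain "/images/" in their path.
CROP_REF = re.compile(r"(?<![\w/.\-])images/[\w\-]+\.[A-Za-z0-9]+")


def missing_crops(mmd_text, delivered_keys):
    """Return referenced crop paths that are absent from delivered_keys.

    delivered_keys holds object keys relative to the file's destination_uri,
    e.g. "images/abc123.jpg".
    """
    referenced = set(CROP_REF.findall(mmd_text))
    return sorted(referenced - set(delivered_keys))


sample_mmd = "Intro text\n![](images/eq-1.jpg)\n![](images/fig-3.png)\n"
print(missing_crops(sample_mmd, ["images/eq-1.jpg"]))
```

An empty list means every relative reference resolves. Absolute CDN URLs are ignored on purpose, since they indicate the default hosted mode instead of local delivery.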

Support

  • Questions during migration: email support@mathpix.com with subject [SCS migration].
  • Your existing SCS contract terms carry over until your migration is complete — reach out before submitting volume that would exceed your previous SCS allotment.
  • Stuck on data-source setup? The Data Sources API reference covers the per-provider IAM steps. Send your AWS account ID / Azure tenant / GCS project to support if you'd like a setup review.

Next steps