Process Images

Process a single image containing math, text, tables, or chemistry diagrams.

See the image processing guide for step-by-step examples.

POST v3/text

POST api.mathpix.com/v3/text

Accepts an image URL, base64-encoded image, or file upload (multipart form-data with options_json). Returns structured content as Mathpix Markdown (with LaTeX math inside \( ... \) and \[ ... \] delimiters), HTML, or extracted data formats. Chemistry diagrams are returned as <smiles>...</smiles> SMILES notation.

info
  • EXIF data is ignored for all images, including EXIF orientation
  • When sending an image via multipart form-data, pass all options as stringified JSON in a top-level options_json parameter
  • Request limits: 5 MB JSON body, 10 MB image download from URL, 2 MB base64-encoded image, 15 second URL download timeout — see Limits & Quotas
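The multipart rule above can be sketched in Python. The helper name `build_options_field` is hypothetical, and the sketch builds only the non-file form field; attaching the image file itself is left to whichever HTTP client is used.

```python
import json


def build_options_field(options: dict) -> dict:
    """Return the non-file form fields for a multipart v3/text request."""
    # All options are serialized into one top-level options_json string.
    return {"options_json": json.dumps(options)}


fields = build_options_field({"rm_spaces": True, "formats": ["text", "html"]})
print(fields["options_json"])
```

Keeping every option inside one stringified field means the same options dictionary can be reused unchanged for JSON requests.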

Example

curl -X POST https://api.mathpix.com/v3/text \
-H 'app_id: APP_ID' \
-H 'app_key: APP_KEY' \
-H 'Content-Type: application/json' \
--data '{"src": "https://mathpix-ocr-examples.s3.amazonaws.com/cases_hw.jpg", "math_inline_delimiters": ["$", "$"], "rm_spaces": true}'
Example response
{
"request_id": "14b53567-9f6c-4895-ab3d-e4a8ae18f9c1",
"text": "$f(x)=\\left\\{\\begin{array}{ll}x^{2} & \\text { if } x<0 \\\\ 2 x & \\text { if } x \\geq 0\\end{array}\\right.$",
"confidence": 1,
"confidence_rate": 1,
"is_printed": false,
"is_handwritten": true,
"image_height": 332,
"image_width": 850,
"version": "SuperNet-109p4"
}
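
For callers working in Python rather than curl, the same request can be sketched with the standard library. This is a minimal sketch that builds the request without sending it; the helper name `build_text_request` is an assumption for illustration, and APP_ID and APP_KEY are placeholders as in the curl example.

```python
import json
import urllib.request

API_URL = "https://api.mathpix.com/v3/text"


def build_text_request(app_id: str, app_key: str, options: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for v3/text."""
    body = json.dumps(options).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "app_id": app_id,
            "app_key": app_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_text_request(
    "APP_ID",
    "APP_KEY",
    {
        "src": "https://mathpix-ocr-examples.s3.amazonaws.com/cases_hw.jpg",
        "math_inline_delimiters": ["$", "$"],
        "rm_spaces": True,
    },
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```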

Request parameters

src string (optional)

Image URL or base64-encoded image (e.g. data:image/jpeg;base64,...)
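
A minimal sketch of producing the base64 form of src from raw image bytes. The helper name `to_data_uri` is hypothetical, and the three sample bytes stand in for a real JPEG file read from disk.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI suitable for the src parameter."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for open("image.jpg", "rb").read(); real files are much longer.
sample_bytes = b"\xff\xd8\xff"
payload = {"src": to_data_uri(sample_bytes)}
```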

metadata object (optional)

Key-value object. Supports improve_mathpix for extra privacy controls.

tags string[] (optional)

Tags are lists of strings that can be used to identify results. See query image results.

async bool (optional)

Set this flag when sending non-interactive requests

callback Callback (optional)

Webhook for asynchronous result delivery

formats string[] (optional)

List of formats, see Format Descriptions. Empty array or object returns the text format.

Values

text, data, html, latex_styled

data_options DataOptions (optional)

Specifies outputs for data and html return fields

include_detected_alphabets bool (optional)

Return detected alphabets

alphabets_allowed AlphabetsAllowed (optional)

Specify which alphabets you don't want in the output

region Region (optional)

Specify the image area with pixel coordinates. All four properties are required if region is provided. Empty object {} is valid (treated as no region)

enable_blue_hsv_filter bool (optional), default value is false

Enables a special image processing mode that OCRs only blue-hued text.

confidence_threshold number in [0,1] (optional)

Specifies threshold for triggering math_confidence errors. Returns error when confidence is below this value

confidence_rate_threshold number in [0,1] (optional), default value is 0.75

Specifies threshold for triggering math_confidence errors at the symbol level.

include_equation_tags bool (optional)

Specifies whether to include equation number tags inside the equation LaTeX. Setting this to true also sets "idiomatic_eqn_arrays": true, because equation numbering works better in those environments than in the array environment.

Example

\tag{eq_number}, where eq_number is an equation number (e.g. 1.12)

include_line_data bool (optional)

Specifies whether to return information segmented line by line, see LineData object section for details

include_word_data bool (optional)

Specifies whether to return information segmented word by word, see WordData object section for details

include_smiles bool (optional)

Enables experimental chemistry diagram OCR. Results are RDKit-normalized SMILES (with isomericSmiles=False), included in the text output format using the MMD SMILES syntax <smiles>...</smiles>

include_inchi bool (optional)

Include InChI data as XML attributes inside <smiles> elements. Only applies when include_smiles is true.

Example

<smiles inchi="..." inchikey="...">...</smiles>
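
A minimal sketch of pulling SMILES strings and their optional InChI attributes out of a text result with regular expressions. The helper name `extract_smiles` is hypothetical, the sample string is hand-written rather than real API output, and the attribute pattern assumes double-quoted values as in the example above.

```python
import re

SMILES_TAG = re.compile(r"<smiles([^>]*)>(.*?)</smiles>", re.DOTALL)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def extract_smiles(text: str) -> list[dict]:
    """Return one dict per <smiles> element, including any XML attributes."""
    results = []
    for attrs, body in SMILES_TAG.findall(text):
        entry = {"smiles": body.strip()}
        entry.update(dict(ATTRIBUTE.findall(attrs)))
        results.append(entry)
    return results


sample_text = (
    'Ethanol: <smiles inchi="InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3" '
    'inchikey="LFQSCWFLJHTTHZ-UHFFFAOYSA-N">CCO</smiles>'
)
molecules = extract_smiles(sample_text)
```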

include_diagram_text bool (optional), default value is false

Enables text extraction from diagrams (use with "include_line_data": true). The extracted text will be part of line data, and not part of the text, or any other output format specified. The parent_id of these text lines will correspond to the id of one of the diagrams in the line data. Also, diagram will have children_ids to store references to these text lines.
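
The parent and child relationship described above can be followed on the client side. A minimal sketch, assuming line_data entries carry the id, parent_id, type, and text fields documented in the LineData object section; the helper name and the sample lines are hand-written for illustration.

```python
from collections import defaultdict


def diagram_text_by_parent(line_data: list[dict]) -> dict[str, list[str]]:
    """Group extracted diagram text under the id of the diagram it belongs to."""
    diagram_ids = {line["id"] for line in line_data if line.get("type") == "diagram"}
    grouped = defaultdict(list)
    for line in line_data:
        parent = line.get("parent_id")
        if parent in diagram_ids and "text" in line:
            grouped[parent].append(line["text"])
    return dict(grouped)


sample_lines = [
    {"id": "d1", "type": "diagram", "children_ids": ["t1", "t2"]},
    {"id": "t1", "type": "text", "parent_id": "d1", "text": "A"},
    {"id": "t2", "type": "text", "parent_id": "d1", "text": "B"},
    {"id": "t3", "type": "text", "text": "Body text outside the diagram"},
]
labels = diagram_text_by_parent(sample_lines)
```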

include_page_info bool (optional), default value is true

Controls whether page info elements are included in the final text output. Page info refers to elements like headers, footers, and page numbers that are not part of the main text.

auto_rotate_confidence_threshold number in [0,1] (optional), default value is 0.99

Specifies threshold for auto rotating image to correct orientation. Can be disabled with a value of 1 (see Auto rotation section for details).

rm_spaces bool (optional), default value is true

Determines whether extra white space is removed from equations in latex_styled and text formats.

rm_fonts bool (optional), default value is false

Determines whether font commands such as \mathbf and \mathrm are removed from equations in latex_styled and text formats.

idiomatic_eqn_arrays bool (optional), default value is false

Specifies whether to use aligned, gathered, or cases instead of an array environment for a list of equations.

idiomatic_braces bool (optional), default value is false

Specifies whether to remove unnecessary braces for LaTeX output.

Example

x^2 is returned instead of x^{2}

numbers_default_to_math bool (optional), default value is false

Specifies whether numbers are always treated as math.

Example

Answer: \( 17 \) instead of Answer: 17

math_fonts_default_to_math bool (optional), default value is false

Specifies whether text set in math fonts is always treated as math.

Example

Answer: \( 2 \mathrm { ms } \) instead of Answer: 2 ms

math_inline_delimiters [string, string] (optional), default value is ["\\(", "\\)"]

Specifies begin inline math and end inline math delimiters for text outputs.

math_display_delimiters [string, string] (optional), default value is ["\\[", "\\]"]

Specifies begin display math and end display math delimiters for text outputs.
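
The defaults above are shown in JSON-escaped form: the delimiters themselves are \( and \), and each backslash is doubled inside a JSON string. A short sketch of how that looks when the request body is built in Python, where json.dumps does the escaping:

```python
import json

# Each delimiter contains ONE backslash; "\\(" is how Python spells \( .
options = {
    "math_inline_delimiters": ["\\(", "\\)"],
    "math_display_delimiters": ["\\[", "\\]"],
}
body = json.dumps(options)
print(body)
# prints {"math_inline_delimiters": ["\\(", "\\)"], "math_display_delimiters": ["\\[", "\\]"]}
```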

enable_spell_check bool (deprecated)

Deprecated, has no effect on the output.

enable_tables_fallback bool (optional), default value is false

Enables advanced table processing algorithm that supports large and complex tables.

fullwidth_punctuation bool (optional), default value is null

Controls whether punctuation is fullwidth Unicode (default for East Asian languages like Chinese) or halfwidth Unicode (default for Latin scripts, Cyrillic scripts, etc.). When null, fullwidth vs. halfwidth is decided based on image content. Punctuation inside math always stays halfwidth.

Format descriptions

Format        Description
text          Mathpix Markdown
html          HTML rendered from text via mathpix-markdown-it
data          Data computed from text as specified in the data_options request parameter
latex_styled  Styled LaTeX, returned only in cases that the whole image can be reduced to a single equation

DataOptions object

Data options are used to return elements of the image output. These outputs are all computed from the text format described above. The data_options parameter must be an object — only the keys listed below are accepted. Unknown keys return opts_unknown_data_option.

include_svg bool (optional)

Include math SVG in html and data outputs

include_table_html bool (optional)

Include table HTML in html and data outputs (tables only)

include_latex bool (optional)

Include math mode LaTeX in data and html outputs

include_tsv bool (optional)

Include tab-separated values (TSV) in data and html outputs (tables only)

include_asciimath bool (optional)

Include AsciiMath in data and html outputs

include_mathml bool (optional)

Include MathML in data and html outputs

include_sub_math bool (optional)

Include sub-math elements in data and html outputs
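
Since unknown keys are rejected with opts_unknown_data_option, it can help to catch typos before the request is sent. A minimal client-side sketch; the helper name `build_data_options` is hypothetical, and the check only mirrors the key list on this page.

```python
ALLOWED_DATA_OPTIONS = {
    "include_svg",
    "include_table_html",
    "include_latex",
    "include_tsv",
    "include_asciimath",
    "include_mathml",
    "include_sub_math",
}


def build_data_options(**flags: bool) -> dict:
    """Return a data_options object, rejecting keys the API does not list."""
    unknown = sorted(set(flags) - ALLOWED_DATA_OPTIONS)
    if unknown:
        raise ValueError(f"unknown data_options keys: {unknown}")
    return dict(flags)


payload = {
    "formats": ["text", "data"],
    "data_options": build_data_options(include_latex=True, include_asciimath=True),
}
```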

Response body

request_id string

Request ID, for debugging purposes

text string (optional)

Recognized text in Mathpix Markdown, if any is found. May contain custom Mathpix macros.

latex_styled string (optional)

LaTeX string of the equation, if the image is of a single equation. May contain custom Mathpix macros.

confidence number in [0,1] (optional)

Estimated probability that the result is 100% correct

confidence_rate number in [0,1] (optional)

Estimated confidence of output quality

line_data object[] (optional)

List of LineData objects

word_data object[] (optional)

List of WordData objects

data object[] (optional)

List of Data objects

html string (optional)

Annotated HTML output

detected_alphabets DetectedAlphabet[] (optional)

Detected alphabet flags

is_printed bool (optional)

Specifies if printed content was detected in an image

is_handwritten bool (optional)

Specifies if handwritten content was detected in an image

auto_rotate_confidence number in [0,1] (optional)

Estimated probability that image needs to be rotated, see Auto rotation

auto_rotate_degrees number in {0, 90, -90, 180} (optional)

Estimated angle of rotation in degrees to put image in correct orientation, see Auto rotation

image_height integer (optional)

Height of the processed image in pixels

image_width integer (optional)

Width of the processed image in pixels

error string (optional)

US locale error message

error_info ErrorInfo (optional)

Error info object

version string

This string is opaque to clients and only useful as a way of understanding differences in results for requests using the same image. Our service relies on training data, the service implementation, and the underlying platforms we run on (e.g., AWS, PyTorch). Initially, the version string will only change when the training data or process changes, but in the future we might provide more distinctions between versions.

Type definitions

Data object

Data objects allow extracting the math elements from an OCR result.

type string

One of asciimath, mathml, latex, svg, tsv

value string

Value corresponding to type
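
A minimal sketch of reading Data objects out of a response. The helper name `values_of_type` is hypothetical and the sample response is hand-written, not real API output.

```python
def values_of_type(data: list[dict], wanted: str) -> list[str]:
    """Return the value of every Data object whose type matches `wanted`."""
    return [item["value"] for item in data if item.get("type") == wanted]


sample_response = {
    "data": [
        {"type": "latex", "value": "x^{2}"},
        {"type": "asciimath", "value": "x^(2)"},
        {"type": "latex", "value": "2 x"},
    ]
}
latex_values = values_of_type(sample_response.get("data", []), "latex")
```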

LineData object

Returned when include_line_data is set to true. Contains information about all textual line elements detected in the image. Concatenating content from line_data recreates the top-level text, html, and data fields.

The OCR engine does not support some lines (like diagrams), which are skipped. Lines with extraneous content (like equation numbers) or low confidence have conversion_output set to false.

id string

Unique line identifier

parent_id string (optional)

Unique line identifier of the parent.

children_ids string[] (optional)

List of children unique identifiers.

type string

See line types and subtypes for details.

subtype string (optional)

See line types and subtypes for details.

cnt [[x,y]]

Contour for line expressed as list of (x,y) pixel coordinate pairs

included bool

Whether this line is included in the top level OCR result (deprecated, use conversion_output)

conversion_output bool

Whether this line is included in the top level OCR result

is_printed bool

True if line has printed text, false otherwise.

is_handwritten bool

True if line has handwritten text, false otherwise.

error_id string (optional)

Error ID, reason why the line is not included in final result

text string (optional)

Text (Mathpix Markdown) for line

confidence number in [0,1] (optional)

Estimated probability that the line is 100% correct

confidence_rate number in [0,1] (optional)

Estimated confidence of output quality

after_hyphen bool (optional)

Specifies whether the current line follows a text line that ended with a hyphen

html string (optional)

Annotated HTML output for the line

data Data[] (optional)

List of Data objects

Possible values for error_id:

  • image_not_supported: the OCR engine doesn't accept this line format
  • image_max_size: the line is larger than the maximum size the OCR engine supports
  • math_confidence: the OCR engine failed to confidently recognize the content of the line
  • image_no_content: the line has degenerate spatial dimensions, e.g. the height of the line is zero
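
A minimal sketch of separating included lines from skipped ones and counting the reasons. The helper name `summarize_lines` and the "unspecified" bucket for excluded lines without an error_id are assumptions for illustration; the sample lines are hand-written.

```python
from collections import Counter


def summarize_lines(line_data: list[dict]) -> tuple[list[dict], dict[str, int]]:
    """Split lines into included ones and a count of exclusion reasons."""
    included = [line for line in line_data if line.get("conversion_output")]
    reasons = Counter(
        line.get("error_id", "unspecified")
        for line in line_data
        if not line.get("conversion_output")
    )
    return included, dict(reasons)


sample_lines = [
    {"id": "a", "type": "text", "conversion_output": True, "text": "Hello"},
    {"id": "b", "type": "diagram", "conversion_output": False,
     "error_id": "image_not_supported"},
    {"id": "c", "type": "equation_number", "conversion_output": False},
]
included, reasons = summarize_lines(sample_lines)
```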

Line data types and subtypes

Types and subtypes returned as part of line data and PDF lines data (types are the keys, subtypes are values):

{
"chart_info": [],
"x_axis_tick_label": [],
"y_axis_tick_label": [],
"x_axis_label": [],
"y_axis_label": [],
"legend_label": [],
"model_label": [],
"page_info": [],
"equation_number": [],
"table": [],
"diagram": [
"algorithm",
"pseudocode",
"chemistry",
"chemistry_reaction",
"triangle"
],
"chart": [
"column",
"bar",
"line",
"analytical",
"pie",
"scatter",
"area"
],
"diagram_info": [],
"text": [
"vertical",
"big_capital_letter"
],
"math": [],
"column": [],
"code": [],
"pseudocode": [],
"figure_label": [],
"form_field": [
"parentheses",
"dotted",
"dashed",
"box",
"checkbox",
"circle"
],
"qed_symbol": [],
"multiple_choice_block": [],
"multiple_choice_option": [],
"footnote": [],
"table_of_contents_container": [],
"table_of_contents_row": [],
"table_of_contents_item": [],
"table_of_contents_number": [],
"title": [],
"quote": [],
"section_header": [],
"authors": [],
"abstract": [],
"rotated_container": [],
"table_cell": [
"split",
"spanning"
]
}

WordData object

Returned when include_word_data is set to true. Contains information about all word-level elements detected in the image.

type string

One of text, math, table, diagram, equation_number

subtype string (optional)

Either not set, or chemistry, or triangle (more diagram subtypes coming soon)

cnt [[x,y]]

Contour for word expressed as list of (x,y) pixel coordinate pairs

text string (optional)

Text (Mathpix Markdown) for word

latex string (optional)

Math mode LaTeX (Mathpix Markdown) for word

confidence number in [0,1] (optional)

Estimated probability that the word is 100% correct

confidence_rate number in [0,1] (optional)

Estimated confidence of output quality
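
The cnt contour is a list of [x, y] pixel pairs, so an axis-aligned bounding box follows directly from the extremes. A minimal sketch; the helper name `bounding_box` and the sample contour are hypothetical.

```python
def bounding_box(cnt: list[list[int]]) -> tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) for a contour of [x, y] pairs."""
    xs = [x for x, _ in cnt]
    ys = [y for _, y in cnt]
    return min(xs), min(ys), max(xs), max(ys)


sample_contour = [[10, 20], [110, 20], [110, 60], [10, 60]]
box = bounding_box(sample_contour)
```

The same helper applies to the cnt field of LineData objects, which uses the same representation.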

Auto rotation

The auto rotation feature detects when images are in the wrong orientation and corrects them before processing.

Control the confidence threshold with the auto_rotate_confidence_threshold request parameter (number in [0,1]). Default is 0.99, meaning the image is rotated only when the algorithm is 99% confident. Set to 1 to disable auto rotation.

The response includes:

  • auto_rotate_confidence — confidence that the image needs rotation (number in [0,1], ~0 if correct, ~1 if rotated)
  • auto_rotate_degrees — rotation angle applied (one of 0, 90, -90, 180)
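
A minimal sketch of setting the threshold and reading the two response fields. Both helper names are hypothetical, and the range check only mirrors the [0,1] constraint stated above.

```python
def rotation_options(threshold: float = 0.99) -> dict:
    """Return the request option controlling auto rotation."""
    if not 0 <= threshold <= 1:
        raise ValueError("auto_rotate_confidence_threshold must be in [0, 1]")
    return {"auto_rotate_confidence_threshold": threshold}


def describe_rotation(response: dict) -> str:
    """Summarize the auto rotation fields of a response, if present."""
    confidence = response.get("auto_rotate_confidence")
    degrees = response.get("auto_rotate_degrees")
    if confidence is None or degrees is None:
        return "no rotation info in response"
    return f"rotation confidence {confidence:.2f}, degrees {degrees}"


disable_rotation = rotation_options(1)  # a value of 1 disables auto rotation
```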