POST v3/text
POST api.mathpix.com/v3/text
Process an image containing math, text, tables, or chemistry diagrams. Accepts an image URL, base64-encoded image, or file upload (multipart form-data with options_json).
Returns structured content as Mathpix Markdown (with LaTeX math inside \( ... \) and \[ ... \] delimiters), HTML, or extracted data formats. Chemistry diagrams are returned as <smiles>...</smiles> SMILES notation.
See the image processing guide for step-by-step examples.
Mathpix ignores all EXIF data in images, including EXIF orientation.
When sending an image file via multipart form-data, all options are sent as stringified JSON in a top-level options_json parameter.
Request limits: 5 MB JSON body, 10 MB image download from URL, 2 MB base64-encoded image, 15 second URL download timeout. See Limits & Quotas for details.
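As a rough illustration of the request shape described above, the following sketch assembles a JSON request with a base64-encoded image using only the Python standard library. The helper name build_text_request is hypothetical, and the app_id / app_key header names and the https:// scheme are assumptions that this section does not state. The 2 MB limit is interpreted as 2 * 1024 * 1024 characters of base64 text, which is also an assumption. The request is built but not sent.

```python
import base64
import json
import urllib.request

# Limit quoted in this section; the exact byte interpretation is an assumption.
MAX_BASE64_BYTES = 2 * 1024 * 1024


def build_text_request(image_bytes, formats, app_id, app_key, mime="image/jpeg"):
    """Build (but do not send) a POST request for v3/text with a base64 image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) > MAX_BASE64_BYTES:
        raise ValueError("base64-encoded image exceeds the 2 MB limit")
    body = {"src": f"data:{mime};base64,{encoded}", "formats": list(formats)}
    return urllib.request.Request(
        "https://api.mathpix.com/v3/text",  # scheme assumed
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "app_id": app_id,    # assumed header name
            "app_key": app_key,  # assumed header name
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to perform the call. Checking the size locally avoids a round trip for images that would be rejected anyway.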
Request parameters
src Image URL or base64-encoded image (e.g. data:image/jpeg;base64,...)
metadata Key-value object
tags List of strings that can be used to identify results; see Query image results
async Set this flag when sending non-interactive requests
callback
formats List of formats, see Format descriptions. An empty array or object returns the text format.
Values
text, data, html, latex_styled
data_options See DataOptions section, specifies outputs for data and html return fields
include_detected_alphabets Return detected alphabets
alphabets_allowed See AlphabetsAllowed section; use this to specify which alphabets to exclude from the output
region Specify the image area with the pixel coordinates top_left_x, top_left_y, width, and height. All four properties are required if region is provided. Empty object {} is valid (treated as no region)
enable_blue_hsv_filter Enables a special image processing mode that OCRs only blue-hued text.
confidence_threshold Specifies threshold for triggering math_confidence errors. Returns error when confidence is below this value
confidence_rate_threshold Specifies threshold for triggering math_confidence errors at the symbol level.
include_equation_tags Specifies whether to include equation number tags inside the equation LaTeX. Setting it to true also sets "idiomatic_eqn_arrays": true, because equation numbering works better in those environments than in the array environment.
Example
\tag{eq_number}, where eq_number is an equation number (e.g. 1.12)
include_line_data Specifies whether to return information segmented line by line, see LineData object section for details
include_word_data Specifies whether to return information segmented word by word, see WordData object section for details
include_smiles Enables experimental chemistry diagram OCR. Results are RDKit-normalized SMILES (isomericSmiles=False), included in the text output format using the MMD SMILES syntax <smiles>...</smiles>
include_inchi Include InChI data as XML attributes inside <smiles> elements. Only applies when include_smiles is true.
Example
<smiles inchi="..." inchikey="...">...</smiles>
include_diagram_text Enables text extraction from diagrams (use with "include_line_data": true). The extracted text is part of the line data only, not of text or any other requested output format. The parent_id of each of these text lines corresponds to the id of one of the diagrams in the line data, and that diagram's children_ids reference these text lines.
include_page_info Controls whether page info elements are included in the final text output. Page info refers to various elements that are not part of the main text.
auto_rotate_confidence_threshold Specifies threshold for auto rotating image to correct orientation. Can be disabled with a value of 1 (see Auto rotation section for details).
rm_spaces Determines whether extra white space is removed from equations in latex_styled and text formats.
rm_fonts Determines whether font commands such as \mathbf and \mathrm are removed from equations in latex_styled and text formats.
idiomatic_eqn_arrays Specifies whether to use aligned, gathered, or cases instead of an array environment for a list of equations.
idiomatic_braces Specifies whether to remove unnecessary braces for LaTeX output.
Example
x^2 is returned instead of x^{2}
numbers_default_to_math
math_fonts_default_to_math Specifies whether math fonts are always math.
Example
Answer: \( 2 \mathrm { ms } \) instead of Answer: 2 ms
math_inline_delimiters Specifies begin inline math and end inline math delimiters for text outputs.
math_display_delimiters Specifies begin display math and end display math delimiters for text outputs.
enable_spell_check Deprecated, has no effect on the output.
enable_tables_fallback Enables advanced table processing algorithm that supports very large and complex tables.
fullwidth_punctuation Controls whether punctuation is fullwidth Unicode (default for East Asian languages like Chinese) or halfwidth Unicode (default for Latin scripts, Cyrillic scripts, etc.). When null, fullwidth vs halfwidth is decided based on image content. Punctuation inside math always stays halfwidth.
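The region rule above (all four properties required, or an empty object) can be checked before sending a request. The sketch below is a hypothetical client-side helper, not part of the API; it only checks that the keys are present and makes no assumption about how the server treats extra keys or value ranges.

```python
REGION_KEYS = ("top_left_x", "top_left_y", "width", "height")


def validate_region(region):
    """Return True if `region` is {} or contains all four pixel-coordinate keys."""
    if not isinstance(region, dict):
        return False
    if not region:
        return True  # an empty object is treated as "no region"
    return all(key in region for key in REGION_KEYS)
```

For example, validate_region({"top_left_x": 0, "top_left_y": 0, "width": 100}) returns False because height is missing.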
Format descriptions
| Format | Description |
|---|---|
| text | Mathpix Markdown |
| html | HTML rendered from text via mathpix-markdown-it |
| data | Data computed from text as specified in the data_options request parameter |
| latex_styled | Styled LaTeX, returned only when the whole image can be reduced to a single equation |
DataOptions object
Data options are used to return elements of the image output. These outputs are all computed from the text format described above. The data_options parameter must be an object — only the keys listed below are accepted. Unknown keys return opts_unknown_data_option.
include_svg Include math SVG in html and data outputs
include_table_html Include HTML in html and data outputs (tables only)
include_latex Include math mode LaTeX in data and html outputs
include_tsv Include tab-separated values (TSV) in data and html outputs (tables only)
include_asciimath Include AsciiMath in data and html outputs
include_mathml Include MathML in data and html outputs
include_sub_math Include sub-math elements in data and html outputs
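Because unknown keys return opts_unknown_data_option, it can help to check a data_options object locally first. The following is a minimal sketch of such a check; unknown_data_options is a hypothetical helper name, and the allowed set is copied from the keys listed above.

```python
ALLOWED_DATA_OPTIONS = {
    "include_svg",
    "include_table_html",
    "include_latex",
    "include_tsv",
    "include_asciimath",
    "include_mathml",
    "include_sub_math",
}


def unknown_data_options(data_options):
    """Return a sorted list of keys that the endpoint would not accept."""
    if not isinstance(data_options, dict):
        raise TypeError("data_options must be an object (dict)")
    return sorted(set(data_options) - ALLOWED_DATA_OPTIONS)
```

An empty result means every key is one of the documented options.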
Response body
request_id Request ID, for debugging purposes
text Recognized text format, if found
latex_styled LaTeX string of the equation, if the image contains a single equation
confidence Estimated probability that the output is 100% correct
confidence_rate Estimated confidence of output quality
line_data List of LineData objects
word_data List of WordData objects
data List of Data objects
html Annotated HTML output
detected_alphabets DetectedAlphabet object
is_printed Specifies if printed content was detected in an image
is_handwritten Specifies if handwritten content was detected in an image
auto_rotate_confidence Estimated probability that image needs to be rotated, see Auto rotation
auto_rotate_degrees Estimated angle of rotation in degrees to put image in correct orientation, see Auto rotation
error US locale error message
error_info Error info object
version This string is opaque to clients and only useful as a way of understanding differences in results for requests using the same image. Our service relies on training data, the service implementation, and the underlying platforms we run on (e.g., AWS, PyTorch). Initially, the version string will only change when the training data or process changes, but in the future we might provide additional distinctions between versions.
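A client will typically want to separate error responses from successful ones. The sketch below assumes that the presence of the error field marks a failed request, which this section does not state explicitly; OcrError and text_or_raise are hypothetical names, and the contents of error_info are passed through untouched because its structure is not described here.

```python
class OcrError(Exception):
    """Raised when a parsed response carries an `error` field."""

    def __init__(self, message, info=None):
        super().__init__(message)
        self.info = info  # raw error_info object, structure not assumed


def text_or_raise(response):
    """Return the `text` field of a parsed response, or raise OcrError."""
    if "error" in response:
        raise OcrError(response["error"], response.get("error_info"))
    # `text` is only present if a text format was recognized.
    return response.get("text")
```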
Data object
Data objects allow extracting the math elements from an OCR result.
type One of asciimath, mathml, latex, svg, tsv
value Value corresponding to type
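Since each Data object is a type/value pair, a response's data list can be grouped by type for easier access. This is a small sketch with a hypothetical helper name, values_by_type, operating on an already parsed response.

```python
from collections import defaultdict


def values_by_type(data):
    """Group Data objects into {type: [value, ...]}, preserving order."""
    grouped = defaultdict(list)
    for item in data:
        grouped[item["type"]].append(item["value"])
    return dict(grouped)
```

With this, values_by_type(response["data"]).get("latex", []) yields every LaTeX string in the result.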
DetectedAlphabet object
Each field is a boolean that is true if any characters from that alphabet are recognized in the image, regardless of whether the result fields contain those characters.
- en: English
- hi: Hindi Devanagari
- zh: Chinese
- ja: Kana Hiragana or Katakana
- ko: Hangul Jamo
- ru: Russian
- th: Thai
- ta: Tamil
- te: Telugu
- gu: Gujarati
- bn: Bengali
- vi: Vietnamese
AlphabetsAllowed object
A map from alphabet key to boolean that controls which alphabets are allowed in the output. This is useful when different alphabets contain look-alike characters (e.g. Latin B vs Cyrillic В) that can cause incorrect Unicode encodings in the result.
- Keys correspond to the alphabet codes listed in DetectedAlphabet (e.g. hi, ru)
- By default all alphabets are allowed
- Set a key to false to suppress that alphabet in the output
- Setting a key to true has the same effect as omitting it
Example
{"alphabets_allowed": {"ru": false, "hi": false}}
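When only a few alphabets are wanted, it is easier to generate the map than to write it by hand. The sketch below builds an alphabets_allowed object that suppresses everything except the given codes; allow_only is a hypothetical helper, and the code list is taken from the DetectedAlphabet section.

```python
ALPHABET_CODES = ("en", "hi", "zh", "ja", "ko", "ru", "th", "ta", "te", "gu", "bn", "vi")


def allow_only(*codes):
    """Build an alphabets_allowed map that suppresses all but the given codes."""
    unknown = set(codes) - set(ALPHABET_CODES)
    if unknown:
        raise ValueError(f"unknown alphabet codes: {sorted(unknown)}")
    # Allowed alphabets are simply omitted, which has the same effect as true.
    return {code: False for code in ALPHABET_CODES if code not in codes}
```

For instance, allow_only("en", "ru") returns a map with ten keys, all set to false, and neither en nor ru among them.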
LineData object
Returned when include_line_data is set to true. Contains information about all textual line elements detected in the image. Concatenating content from line_data recreates the top-level text, html, and data fields.
The OCR engine does not support some lines (like diagrams), which are skipped. Lines with extraneous content (like equation numbers) or low confidence have conversion_output set to false.
id Unique line identifier
parent_id Unique line identifier of the parent.
children_ids List of children unique identifiers.
type See line types and subtypes for details.
subtype See line types and subtypes for details.
cnt Contour for line expressed as list of (x,y) pixel coordinate pairs
included Whether this line is included in the top level OCR result (deprecated, use conversion_output)
conversion_output Whether this line is included in the top level OCR result
is_printed True if line has printed text, false otherwise.
is_handwritten True if line has handwritten text, false otherwise.
error_id Error ID, reason why the line is not included in final result
text Text (Mathpix Markdown) for line
confidence Estimated probability that the output is 100% correct
confidence_rate Estimated confidence of output quality
after_hyphen Specifies whether the current line follows a text line that ended with a hyphen
html Annotated HTML output for the line
data List of Data objects
Possible values for error_id:
- image_not_supported: OCR engine doesn't accept this type of line
- image_max_size: line is larger than the maximum size the OCR engine supports
- math_confidence: OCR engine failed to confidently recognize the content of the line
- image_no_content: line has strange spatial dimensions, e.g. the height of the line is zero
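The relationship between line_data and the top-level text can be sketched as follows. The helper split_lines is hypothetical. It joins the text of lines whose conversion_output is true and collects the id and error_id of the rest. Joining with an empty string assumes each line's text already carries its own separators, which this section implies but does not spell out.

```python
def split_lines(line_data):
    """Return (joined_text, skipped) for a list of LineData objects.

    joined_text concatenates `text` of lines with conversion_output true;
    skipped is a list of (id, error_id) pairs for the remaining lines.
    """
    kept, skipped = [], []
    for line in line_data:
        if line.get("conversion_output"):
            kept.append(line.get("text", ""))
        else:
            skipped.append((line.get("id"), line.get("error_id")))
    return "".join(kept), skipped
```

The skipped list is a convenient place to look when part of an image is missing from the final result.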
Line data types and subtypes
Types and subtypes returned as part of line data and PDF lines data (types are the keys, subtypes are values):
{
"chart_info": [],
"x_axis_tick_label": [],
"y_axis_tick_label": [],
"x_axis_label": [],
"y_axis_label": [],
"legend_label": [],
"model_label": [],
"page_info": [],
"equation_number": [],
"table": [],
"diagram": [
"algorithm",
"pseudocode",
"chemistry",
"chemistry_reaction",
"triangle"
],
"chart": [
"column",
"bar",
"line",
"analytical",
"pie",
"scatter",
"area"
],
"diagram_info": [],
"text": [
"vertical",
"big_capital_letter"
],
"math": [],
"column": [],
"code": [],
"pseudocode": [],
"figure_label": [],
"form_field": [
"parentheses",
"dotted",
"dashed",
"box",
"checkbox",
"circle"
],
"qed_symbol": [],
"multiple_choice_block": [],
"multiple_choice_option": [],
"footnote": [],
"table_of_contents_container": [],
"table_of_contents_row": [],
"table_of_contents_item": [],
"table_of_contents_number": [],
"title": [],
"quote": [],
"section_header": [],
"authors": [],
"abstract": [],
"rotated_container": [],
"table_cell": [
"split",
"spanning"
]
}
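When include_diagram_text is enabled, text lines inside a diagram are linked to it through parent_id and children_ids. The sketch below collects the text of each diagram's children from parsed line data; diagram_texts is a hypothetical helper and it relies only on the id, type, children_ids, and text fields described in this document.

```python
def diagram_texts(line_data):
    """Map each diagram line id to the list of texts of its child lines."""
    by_id = {line["id"]: line for line in line_data}
    result = {}
    for line in line_data:
        if line.get("type") != "diagram":
            continue
        child_ids = line.get("children_ids") or []
        # Ignore references to ids that are not present in this response.
        children = [by_id[cid] for cid in child_ids if cid in by_id]
        result[line["id"]] = [child.get("text", "") for child in children]
    return result
```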
WordData object
Returned when include_word_data is set to true. Contains information about all word-level elements detected in the image.
type One of text, math, table, diagram, equation_number
subtype Either not set, or chemistry, or triangle (more diagram subtypes coming soon)
cnt Contour for word expressed as list of (x,y) pixel coordinate pairs
text Text (Mathpix Markdown) for word
latex Math mode LaTeX (Mathpix Markdown) for word
confidence Estimated probability that the output is 100% correct
confidence_rate Estimated confidence of output quality
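The cnt contour is a list of (x,y) pixel pairs, so an axis-aligned bounding box is a common derived value. The sketch below computes one and returns it with the same key names as the region request parameter; bounding_box is a hypothetical helper, and reusing the region key names is a convenience choice rather than something the API prescribes.

```python
def bounding_box(cnt):
    """Return the axis-aligned bounding box of a contour as a region-style dict."""
    if not cnt:
        raise ValueError("contour is empty")
    xs = [point[0] for point in cnt]
    ys = [point[1] for point in cnt]
    return {
        "top_left_x": min(xs),
        "top_left_y": min(ys),
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
    }
```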
Auto rotation
The auto rotation feature detects when images are in the wrong orientation and corrects them before processing.
Control the confidence threshold with the auto_rotate_confidence_threshold request parameter (number in [0,1]). Default is 0.99, meaning the image is rotated only when the algorithm is 99% confident. Set to 1 to disable auto rotation.
The response includes:
- auto_rotate_confidence: confidence that the image needs rotation (number in [0,1], ~0 if correct, ~1 if rotated)
- auto_rotate_degrees: rotation angle applied (one of 0, 90, -90, 180)
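The threshold behavior can be illustrated with a short sketch. Both helpers are hypothetical, and the comparison rule (rotate when confidence is at or above the threshold, never when the threshold is 1) is an assumption inferred from the description above, not a documented formula.

```python
def rotation_expected(auto_rotate_confidence, threshold=0.99):
    """Return True if rotation would be triggered under the assumed rule."""
    if threshold >= 1:
        return False  # a threshold of 1 disables auto rotation
    return auto_rotate_confidence >= threshold


def describe_rotation(response, threshold=0.99):
    """Summarize the auto rotation fields of a parsed response."""
    confidence = response.get("auto_rotate_confidence", 0.0)
    degrees = response.get("auto_rotate_degrees", 0)
    if rotation_expected(confidence, threshold) and degrees != 0:
        return f"rotated by {degrees} degrees"
    return "not rotated"
```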