Introduction
git clone git@github.com:Mathpix/api-examples.git
cd api-examples/images
The MathpixOCR API is a JSON API for extracting text from images and digital ink inputs. Unlike other OCR APIs, MathpixOCR has first-class support for scientific notation, as used in chemistry, math, physics, computer science, economics, and other STEM subjects.
If you have any questions or problems, please send us an email at support@mathpix.com.
Authorization
The request headers must be set as follows:
{
"content-type": "application/json",
"app_id": "YOUR_APP_ID",
"app_key": "YOUR_APP_KEY"
}
MathpixOCR uses API keys to allow access to the API. You can find your API keys on your account dashboard at https://accounts.mathpix.com/ocr-api.
MathpixOCR expects the API keys to be included in every API request to the server as HTTP headers. The expected set of HTTP headers is shown above.
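As a minimal sketch, the headers listed above could be assembled in Python as follows. The helper name `mathpix_headers` and the environment variable names `MATHPIX_APP_ID` and `MATHPIX_APP_KEY` are assumptions made for this example; they are not part of the API.

```python
import os


def mathpix_headers(app_id=None, app_key=None):
    """Build the three request headers listed above.

    Falls back to the (hypothetical) MATHPIX_APP_ID / MATHPIX_APP_KEY
    environment variables when no explicit values are passed.
    """
    app_id = app_id or os.environ.get("MATHPIX_APP_ID")
    app_key = app_key or os.environ.get("MATHPIX_APP_KEY")
    if not app_id or not app_key:
        raise ValueError("both app_id and app_key are required")
    return {
        "content-type": "application/json",
        "app_id": app_id,
        "app_key": app_key,
    }


headers = mathpix_headers("YOUR_APP_ID", "YOUR_APP_KEY")
```

Reading the keys from the environment keeps them out of source control; any other secret store would work equally well.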
Formatting
All text outputs from the API conform to the Mathpix Markdown spec. Currently, our OCR will emit block mode math \[ ... \] and inline math \( ... \).
You can use mathpix-markdown in your app by using our open source library mathpix-markdown-it which is available via NPM.
The mathpix-markdown-it library can also generate HTML and structured data containing alternative representations, such as MathML and AsciiMath for math, and TSV (tab-separated values) for tables.
If you do not wish to use mathpix-markdown-it to render text / LaTeX, you may use MathJax, KaTeX, or PDFLaTeX.
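To illustrate the two math delimiters, here is a rough Python sketch that pulls inline and block math out of a Mathpix Markdown string with regular expressions. The function name `extract_math` and the sample string are made up for this example, and the patterns are deliberately simple; this is not a full Mathpix Markdown parser.

```python
import re

# Inline math is wrapped in \( ... \), block math in \[ ... \].
INLINE_MATH = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
BLOCK_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)


def extract_math(text):
    """Return (inline, block) lists of LaTeX snippets found in text."""
    inline = [m.strip() for m in INLINE_MATH.findall(text)]
    block = [m.strip() for m in BLOCK_MATH.findall(text)]
    return inline, block


# Hypothetical sample text, not an actual API response.
sample = r"Solve \( x^{2}=4 \) and then \[ \int_{0}^{1} x \, dx \]"
inline, block = extract_math(sample)
```

For rendering, a dedicated library such as mathpix-markdown-it remains the more robust choice.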
Process image (v3/text)
Request image:
Request JSON:
{
"src": "https://mathpix.com/examples/limit.jpg",
"formats": ["text", "data", "html"],
"data_options": {
"include_asciimath": true,
"include_latex": true
}
}
Response:
{
"confidence": 1,
"confidence_rate": 1,
"text": "\\( \\lim _{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right) \\)",
"html": "<div><span class=\"math-inline \">\n<asciimath style=\"display: none;\">lim_(x rarr3)((x^(2)+9)/(x-3))</asciimath><latex style=\"display: none\">\\lim _{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right)</latex></span></div>\n",
"data": [
{
"type": "asciimath",
"value": "lim_(x rarr3)((x^(2)+9)/(x-3))"
},
{
"type": "latex",
"value": "\\lim _{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right)"
}
]
}
Mathpix supports image recognition for jpg and png images. Images are sent either as a URL or as base64-encoded image data.
The v3/text endpoint extracts text, and optionally derived data / HTML, from images.
The text outputs follow mathpix-markdown conventions, including math mode LaTeX inside inline delimiters \( ... \) and block mode delimiters \[ ... \]. Lines are separated with \n newline characters. In some cases (e.g. multiple choice equations) we will try to flatten horizontally aligned content into different lines in order to keep the markup simple.
We also provide structured data outputs via the data and html output options. The data output returns a list of extracted formats (such as tsv for tables, or asciimath for equations). The html output provides annotated HTML and can be parsed via HTML / XML parsers.
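A short sketch of consuming the data output in Python: the helper below groups the returned values by their type. The function name `data_by_type` is an assumption for this example; the sample dictionary reuses the data entries from the response shown above.

```python
def data_by_type(response):
    """Group the values of a response's data list by their type field."""
    grouped = {}
    for item in response.get("data", []):
        grouped.setdefault(item["type"], []).append(item["value"])
    return grouped


# The data list copied from the example response above.
response = {
    "data": [
        {"type": "asciimath", "value": "lim_(x rarr3)((x^(2)+9)/(x-3))"},
        {"type": "latex",
         "value": r"\lim _{x \rightarrow 3}\left(\frac{x^{2}+9}{x-3}\right)"},
    ]
}
grouped = data_by_type(response)
```

Each type maps to a list because an image may contain several equations or tables.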
Request parameters
Send an image:
{
"src": "data:image/jpeg;base64,...",
"formats": ["text", "data", "html"],
"data_options": {
"include_asciimath": true,
"include_latex": true
}
}
#!/usr/bin/env python
import sys
import base64
import requests
import json
# put desired file path here
file_path = 'limit.jpg'
image_uri = "data:image/jpeg;base64," + base64.b64encode(open(file_path, "rb").read()).decode()
r = requests.post("https://api.mathpix.com/v3/text",
data=json.dumps({'src': image_uri}),
headers={"app_id": "YOUR_APP_ID", "app_key": "YOUR_APP_KEY",
"Content-type": "application/json"})
print(json.dumps(json.loads(r.text), indent=4, sort_keys=True))
curl -X POST https://api.mathpix.com/v3/text \
-H 'app_id: YOUR_APP_ID' \
-H 'app_key: YOUR_APP_KEY' \
-H 'Content-Type: application/json' \
--data '{ "src": "data:image/jpeg;base64,'$(base64 -i limit.jpg)'" }'
POST https://api.mathpix.com/v3/text
| Parameter | Type | Description |
|---|---|---|
| src | string | Image data, or public URL where image is located |
| metadata (optional) | object | Key value object |
| formats (optional) | [string] | List of formats, one of text, data, html, latex_styled, see Format Descriptions |
| data_options (optional) | object | see DataOptions section, specifies outputs for data and html return fields |
| include_detected_alphabets (optional) | bool | Return detected alphabets |
| alphabets_allowed (optional) | object | see AlphabetsAllowed section, use this to specify which alphabets you don't want in the output |
| include_line_data (optional) | bool | specifies whether to return information segmented per line, see LineData object section for details |
| rm_spaces (optional) | bool | Determines whether extra white space is removed from latex equations in latex and text. Default is true. |
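The parameters in the table above can be combined into a request body roughly as follows. This is a sketch: the helper name `build_text_request` and its keyword arguments are assumptions for this example, and only keys listed in the table are emitted.

```python
import base64
import json


def build_text_request(image_bytes, mime_type="image/jpeg", formats=None,
                       data_options=None, include_line_data=False):
    """Build a v3/text request body with the image inlined as a data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    payload = {"src": "data:%s;base64,%s" % (mime_type, encoded)}
    # Optional parameters are only added when the caller asks for them.
    if formats:
        payload["formats"] = list(formats)
    if data_options:
        payload["data_options"] = dict(data_options)
    if include_line_data:
        payload["include_line_data"] = True
    return payload


# The three bytes below are just the JPEG magic number, not a real image.
payload = build_text_request(
    b"\xff\xd8\xff",
    formats=["text", "data"],
    data_options={"include_latex": True},
)
body = json.dumps(payload)
```

The resulting `body` string is what would be sent as the POST data together with the authorization headers.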
Format descriptions
Mathpix OCR returns the following string formats from images:
| Format | Description |
|---|---|
| text | Mathpix markdown formatted text |
| html | HTML rendered from text via mathpix-markdown-it |
| data | Data extracted from html as specified in the data_options request parameter |
| latex_styled | Styled Latex, returned only in cases that the whole image can be reduced to a single equation |
DataOptions object
Data options are used to parse relevant information from the image output. These outputs are all computed directly from the html format described above.
| Key | Type | Description |
|---|---|---|
| include_svg (optional) | bool | include math SVG in html and data formats |
| include_table_html (optional) | bool | include HTML for html and data outputs (tables only) |
| include_latex (optional) | bool | include math mode latex in data and html |
| include_tsv (optional) | bool | include tab separated values (TSV) in data and html outputs (tables only) |
| include_asciimath (optional) | bool | include asciimath in data and html outputs |
| include_mathml (optional) | bool | include mathml in data and html outputs |
Result objects
Get an API response:
{
"confidence": 0.9982182085336344,
"confidence_rate": 0.9982182085336344,
"data": [
{
"type": "asciimath",
"value": "lim_(x rarr3)((x^(2)+9)/(x-3))"
},
{
"type": "latex",
"value": "\\lim _{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right)"
}
],
"html": "<div><span class=\"math-inline \" >\n<asciimath style=\"display: none;\">lim_(x rarr3)((x^(2)+9)/(x-3))</asciimath><latex style=\"display: none\">\\lim _{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right)</latex></span></div>\n",
"text": "\\( \\lim _{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right) \\)"
}
| Field | Type | Description |
|---|---|---|
| request_id | string | Request ID, for debugging purposes |
| text (optional) | string | Recognized text format, if such is found |
| latex_styled (optional) | string | Math Latex string of math equation, if the image is of a single equation |
| confidence (optional) | number in [0,1] | Estimated probability 100% correct |
| confidence_rate (optional) | number in [0,1] | Estimated confidence of input quality |
| line_data (optional) | [object] | List of LineData objects |
| data (optional) | [object] | List of Data objects |
| html (optional) | string | Annotated HTML output |
| detected_alphabets (optional) | [object] | DetectedAlphabet object |
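One way a client might use the confidence field is to gate automatic acceptance of a result, as sketched below. The function name `is_confident` and the default threshold of 0.9 are assumptions chosen for illustration; the API does not prescribe a cutoff.

```python
def is_confident(result, threshold=0.9):
    """True when the result has text and its confidence meets the threshold.

    The 0.9 default is an arbitrary example value, not an API recommendation.
    """
    return bool(result.get("text")) and result.get("confidence", 0.0) >= threshold


# Confidence values taken from example responses in this document.
good = {"text": r"\( 3 x^{2} \)", "confidence": 0.9982182085336344}
weak = {"text": "Equivalent resistance ...", "confidence": 0.651358435330524}
```

Results that fall below the threshold could be routed to manual review instead of being discarded.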
Data object
Data objects allow extracting relevant data from an OCR result.
| Field | Type | Description |
|---|---|---|
| type | string | one of asciimath, mathml, latex, html, svg, tsv |
| value | string | value corresponding to type |
DetectedAlphabet object
The detected_alphabets object in a result contains a field that is true or false for each known alphabet. The field is true if any characters from the alphabet are recognized in the image, regardless of whether any of the result fields contain the characters.
| Field | Type | Description |
|---|---|---|
| en | bool | English |
| hi | bool | Hindi Devanagari |
| zh | bool | Chinese |
| ja | bool | Kana Hiragana or Katakana |
| ko | bool | Hangul Jamo |
| ru | bool | Russian |
| th | bool | Thai |
AlphabetsAllowed object
{
"src": "data:image/jpeg;base64,...",
"formats": ["text"],
"alphabets_allowed": {
"hi": false,
"zh": false,
"ja": false,
"ko": false,
"ru": false,
"th": false
}
}
In some cases it is not easy to infer the correct alphabet for a single letter,
because letters from different alphabets can look alike. One example is the conflict
between Latin B and Cyrillic В (which corresponds to Latin V). Although they are displayed almost identically,
they have different Unicode code points. The alphabets_allowed option takes a map from alphabet keys to boolean values
and can be used to prevent symbols from unwanted alphabets from appearing in the result. Valid map keys are
the values in the Field column of the table in the DetectedAlphabet object section (e.g. hi or ru). By
default all alphabets are allowed in the output; to disable an alphabet, specify "alphabets_allowed": {"alphabet_key": false}.
Specifying "alphabets_allowed": {"alphabet_key": true} has the same effect as omitting that alphabet from the
alphabets_allowed map.
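Since alphabets are disabled individually, a small helper can build the map from the set of alphabets you do want. The sketch below assumes the seven keys from the DetectedAlphabet table are the full set of valid keys; the name `allow_only` is hypothetical.

```python
# Alphabet keys from the DetectedAlphabet table (assumed to be the full set).
KNOWN_ALPHABETS = ("en", "hi", "zh", "ja", "ko", "ru", "th")


def allow_only(*allowed):
    """Return an alphabets_allowed map that disables every other alphabet."""
    unknown = set(allowed) - set(KNOWN_ALPHABETS)
    if unknown:
        raise ValueError("unknown alphabet keys: %s" % sorted(unknown))
    # Allowed alphabets are simply omitted, which is equivalent to true.
    return {code: False for code in KNOWN_ALPHABETS if code not in allowed}


request = {
    "src": "data:image/jpeg;base64,...",
    "formats": ["text"],
    "alphabets_allowed": allow_only("en"),
}
```

Calling `allow_only("en")` reproduces the alphabets_allowed map of the example request in this section.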
LineData object
Example request:
{
"src": "https://mathpix.com/examples/text_with_diagram.png",
"formats": ["text"],
"include_line_data": true
}
JSON response:
{
"confidence": 0.651358435330524,
"confidence_rate": 0.651358435330524,
"text": "Equivalent resistance between points \\( \\mathrm{A} \\& \\mathrm{B} \\) in the adjacent circuit is",
"line_data": [
{
"type": "text",
"cnt": [
[
859,
81
],
[
739,
91
],
[
626,
91
],
[
-2,
66
],
[
0,
34
],
[
739,
52
],
[
859,
63
]
],
"included": true,
"text": "Equivalent resistance between points \\( \\mathrm{A} \\& \\mathrm{B} \\) in the adjacent circuit is",
"after_hyphen": false,
"confidence": 0.651358435330524,
"confidence_rate": 0.9948483133235457
},
{
"type": "diagram",
"cnt": [
[
654,
244
],
[
651,
683
],
[
7,
678
],
[
11,
238
]
],
"included": false,
"error_id": "image_not_supported"
}
]
}
| Field | Type | Description |
|---|---|---|
| type | string | One of text, math, table, diagram, equation_number |
| cnt | [[x,y]] | Contour for line expressed as list of (x,y) pixel coordinate pairs |
| included | bool | Whether this line is included in the top level OCR result |
| error_id (optional) | string | Error ID, reason why the line is not included in final result |
| text (optional) | string | Text (Mathpix Markdown) for line |
| confidence (optional) | number in [0,1] | Estimated probability 100% correct |
| confidence_rate (optional) | number in [0,1] | Estimated confidence of input quality |
| after_hyphen (optional) | bool | specifies if the current line occurs after a text line that ended with a hyphen |
| html (optional) | string | Annotated HTML output for the line |
| data (optional) | [Data] | List of Data objects |
The v3/text endpoint allows customers to request line-by-line data by adding an include_line_data request parameter
to the request. When this parameter is true, the response object includes a line_data field,
which is a list of LineData objects containing information about all textual line elements detected in the image.
Concatenating the information from the response's line_data is enough to recreate the top-level
text, html, and data fields included in the response JSON.
Some lines, such as diagrams, are not supported by the OCR engine and are therefore skipped. Some lines contain content that is most likely extraneous, like equation numbers. Additionally, sometimes the OCR engine cannot recognize the line with sufficient confidence. In all of these cases the included field is set to false, since the line is not part of the final result.
The following error_id values can occur here:
- image_not_supported - the OCR engine doesn't accept this type of line
- image_max_size - the line is larger than the maximum size the OCR engine supports
- math_confidence - the OCR engine failed to confidently recognize the content of the line
- image_no_content - the line has strange spatial dimensions, e.g. the height of the line is zero; this error is very unlikely to happen
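The included flag and error_id can be used to separate recognized lines from skipped ones, as in this sketch. The function name `split_lines` is an assumption, and the sample is a trimmed version of the JSON response shown above (contours omitted, text shortened).

```python
def split_lines(response):
    """Split line_data into included texts and (type, error_id) of skipped lines."""
    included, skipped = [], []
    for line in response.get("line_data", []):
        if line.get("included"):
            included.append(line.get("text", ""))
        else:
            skipped.append((line["type"], line.get("error_id")))
    return included, skipped


# Trimmed from the example response above.
response = {
    "line_data": [
        {"type": "text", "included": True,
         "text": "Equivalent resistance between points"},
        {"type": "diagram", "included": False,
         "error_id": "image_not_supported"},
    ]
}
included, skipped = split_lines(response)
```

Joining the included texts with newline characters gives a line-by-line view of what ended up in the top-level text field.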
Process strokes (v3/strokes)
Mathpix supports handwriting recognition for stroke coordinates.
The v3/strokes endpoint is in beta. It transcribes handwritten strokes into text and math.
This endpoint is convenient for users who were generating images of handwritten math and text and then sending them to v3/text: with v3/strokes no image generation is required, and the request payload is smaller, which results in faster response times.
The LaTeX of the recognized handwriting is returned inside inline delimiters \( ... \) and block mode delimiters \[ ... \]. Lines are separated with \n newline characters. In some cases (e.g. multiple choice equations) we will try to flatten horizontally aligned content into different lines in order to keep the markup simple.
Request parameters
Send some strokes:
{
"strokes": {"strokes": {
"x": [[131,131,130,130,131,133,136,146,151,158,161,162,162,162,162,159,155,147,142,137,136,138,143,160,171,190,197,202,202,202,201,194,189,177,170,158,153,150,148],[231,231,233,235,239,248,252,260,264,273,277,280,282,283],[273,272,271,270,267,262,257,249,243,240,237,235,234,234,233,233],[296,296,297,299,300,301,301,302,303,304,305,306,306,305,304,298,294,286,283,281,281,282,284,284,285,287,290,293,294,299,301,308,309,314,315,316]],
"y": [[213,213,212,211,210,208,207,206,206,209,212,217,220,227,230,234,236,238,239,239,239,239,239,239,241,247,252,259,261,264,266,269,270,271,271,271,270,269,268],[231,231,232,235,238,246,249,257,261,267,270,272,273,274],[230,230,230,231,234,240,246,258,268,273,277,281,281,283,283,284],[192,192,191,189,188,187,187,187,188,188,190,193,195,198,200,205,208,213,215,215,215,214,214,214,214,216,218,220,221,223,223,223,223,221,221,220]]
}}
}
#!/usr/bin/env python
import sys
import base64
import requests
import json
# put input strokes here
strokes_string = '{"strokes": {\
"x": [[131,131,130,130,131,133,136,146,151,158,161,162,162,162,162,159,155,147,142,137,136,138,143,160,171,190,197,202,202,202,201,194,189,177,170,158,153,150,148],[231,231,233,235,239,248,252,260,264,273,277,280,282,283],[273,272,271,270,267,262,257,249,243,240,237,235,234,234,233,233],[296,296,297,299,300,301,301,302,303,304,305,306,306,305,304,298,294,286,283,281,281,282,284,284,285,287,290,293,294,299,301,308,309,314,315,316]],\
"y": [[213,213,212,211,210,208,207,206,206,209,212,217,220,227,230,234,236,238,239,239,239,239,239,239,241,247,252,259,261,264,266,269,270,271,271,271,270,269,268],[231,231,232,235,238,246,249,257,261,267,270,272,273,274],[230,230,230,231,234,240,246,258,268,273,277,281,281,283,283,284],[192,192,191,189,188,187,187,187,188,188,190,193,195,198,200,205,208,213,215,215,215,214,214,214,214,216,218,220,221,223,223,223,223,221,221,220]]\
}}'
strokes = json.loads(strokes_string)
r = requests.post("https://api.mathpix.com/v3/strokes",
data=json.dumps({'strokes': strokes}),
headers={"app_id": "YOUR_APP_ID", "app_key": "YOUR_APP_KEY",
"Content-type": "application/json"})
print(json.dumps(json.loads(r.text), indent=4, sort_keys=True))
curl -X POST https://api.mathpix.com/v3/strokes \
-H 'app_id: YOUR_APP_ID' \
-H 'app_key: YOUR_APP_KEY' \
-H 'Content-Type: application/json' \
--data '{ "strokes": {"strokes": {
"x": [[131,131,130,130,131,133,136,146,151,158,161,162,162,162,162,159,155,147,142,137,136,138,143,160,171,190,197,202,202,202,201,194,189,177,170,158,153,150,148],[231,231,233,235,239,248,252,260,264,273,277,280,282,283],[273,272,271,270,267,262,257,249,243,240,237,235,234,234,233,233],[296,296,297,299,300,301,301,302,303,304,305,306,306,305,304,298,294,286,283,281,281,282,284,284,285,287,290,293,294,299,301,308,309,314,315,316]],
"y": [[213,213,212,211,210,208,207,206,206,209,212,217,220,227,230,234,236,238,239,239,239,239,239,239,241,247,252,259,261,264,266,269,270,271,271,271,270,269,268],[231,231,232,235,238,246,249,257,261,267,270,272,273,274],[230,230,230,231,234,240,246,258,268,273,277,281,281,283,283,284],[192,192,191,189,188,187,187,187,188,188,190,193,195,198,200,205,208,213,215,215,215,214,214,214,214,216,218,220,221,223,223,223,223,221,221,220]]
}}}'
POST https://api.mathpix.com/v3/strokes
| Parameter | Type | Description |
|---|---|---|
| strokes | JSON | Strokes in JSON with appropriate format. |
| metadata (optional) | object | Key value object |
| formats (optional) | [string] | List of formats, one of text, data, html |
| data_options (optional) | object | see "Data options" section above, specifies outputs for data and html return fields |
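Drawing surfaces often record each stroke as a sequence of (x, y) points, while the example requests above send separate x and y arrays per stroke. The sketch below converts between the two. The input layout and the name `points_to_strokes` are assumptions for this example; the nested "strokes" keys mirror the example request body.

```python
def points_to_strokes(strokes_as_points):
    """Convert [[(x, y), ...], ...] into the x/y array layout of the examples."""
    xs = [[x for x, _ in stroke] for stroke in strokes_as_points]
    ys = [[y for _, y in stroke] for stroke in strokes_as_points]
    return {"strokes": {"strokes": {"x": xs, "y": ys}}}


# First few points of the first two strokes from the example request.
body = points_to_strokes([
    [(131, 213), (131, 213), (130, 212)],
    [(231, 231), (231, 231), (233, 232)],
])
```

The returned dictionary can be serialized with `json.dumps` and posted to v3/strokes.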
Result objects
Get an API response:
{
"text": "\\( 3 x^{2} \\)",
"confidence": 0.9999953508431929,
"confidence_rate": 0.9999953508431929
}
| Field | Type | Description |
|---|---|---|
| text (optional) | string | Recognized text format, if such is found |
| confidence (optional) | number in [0,1] | Estimated probability 100% correct |
| confidence_rate (optional) | number in [0,1] | Estimated confidence of input quality |
| data (optional) | [object] | List of data objects (see "Data object" section above) |
| html (optional) | string | Annotated HTML output |
Process batch (v3/batch)
A batch request is made with JSON that looks like:
{
"urls":{
"inverted":"https://raw.githubusercontent.com/Mathpix/api-examples/master/images/inverted.jpg",
"algebra":"https://raw.githubusercontent.com/Mathpix/api-examples/master/images/algebra.jpg"
},
"formats": ["latex_simplified"]
}
curl -X POST https://api.mathpix.com/v3/batch \
-H "app_id: YOUR_APP_ID" \
-H "app_key: YOUR_APP_KEY" \
-H "Content-Type: application/json" \
--data '{ "urls": {"inverted": "https://raw.githubusercontent.com/Mathpix/api-examples/master/images/inverted.jpg", "algebra": "https://raw.githubusercontent.com/Mathpix/api-examples/master/images/algebra.jpg"},"formats":["latex_simplified"] }'
import requests
import json
base_url = 'https://raw.githubusercontent.com/Mathpix/api-examples/master/images/'
data = {
'urls': {
'algebra': base_url + 'algebra.jpg',
'inverted': base_url + 'inverted.jpg'
},
'formats': ['latex_simplified']
}
r = requests.post(
"https://api.mathpix.com/v3/batch", data=json.dumps(data),
headers={
'app_id': 'YOUR_APP_ID',
'app_key': 'YOUR_APP_KEY',
'content-type': 'application/json'
},
timeout=30
)
reply = json.loads(r.text)
assert 'batch_id' in reply
The response to the batch POST contains a single field, batch_id, whose value is a positive integer (returned as a string in this example):
{
"batch_id": "17"
}
The response to GET /v3/batch/17 is below when the batch has completed. Before completion the "results" field may be empty or contain only one of the two results.
{
"keys": ["algebra", "inverted"],
"results": {
"algebra": {
"detection_list": [],
"detection_map": {
"contains_chart": 0,
"contains_diagram": 0,
"contains_geometry": 0,
"contains_graph": 0,
"contains_table": 0,
"is_inverted": 0,
"is_not_math": 0,
"is_printed": 0
},
"latex_simplified": "12 + 5 x - 8 = 12 x - 10",
"latex_confidence": 0.99640350138238,
"position": {
"height": 208,
"top_left_x": 0,
"top_left_y": 0,
"width": 1380
}
},
"inverted": {
"detection_list": [
"is_inverted",
"is_printed"
],
"detection_map": {
"contains_chart": 0,
"contains_diagram": 0,
"contains_geometry": 0,
"contains_graph": 0,
"contains_table": 0,
"is_inverted": 1,
"is_not_math": 0,
"is_printed": 1
},
"latex_simplified": "x ^ { 2 } + y ^ { 2 } = 9",
"latex_confidence": 0.99982263230866,
"position": {
"height": 170,
"top_left_x": 48,
"top_left_y": 85,
"width": 544
}
}
}
}
The Mathpix API supports processing multiple images
in a single POST request to a different endpoint: /v3/batch.
The request body may contain all the /v3/latex parameters except src and
must contain a urls parameter. The request may contain an additional callback
parameter to receive results after all the images
in the batch have been processed.
| Parameter | Type | Description |
|---|---|---|
| urls | object | key-value for each image in the batch where the value may be a string url or an object containing a url and image-specific request arguments such as region and formats. |
| callback (optional) | object | description of where to send the batch results |
| callback.post | string | url to post results |
| callback.reply (optional) | object | data to send in reply to batch POST |
| callback.body (optional) | object | data to send with results to callback.post |
| callback.headers (optional) | object | headers to use when posting results |
The response contains only a unique batch_id value.
Even if the request includes a callback, there is no guarantee that
the callback will run successfully (because of a transient network failure,
for example). The preferred approach is to wait an appropriate length of time
(about one second for every five images in the batch) and then
do a GET on /v3/batch/:id, where :id is the batch_id value.
The GET request must contain the same app_id and app_key headers
as the POST to /v3/batch.
The GET response has the following fields:
| Field | Type | Description |
|---|---|---|
| keys | string[] | all the url keys present in the originating batch request |
| results | object | an OCR result for each key that has been processed |
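The wait-then-GET approach described above could be sketched as a small polling loop. Several things here are assumptions rather than documented behavior: `fetch` is a caller-supplied function that performs the GET on /v3/batch/:id and returns the parsed JSON, the one-second floor and `max_attempts` are arbitrary choices, and a batch is treated as complete once every entry in keys has an entry in results.

```python
import math
import time


def suggested_wait_seconds(image_count):
    """About one second for every five images, with a floor of one second."""
    return max(1, math.ceil(image_count / 5))


def poll_batch(fetch, batch_id, image_count, max_attempts=5, sleep=time.sleep):
    """Poll until every key in the batch has a result, or give up.

    fetch(batch_id) must return the parsed JSON of GET /v3/batch/:id.
    """
    delay = suggested_wait_seconds(image_count)
    for _ in range(max_attempts):
        sleep(delay)
        reply = fetch(batch_id)
        keys = reply.get("keys", [])
        results = reply.get("results", {})
        if keys and all(key in results for key in keys):
            return reply
    raise TimeoutError(
        "batch %s did not complete after %d attempts" % (batch_id, max_attempts)
    )
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.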
Legacy endpoint (v3/latex)
This is a legacy endpoint. We recommend using v3/text or v3/strokes instead.
Mathpix supports image recognition for jpg and png images. Images are encoded by base64 and sent inside JSON requests.
Request parameters
You can request multiple formats for a single image:
{
"src": "data:image/jpeg;base64,...",
"ocr": ["math", "text"],
"skip_recrop": true,
"formats": [
"text",
"latex_simplified",
"latex_styled",
"mathml",
"asciimath",
"latex_list"
]
}
#!/usr/bin/env python
import sys
import base64
import requests
import json
# put desired file path here
file_path = 'limit.jpg'
image_uri = "data:image/jpeg;base64," + base64.b64encode(open(file_path, "rb").read()).decode()
r = requests.post("https://api.mathpix.com/v3/latex",
data=json.dumps({'src': image_uri, 'formats': ['latex_normal']}),
headers={"app_id": "YOUR_APP_ID", "app_key": "YOUR_APP_KEY",
"Content-type": "application/json"})
print(json.dumps(json.loads(r.text), indent=4, sort_keys=True))
curl -X POST https://api.mathpix.com/v3/latex \
-H 'app_id: YOUR_APP_ID' \
-H 'app_key: YOUR_APP_KEY' \
-H 'Content-Type: application/json' \
--data '{ "src": "data:image/jpeg;base64,'$(base64 -i limit.jpg)'", "formats": ["latex_normal"] }'
POST https://api.mathpix.com/v3/latex
| Parameter | Type | Description |
|---|---|---|
| src | string | Image data, or public URL where image is located |
| formats | string[] | String postprocessing formats (see Formatting section) |
| ocr (optional) | string[] | Process only math ["math"] or both math and text ["math", "text"] |
| format_options (optional) | object | Options for specific formats (see Formatting section) |
| skip_recrop (optional) | bool | Force algorithm to consider whole image |
| confidence_threshold (optional) | number in [0,1] | Set threshold for triggering confidence errors |
| beam_size (optional) | number in [1,5] | Number of results to consider during recognition |
| n_best (optional) | number in [1,beam_size] | Number of highest-confidence results to return |
| region (optional) | object | Specify the image area with the pixel coordinates top_left_x, top_left_y, width, and height |
| callback (optional) | object | Callback request object |
| metadata (optional) | object | Key value object |
| include_detected_alphabets (optional) | bool | Return detected alphabets |
Formatting
The following formats can be used in the request:
| Format | Description |
|---|---|
| text | text mode output, with math inside delimiters, eg. test \(x^2\), inline math by default |
| text_display | same as text, except uses block mode math instead of inline mode when in doubt |
| latex_normal | direct LaTeX representation of the input |
| latex_styled | modified output to improve the visual appearance such as adding '\left' and '\right' around parenthesized expressions that contain tall expressions like subscript or superscript |
| latex_simplified | modified output for symbolic processing such as shortening operator names, replacing long division with a fraction, and converting a column of operands into a single formula |
| latex_list | output split into a list of simplified strings to help process multiple equations |
| mathml | the MathML for the recognized math |
| asciimath | the AsciiMath for the recognized math |
| wolfram | a string compatible with the Wolfram Alpha engine |
Format options
To return a more compact latex_styled result, one could send the following request:
{
"src":"data:image/jpeg;base64,...",
"ocr": ["math", "text"],
"skip_recrop": true,
"formats": [
"text",
"latex_simplified",
"latex_styled",
"mathml",
"asciimath",
"latex_list"
],
"format_options": {
"latex_styled": {"transforms": ["rm_spaces"]}
}
}
The result for "latex_styled" would now be
"\\lim_{x \\rightarrow 3}\\left(\\frac{x^{2}+9}{x-3}\\right)"
instead of
"\\lim _ { x \\rightarrow 3 } \\left( \\frac { x ^ { 2 } + 9 } { x - 3 } \\right)"
The optional format_options request parameter allows a request to customize the LaTeX result formats using an object with a format as the property name and the options for that format as the value. The options value may specify the following properties:
| Option | Type | Description |
|---|---|---|
| transforms | string[] | array of transformation names |
| math_delims | [string, string] | [begin, end] delimiters for math mode, for example \( and \) |
| displaymath_delims | [string, string] | [begin, end] delimiters for displaymath mode, for example \[ and \] |
The currently-supported transforms are:
| Transform | Description |
|---|---|
| rm_spaces | omit spaces around LaTeX groups and other places where spaces are superfluous |
| rm_newlines | uses spaces instead of newlines between text lines in paragraphs |
| rm_fonts | omit mathbb, mathbf, mathcal, and mathrm commands |
| rm_style_syms | replace styled commands with unstyled versions, e.g., bigoplus becomes oplus |
| rm_text | omit text to the left or right of math |
| long_frac | convert longdiv to frac |
Note that rm_fonts and rm_style_syms are implicit in latex_normal, latex_simplified, and latex_list. The long_frac transformation is implicit in latex_simplified and latex_list.
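A format_options object can be built programmatically, for instance with a check against the transform names listed above. This is only a sketch: the helper name `format_options_for` and the client-side validation are assumptions, not something the API requires.

```python
# Transform names from the table above.
SUPPORTED_TRANSFORMS = {
    "rm_spaces", "rm_newlines", "rm_fonts",
    "rm_style_syms", "rm_text", "long_frac",
}


def format_options_for(fmt, transforms=(), math_delims=None,
                       displaymath_delims=None):
    """Build a format_options object for a single result format."""
    unknown = set(transforms) - SUPPORTED_TRANSFORMS
    if unknown:
        raise ValueError("unsupported transforms: %s" % sorted(unknown))
    options = {}
    if transforms:
        options["transforms"] = list(transforms)
    if math_delims:
        options["math_delims"] = list(math_delims)
    if displaymath_delims:
        options["displaymath_delims"] = list(displaymath_delims)
    return {fmt: options}


format_options = format_options_for("latex_styled", transforms=["rm_spaces"])
```

The value of `format_options` here matches the format_options field of the example request above.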
Result objects
{
"detection_list": [],
"detection_map": {
"contains_chart": 0,
"contains_diagram": 0,
"contains_geometry": 0,
"contains_graph": 0,
"contains_table": 0,
"is_inverted": 0,
"is_not_math": 0,
"is_printed": 0
},
"latex_normal": "\\lim _ { x \\rightarrow 3 } ( \\frac { x ^ { 2 } + 9 } { x - 3 } )",
"latex_confidence": 0.86757309488734,
"latex_confidence_rate": 0.9875550770759583,
"position": {
"height": 273,
"top_left_x": 57,
"top_left_y": 14,
"width": 605
}
}
| Field | Type | Description |
|---|---|---|
| text (optional) | string | Recognized text format |
| text_display (optional) | string | Recognized text_display format |
| latex_normal (optional) | string | Recognized latex_normal format |
| latex_simplified (optional) | string | Recognized latex_simplified format |
| latex_styled (optional) | string | Recognized latex_styled format |
| latex_list (optional) | string[] | Recognized latex_list format |
| mathml (optional) | string | Recognized MathML format |
| asciimath (optional) | string | Recognized AsciiMath format |
| wolfram (optional) | string | Recognized Wolfram format |
| position (optional) | object | Position object, pixel coordinates |
| detection_list (optional) | string[] | Detects image properties (see image properties) |
| error (optional) | string | US locale error message |
| error_info (optional) | object | Error info object |
| latex_confidence (optional) | number in [0,1] | Estimated probability 100% correct |
| latex_confidence_rate (optional) | number in [0,1] | Estimated confidence of input quality |
| candidates (optional) | object[] | n_best results |
| detected_alphabets (optional) | [object] | DetectedAlphabet object |
The detected_alphabets result object contains a field that is true or false for each known alphabet. The field is true if any characters from the alphabet are recognized in the image, regardless of whether any of the result fields contain the characters.
| Field | Type | Description |
|---|---|---|
| en | bool | English |
| hi | bool | Hindi Devanagari |
| zh | bool | Chinese |
| ja | bool | Kana Hiragana or Katakana |
| ko | bool | Hangul Jamo |
| ru | bool | Russian |
| th | bool | Thai |
Image properties
The API defines multiple detection types:
| Detection | Definition |
|---|---|
| contains_diagram | Contains a diagram. |
| is_printed | The image is taken of printed math, not handwritten math. |
| is_not_math | No valid equation was detected. |
Error info object
In addition to the error field that contains a string (en-us locale)
describing an error, a Mathpix response has an error_info field
with an object providing programmatic information.
The fields of this object include id, uniquely specifying the error,
message, containing a string similar to the top-level error field, and
detail fields specific to the type of error. The table below lists
the different errors Mathpix returns.
Error info fields:
| Field | Type | Description |
|---|---|---|
| id | string | specifies the error id (see below) |
| message | string | error message |
| detail (optional) | string | Additional error info |
Error id types
Here is a table with all the error_id possibilities:
| Id | Description | Detail fields | HTTP Status |
|---|---|---|---|
| http_unauthorized | Invalid credentials | | 401 |
| http_max_requests | Too many requests | count | 429 |
| json_syntax | JSON syntax error | | 200 |
| image_missing | Missing URL in request body | | 200 |
| image_download_error | Error downloading image | url | 200 |
| image_decode_error | Cannot decode the image data | | 200 |
| image_no_content | No content found in image | | 200 |
| image_not_supported | Image is not math or text | | 200 |
| image_max_size | Image is too large to process | | 200 |
| strokes_missing | Missing strokes in request body | | 200 |
| strokes_syntax_error | Incorrect JSON or strokes format | | 200 |
| strokes_no_content | No content found in strokes | | 200 |
| opts_bad_callback | Bad callback field(s) | post?, reply?, batch_id? | 200 |
| opts_unknown_ocr | Unknown ocr option(s) | ocr | 200 |
| opts_unknown_format | Unknown format option(s) | formats | 200 |
| opts_number_required | Option must be a number | name,value | 200 |
| opts_value_out_of_range | Value not in accepted range | name,value | 200 |
| math_confidence | Low confidence | | 200 |
| math_syntax | Unrecognized math | | 200 |
| batch_unknown_id | Unknown batch id | batch_id | 200 |
| sys_exception | Server error | batch_id? | 200 |
| sys_request_too_large | Max request size is 5mb for images and 512kb for strokes | | 200 |
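Because most errors in the table are returned with HTTP status 200, a client cannot rely on the status code alone and needs to inspect the response body. A minimal sketch of that check follows; the `MathpixError` class, the `raise_for_error` helper, and the sample error message are all made up for this example.

```python
class MathpixError(Exception):
    """Hypothetical exception type carrying the error_info fields."""

    def __init__(self, error_id, message, detail=None):
        super().__init__("%s: %s" % (error_id, message))
        self.error_id = error_id
        self.message = message
        self.detail = detail


def raise_for_error(body):
    """Raise MathpixError if a parsed response body reports an error."""
    info = body.get("error_info")
    if info:
        raise MathpixError(
            info.get("id"),
            info.get("message", body.get("error", "")),
            info.get("detail"),
        )
    if body.get("error"):  # an empty error string means no error
        raise MathpixError(None, body["error"])
    return body


# Hypothetical error body, shaped after the fields described above.
failed = {
    "error": "Missing URL in request body",
    "error_info": {"id": "image_missing",
                   "message": "Missing URL in request body"},
}
try:
    raise_for_error(failed)
except MathpixError as exc:
    caught_id = exc.error_id
```

Branching on `error_id` rather than on the message text keeps client logic stable if the wording of messages changes.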
Callback request object
Full request body example (includes callback field, shortened src field):
{
"src":"data:image/jpeg;base64,...",
"ocr": ["math", "text"],
"formats": ["text", "latex_styled"],
"skip_recrop": true,
"callback":{
"post":"http://RequestBin.com/10z3g561",
"reply":"integral.jpg"
}
}
| Field | Type | Description |
|---|---|---|
| post | string | URL to which to make POST callback |
| headers | object | key value pairs of headers to make POST |
| reply (optional) | string | Sets values of reply field of callback response object (see callback response object) |
Callback response object
{
"reply": "integral.jpg",
"result": {
"detection_list": [
"is_printed"
],
"detection_map": {
"contains_chart": 0,
"contains_diagram": 0,
"contains_geometry": 0,
"contains_graph": 0,
"contains_table": 0,
"is_inverted": 0,
"is_not_math": 0,
"is_printed": 1
},
"error": "",
"latex": "\\int \\frac { 4 x } { \\sqrt { x ^ { 2 } + 1 } } d x",
"latex_confidence": 0.99817252453161,
"position": {
"height": 215,
"top_left_x": 57,
"top_left_y": 0,
"width": 605
}
}
}
| Field | Type | Description |
|---|---|---|
| reply | string | Request identifier |
| result | object | Result object |
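On the receiving side, the reply value lets a callback be matched to the request that triggered it. The sketch below only shows parsing the posted body; the function name `handle_callback` and the in-memory dictionary are assumptions, and the web framework that would receive the POST is left out.

```python
import json


def handle_callback(raw_body, results_by_reply):
    """Store the result of a callback POST under its reply identifier."""
    payload = json.loads(raw_body)
    results_by_reply[payload["reply"]] = payload["result"]
    return payload["reply"]


# Shortened version of the callback response object shown above.
store = {}
raw = '{"reply": "integral.jpg", "result": {"latex_confidence": 0.99817252453161}}'
reply_id = handle_callback(raw, store)
```

Using a distinct reply value per request, such as the file name here, is what makes this lookup unambiguous.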
Get OCR results (v3/ocr-results)
Mathpix allows customers to search their results from posts to /v3/text, /v3/strokes, and /v3/latex with a GET request on /v3/ocr-results?search-parameters. The search request must contain a valid app_key header to identify the group owning the results to search. Requests with the metadata improve_mathpix field set to false will not appear in the search results.
Note that this endpoint will only work with API keys created after July 5th, 2020.
Also note that this endpoint will not return OCR results for requests made with the privacy option enabled.
Search parameters
GET https://api.mathpix.com/v3/ocr-results
curl -X GET -H 'app_key: YOUR_APP_KEY' \
'https://api.mathpix.com/v3/ocr-results?per_page=100&from_date=2020-06-26T03%3A08%3A22.827Z'
| Search parameter | Type | Description |
|---|---|---|
| page (default=1) | number | First page of results to return |
| per_page (default=100) | number | Number of results to return |
| from_date (optional) | string | starting (included) ISO datetime |
| to_date (optional) | string | ending (excluded) ISO datetime |
| app_id (optional) | string | results for the given app_id |
| text (optional) | string | result.text contains the given string |
| text_display (optional) | string | result.text_display contains the given string |
| latex_styled (optional) | string | result.latex_styled contains the given string |
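The query string needs URL encoding, since ISO datetimes contain colons. A small sketch using only the Python standard library; the helper name `build_search_url` is hypothetical, while the endpoint and parameter names are taken from the table above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.mathpix.com/v3/ocr-results"


def build_search_url(**params) -> str:
    """Build a search URL, dropping parameters left as None."""
    query = {k: v for k, v in params.items() if v is not None}
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL


# Reproduces the URL used in the curl example above.
url = build_search_url(per_page=100, from_date="2020-06-26T03:08:22.827Z")
print(url)
```

The resulting URL can then be requested with the app_key header, as in the curl example, using any HTTP client.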
Search OCR results
{
"ocr_results": [
{
"timestamp": "2020-06-26T03:08:23.827Z",
"duration": 0.346,
"request_args": {
"formats": [
"text"
]
},
"result": {
"request_id": "53597c096d7d418e7040072047d7ba25",
"confidence": 1,
"confidence_rate": 1,
"text": "\\( 12+5 x-8=12 x-10 \\)"
}
}
]
}
| Field | Type | Description |
|---|---|---|
| timestamp | string | ISO timestamp of recorded result information |
| duration | number | difference between timestamp and when request was received |
| request_args | object | Request body arguments |
| result | object | Result body for request |
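As an illustration of reading this structure, the following sketch flattens a search response into one tuple per result. The function name `summarize_results` is hypothetical, and the sample dictionary is a trimmed copy of the response shown above.

```python
def summarize_results(response: dict) -> list:
    """Return (timestamp, duration, text) for each entry in ocr_results."""
    rows = []
    for entry in response.get("ocr_results", []):
        result = entry.get("result", {})
        # text is only present when the original request asked for it
        rows.append((entry.get("timestamp"), entry.get("duration"), result.get("text")))
    return rows


response = {
    "ocr_results": [
        {
            "timestamp": "2020-06-26T03:08:23.827Z",
            "duration": 0.346,
            "request_args": {"formats": ["text"]},
            "result": {
                "request_id": "53597c096d7d418e7040072047d7ba25",
                "confidence": 1,
                "confidence_rate": 1,
                "text": "\\( 12+5 x-8=12 x-10 \\)",
            },
        }
    ]
}

for row in summarize_results(response):
    print(row)
```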
Privacy
Example of a request with the extra privacy setting:
{
"src":"data:image/jpeg;base64,...",
"metadata":{
"improve_mathpix": false
}
}
By default we make images accessible to our QA team so that we can make improvements.
We also provide an extra privacy option which ensures that no image data or derived information is ever persisted to disk, and no data is available to Mathpix's QA team (we still track the request and how long the request took to complete). Simply add a metadata object to the main request body with the improve_mathpix field set to false (by default it is true). Note that this option means that images and results will not be accessible via our dashboard (accounts.mathpix.com/ocr-api).
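A short sketch of switching the option on for an existing request body. The helper name `with_privacy` is hypothetical; the `metadata` and `improve_mathpix` field names are the ones described above.

```python
import copy


def with_privacy(body: dict) -> dict:
    """Return a copy of a request body with improve_mathpix set to false."""
    private = copy.deepcopy(body)  # leave the caller's dictionary untouched
    private.setdefault("metadata", {})["improve_mathpix"] = False
    return private


request_body = with_privacy({"src": "data:image/jpeg;base64,..."})
print(request_body)
```

Any other keys already present under `metadata` are preserved.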
Long division
Response for image on the left side:
{
"detection_map": {
"contains_chart": 0,
"contains_diagram": 0,
"contains_geometry": 0,
"contains_graph": 0,
"contains_table": 0,
"is_inverted": 0,
"is_not_math": 0,
"is_printed": 1
},
"latex_normal": "8 \\longdiv { 7200 }"
}
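We use the special markup \longdiv to represent long division; it is the only non-standard LaTeX markup we return. \longdiv is used much like \sqrt, which is visually similar: the dividend is its braced argument and the divisor is written before it, as in 8 \longdiv { 7200 }.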
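Because standard LaTeX renderers do not know \longdiv, it may need rewriting before display. The sketch below is one possible approach, not part of the API: the function name `replace_longdiv` is hypothetical, and the default replacement `\enclose{longdiv}{...}` assumes a renderer that supports it, such as MathJax with its enclose extension. Pass a different template for other renderers.

```python
def replace_longdiv(latex: str, template: str = r"\enclose{longdiv}{%s}") -> str:
    """Rewrite each \\longdiv { ... } using template; %s receives the dividend."""
    marker = r"\longdiv"
    out = []
    i = 0
    while True:
        j = latex.find(marker, i)
        if j == -1:
            out.append(latex[i:])
            break
        out.append(latex[i:j])
        k = j + len(marker)
        while k < len(latex) and latex[k].isspace():
            k += 1  # skip whitespace between the command and its argument
        if k >= len(latex) or latex[k] != "{":
            out.append(marker)  # no braced argument: leave the command as-is
            i = j + len(marker)
            continue
        depth, m, end = 0, k, None
        while m < len(latex):
            c = latex[m]
            if c == "\\":
                m += 2  # skip a command or escaped brace such as \{
                continue
            if c == "{":
                depth += 1
            elif c == "}":
                depth -= 1
                if depth == 0:
                    end = m
                    break
            m += 1
        if end is None:
            out.append(latex[j:])  # unbalanced braces: keep the rest unchanged
            break
        out.append(template % latex[k + 1:end].strip())
        i = end + 1
    return "".join(out)


print(replace_longdiv(r"8 \longdiv { 7200 }"))
```

Brace matching is done by hand rather than with a regular expression so that nested arguments such as fractions inside the dividend are handled.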

Latency considerations
The biggest source of latency is image upload. Response time from the Mathpix API servers is roughly proportional to the size of the image, so try to keep images under 100 KB for maximum speed. JPEG compression and image downsizing are recommended to keep latency as low as possible.
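A client can check image size before uploading. The following sketch builds the base64 `src` value and returns a hint about whether the image should be compressed first. The name `image_to_src`, the advice labels, and the reading of the limits as binary units are all assumptions; the thresholds themselves come from the 100 KB guideline above and the 5mb limit in the sys_request_too_large error.

```python
import base64

FAST_LIMIT = 100 * 1024               # 100 KB guideline, read as KiB (assumption)
MAX_IMAGE_REQUEST = 5 * 1024 * 1024   # 5mb request limit, read as MiB (assumption)


def image_to_src(image_bytes: bytes, mime: str = "image/jpeg") -> tuple:
    """Return (src, advice) where advice is 'ok', 'compress', or 'too_large'."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    src = f"data:{mime};base64,{encoded}"
    if len(src) > MAX_IMAGE_REQUEST:
        advice = "too_large"   # src alone already exceeds the request limit
    elif len(image_bytes) > FAST_LIMIT:
        advice = "compress"    # accepted, but slower than necessary
    else:
        advice = "ok"
    return src, advice


src, advice = image_to_src(b"\xff\xd8" + b"\x00" * 10)  # dummy bytes, not a real JPEG
print(advice, len(src))
```

Note that base64 encoding inflates the payload by about a third, which is why the upper check is made on the encoded string rather than the raw bytes.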
Vocabulary
Mathpix can generate any of the following characters and LaTeX commands:
| ! | " | $ | & | ' |
| ( | ) | * | + | , |
| - | . | / | 0 | 1 |
| 2 | 3 | 4 | 5 | 6 |
| 7 | 8 | 9 | : | ; |
| < | = | > | ? | A |
| B | C | D | E | F |
| G | H | I | J | K |
| L | M | N | O | P |
| Q | R | S | T | U |
| V | W | X | Y | Z |
| [ | \\ | ] | ^ | _ |
| a | b | c | d | e |
| f | g | h | i | j |
| k | l | m | n | o |
| p | q | r | s | t |
| u | v | w | x | y |
| z | { | | | } | ~ |
| \\% | \\\\ | \\{ | \\} | |
| \alpha | \angle | \langle | \rangle | \approx |
| \because | \begin{array} | \beta | \bot | \cap |
| \cdot | \cdots | \chi | \circ | \cong |
| \cup | \dagger | \Delta | \delta | \div |
| \dot | \dots | \ell | \emptyset | \eta |
| \end{array} | \epsilon | \equiv | \exists | \kappa |
| \forall | \frac | \gamma | \Gamma | \geq |
| \hat | \hbar | \hline | \in | \infty |
| \int | \Lambda | \lambda | \lceil | \left( |
| \left. | \left[ | \left\\{ | \left| | \Leftrightarrow |
| \leftrightarrow | \lfloor | \leq | \longdiv | \mathcal |
| \mp | \mu | \nabla | \neq | \notin |
| \oint | \Omega | \omega | \operatorna | \oplus |
| \overline | \otimes | \parallel | \partial | \perp |
| \Phi | \phi | \pi | \pm | \prime |
| \prod | \propto | \Psi | \psi | \qquad |
| \quad | \rceil | \rfloor | \rho | \right) |
| \right. | \right\} | \right] | \Rightarrow | \rightarrow |
| \right| | \sigma | \sim | \simeq | \sqrt |
| \square | \star | \sum | \subset | \supset |
| \subseteq | \supseteq | \tau | \text | \therefore |
| \Theta | \theta | \times | \tilde | |
| \varphi | \vdots | \vec | \wedge | \vee |
| \Xi | \xi | \zeta |
