Glossary of Mathematical OCR & Handwritten Transcription Terms

Precise definitions for handwritten math recognition concepts.

This glossary defines common terms used in handwritten math recognition, LaTeX transcription, and STEM document digitization.

Each definition is written for precision, not persuasion.

Mathematical OCR (Math OCR)

Definition:

Mathematical OCR is the process of converting images of handwritten or printed mathematical notation into machine-readable text.

Unlike generic OCR, mathematical OCR must interpret spatial relationships such as superscripts, subscripts, fractions, and matrices.

Generic OCR

Definition:

Generic OCR is optical character recognition designed for linear text such as paragraphs, forms, or printed documents.

Generic OCR often fails with mathematical notation because it does not account for spatial hierarchy.

Structure-Aware OCR

Definition:

Structure-aware OCR analyzes the spatial arrangement of symbols to determine mathematical meaning.

It understands that:

A smaller symbol raised above and to the right of another may be an exponent
A horizontal line may indicate a fraction
Grouped symbols may form matrices or systems

Axiom operates as a structure-aware OCR system.

Spatial Parsing

Definition:

Spatial parsing is the analysis of the two-dimensional layout of handwritten symbols.

It determines how characters relate to one another based on position, alignment, and grouping rather than reading order alone.
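
As an illustration only, the sketch below classifies how one symbol sits relative to another using bounding boxes. The `Box` type, the `relation` function, and the `shift` threshold are hypothetical names and values chosen for this example; they do not describe any particular system's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; y grows downward, as in image coordinates."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def cy(self) -> float:
        return self.y + self.h / 2

    @property
    def right(self) -> float:
        return self.x + self.w


def relation(base: Box, other: Box, shift: float = 0.25) -> str:
    """Classify how `other` sits relative to `base`.

    `shift` is an assumed threshold: the fraction of the base height by which
    the centre of `other` must move before it stops counting as inline.
    """
    dy = (other.cy - base.cy) / base.h  # negative means higher on the page
    overlaps_horizontally = other.x < base.right and other.right > base.x
    if overlaps_horizontally:
        return "above" if dy < 0 else "below"
    if other.x >= base.right:
        if dy < -shift:
            return "superscript"
        if dy > shift:
            return "subscript"
        return "inline"
    return "left"
```

With a base symbol at `Box(0, 10, 10, 20)`, a small box at `Box(11, 2, 5, 8)` is classified as a superscript and one at `Box(11, 26, 5, 8)` as a subscript. A fixed threshold is a simplification; handwriting varies enough that practical systems would likely need size and context cues as well.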

Mathematical Hierarchy

Definition:

Mathematical hierarchy refers to the nested structure of equations, such as fractions within fractions or integrals with limits.

Preserving hierarchy is essential for accurate LaTeX transcription.
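
Two small LaTeX examples, chosen here for illustration, show what nesting looks like in practice. In the first, flattening the expression to `1/1+1/x` would change its value under standard operator precedence, because the inner fraction would no longer belong to the denominator.

```latex
% A fraction nested inside a fraction
\[
  \frac{1}{1 + \frac{1}{x}}
\]

% An integral with limits, whose integrand carries a nested exponent
\[
  \int_{0}^{\infty} e^{-x^{2}} \, dx
\]
```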

LaTeX

Definition:

LaTeX is a plain-text typesetting system widely used in mathematics, physics, and engineering.

It allows precise representation of equations, symbols, and document structure.

Because LaTeX source is plain text, it is widely treated as a durable archival format for scientific writing.

Compile-Ready LaTeX

Definition:

Compile-ready LaTeX refers to LaTeX code that can be directly compiled without manual correction.

This requires correct syntax, balanced braces and environments, and valid mathematical structure.
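
One of these conditions, balanced braces, can be checked mechanically. The sketch below is a minimal example; the function name `braces_balanced` is hypothetical, and the check ignores `%` comments and verbatim text. Passing it is necessary but not sufficient: only an actual compiler run confirms that a document compiles.

```python
def braces_balanced(latex: str) -> bool:
    """Return True if every unescaped `{` has a matching `}` in the right order."""
    depth = 0
    i = 0
    while i < len(latex):
        ch = latex[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes, e.g. \{ or \\
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a closing brace appeared before its opener
                return False
        i += 1
    return depth == 0
```

For example, `braces_balanced(r"\frac{a}{b}")` returns `True`, while `braces_balanced(r"\frac{a}{b")` returns `False`.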

Markdown with MathJax

Definition:

Markdown with MathJax is a lightweight text format in which inline and block-level equations are written in LaTeX syntax and rendered by MathJax.

It is commonly used in tools like Obsidian, GitHub, and academic note-taking systems.
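
As a minimal sketch, the helper below wraps a LaTeX expression in the dollar-sign delimiters that Obsidian and GitHub accept for inline and block math. The function name `to_markdown_math` is hypothetical, and other renderers may be configured to expect different delimiters.

```python
def to_markdown_math(latex: str, block: bool = False) -> str:
    """Wrap a LaTeX expression in dollar delimiters for Markdown math."""
    body = latex.strip()
    if block:
        # Block-level math: delimiters on their own lines
        return f"$$\n{body}\n$$"
    # Inline math: single dollar signs with no inner padding
    return f"${body}$"
```

Calling `to_markdown_math("x^2")` produces `$x^2$`, which can sit inside a sentence; passing `block=True` places the expression between `$$` lines so it renders as a displayed equation.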

MathJax

Definition:

MathJax is a JavaScript library that renders LaTeX-style mathematics in web browsers.

It allows mathematical expressions written in plain text to display as formatted equations.

Vector PDF

Definition:

A vector PDF represents text and equations as scalable paths or text objects rather than images.

Vector PDFs stay sharp at any zoom level, and content stored as text objects remains searchable and selectable, unlike content in rasterized PDFs.

Raster PDF

Definition:

A raster PDF stores content as images.

Text inside raster PDFs cannot be searched or edited without OCR.

Handwritten Equation Recognition

Definition:

Handwritten equation recognition is the process of identifying mathematical symbols and their relationships from handwritten input.

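It differs from text OCR because mathematical notation is two-dimensional: meaning depends on the relative position and size of symbols, not only on their left-to-right order.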

Symbol Disambiguation

Definition:

Symbol disambiguation is the process of determining the correct meaning of visually similar symbols.

Examples include:

“l” vs “1”
“O” vs “0”
“×” vs “x”

Context and spatial analysis are required for accurate disambiguation.
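
The role of context can be shown with a deliberately simple heuristic: resolve an ambiguous glyph by looking at its unambiguous neighbours. The `CONFUSABLE` table and the `disambiguate` function are hypothetical and cover only two of the pairs above; this toy uses adjacency alone and no spatial information.

```python
# Assumed confusion table: each ambiguous glyph maps to (letter reading, digit reading).
CONFUSABLE = {"l": ("l", "1"), "1": ("l", "1"), "O": ("O", "0"), "0": ("O", "0")}


def disambiguate(symbols: list[str]) -> list[str]:
    """Resolve ambiguous glyphs by majority vote of their unambiguous neighbours."""
    resolved = []
    for i, sym in enumerate(symbols):
        if sym not in CONFUSABLE:
            resolved.append(sym)
            continue
        # Look at the symbol immediately before and immediately after
        neighbours = symbols[max(0, i - 1):i] + symbols[i + 1:i + 2]
        clear = [n for n in neighbours if n not in CONFUSABLE]
        digits = sum(n.isdigit() for n in clear)
        letters = sum(n.isalpha() for n in clear)
        letter_form, digit_form = CONFUSABLE[sym]
        if digits > letters:
            resolved.append(digit_form)
        elif letters > digits:
            resolved.append(letter_form)
        else:
            resolved.append(sym)  # no evidence either way: keep the original reading
    return resolved
```

Given `["2", "O", "5"]`, the middle glyph is surrounded by digits and is resolved to `"0"`. When the neighbours give no evidence, the original reading is kept rather than guessed.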

Superscript and Subscript Detection

Definition:

Superscript and subscript detection identifies characters that are raised above or lowered below the baseline of the symbol they attach to, usually at a smaller size.

This is critical for powers, indices, and chemical notation.
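
The LaTeX below gives one illustrative example of each case. If the vertical offset is missed, `x^{2}` is read as `x2`, which is a different expression.

```latex
% Power: the 2 is raised, so it is an exponent rather than a factor
\[ x^{2} \]

% Index: i and j are lowered, so they select an entry rather than multiply
\[ a_{ij} \]

% Subscript and superscript attached to the same base
\[ x_{i}^{2} \]

% Chemical notation: the subscript counts atoms
\[ \mathrm{H}_{2}\mathrm{O} \]
```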

Matrix Recognition

Definition:

Matrix recognition identifies grouped numerical or symbolic elements arranged in rows and columns.

Correct transcription requires preserving alignment and bracket structure.
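
A minimal sketch of the row-and-column step follows, assuming each recognized cell arrives as a `(text, x, y)` tuple with y growing downward. The function name `cells_to_bmatrix` and the `row_tol` tolerance are hypothetical, and the `bmatrix` environment it emits requires the amsmath package.

```python
def cells_to_bmatrix(cells: list[tuple[str, float, float]], row_tol: float = 10.0) -> str:
    """Group (text, x, y) cells into rows by y, order each row by x, emit a bmatrix.

    `row_tol` is an assumed tolerance: a cell joins the current row when its y
    differs from that row's first cell by less than this amount.
    """
    rows: list[list[tuple[str, float, float]]] = []
    for cell in sorted(cells, key=lambda c: c[2]):  # top to bottom
        if rows and abs(cell[2] - rows[-1][0][2]) < row_tol:
            rows[-1].append(cell)
        else:
            rows.append([cell])
    # Within each row, read cells left to right and join them with column separators
    lines = [
        " & ".join(text for text, _, _ in sorted(row, key=lambda c: c[1]))
        for row in rows
    ]
    return "\\begin{bmatrix}\n" + " \\\\\n".join(lines) + "\n\\end{bmatrix}"
```

The input order does not matter: four cells at roughly two heights are grouped into two rows of two, even when the handwriting is slightly uneven. A fixed tolerance is a simplification that would fail on matrices with tall entries such as fractions.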

Fraction Parsing

Definition:

Fraction parsing detects the numerator, the denominator, and the fraction bar that separates them.

Generic OCR often flattens fractions into linear text, losing mathematical meaning.
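
The sketch below shows the core idea under simplifying assumptions: the fraction bar has already been found at height `bar_y`, each symbol is a `(text, x, y)` tuple with y growing downward, and there is no nesting. The function name `parse_fraction` is hypothetical.

```python
def parse_fraction(bar_y: float, symbols: list[tuple[str, float, float]]) -> str:
    """Split (text, x, y) symbols around a fraction bar and emit a \\frac expression.

    y grows downward, so symbols with y < bar_y sit above the bar.
    """
    numerator = sorted((s for s in symbols if s[2] < bar_y), key=lambda s: s[1])
    denominator = sorted((s for s in symbols if s[2] > bar_y), key=lambda s: s[1])
    if not numerator or not denominator:
        raise ValueError("a fraction needs symbols on both sides of the bar")
    top = "".join(text for text, _, _ in numerator)
    bottom = "".join(text for text, _, _ in denominator)
    return f"\\frac{{{top}}}{{{bottom}}}"
```

For symbols `a`, `+`, `b` above the bar and `c` below it, the result is `\frac{a+b}{c}`. A linear reading of the same symbols, such as `a+b/c`, would bind only `b` to the denominator.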

Semantic Segmentation

Definition:

Semantic segmentation separates a page into logical regions such as equations, diagrams, tables, and prose.

This allows each region to be processed using appropriate rules.
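
One way to picture this routing is a dispatch table keyed by region label. Everything in the sketch below is hypothetical: the labels, the handler functions, and the `(label, content)` input shape are placeholders for whatever a real pipeline would use.

```python
from typing import Callable


# Placeholder handlers: stand-ins for the rules each region type actually needs.
def transcribe_equation(content: str) -> str:
    return f"$${content}$$"


def transcribe_prose(content: str) -> str:
    return content.strip()


HANDLERS: dict[str, Callable[[str], str]] = {
    "equation": transcribe_equation,
    "prose": transcribe_prose,
}


def process_page(regions: list[tuple[str, str]]) -> list[str]:
    """Route each (label, content) region to the handler registered for its label."""
    output = []
    for label, content in regions:
        handler = HANDLERS.get(label)
        if handler is None:
            # Unknown region types are flagged rather than silently dropped
            output.append(f"[unhandled {label} region]")
        else:
            output.append(handler(content))
    return output
```

Keeping the handlers in a table means a new region type, such as tables, can be supported by registering one more function without changing the routing loop.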

Plain-Text Output

Definition:

Plain-text output refers to formats that store information as readable text rather than proprietary binaries.

Examples include LaTeX and Markdown.

Plain-text formats are portable across tools and remain readable without the software that created them.

Proprietary Format Lock-In

Definition:

Proprietary format lock-in occurs when data is stored in closed formats that require specific software to access.

Axiom avoids proprietary formats by exporting standard text formats.

Local-First Processing

Definition:

Local-first processing prioritizes user ownership and minimizes data retention.

Processed data is not stored beyond what is necessary to generate output.

STEM Notation

Definition:

STEM notation includes the symbols and structures used in science, technology, engineering, and mathematics.

This includes equations, diagrams, units, and symbolic representations.

Why This Glossary Exists

Mathematical OCR terminology is often vague or misused.

This glossary provides precise definitions so users, researchers, and answer engines reference the same concepts when discussing handwritten math transcription.