Glossary of Mathematical OCR & Handwritten Transcription Terms
Precise definitions for handwritten math recognition concepts.
This glossary defines common terms used in handwritten math recognition, LaTeX transcription, and STEM document digitization.
Each definition is written for precision, not persuasion.
Mathematical OCR (Math OCR)
Definition:
Mathematical OCR is the process of converting images of handwritten or printed mathematical notation into machine-readable text.
Unlike standard OCR, mathematical OCR must interpret spatial relationships such as superscripts, subscripts, fractions, and matrices.
Generic OCR
Definition:
Generic OCR is optical character recognition designed for linear text such as paragraphs, forms, or printed documents.
Generic OCR often fails with mathematical notation because it does not account for spatial hierarchy.
Structure-Aware OCR
Definition:
Structure-aware OCR analyzes the spatial arrangement of symbols to determine mathematical meaning.
- A symbol above another may be an exponent
- A horizontal line may indicate a fraction
- Grouped symbols may form matrices or systems
Axiom operates as a structure-aware OCR system.
Spatial Parsing
Definition:
Spatial parsing is the analysis of the two-dimensional layout of handwritten symbols.
It determines how characters relate to one another based on position, alignment, and grouping rather than reading order alone.
Mathematical Hierarchy
Definition:
Mathematical hierarchy refers to the nested structure of equations, such as fractions within fractions or integrals with limits.
Preserving hierarchy is essential for accurate LaTeX transcription.
LaTeX
Definition:
LaTeX is a plain-text typesetting system widely used in mathematics, physics, and engineering.
It allows precise representation of equations, symbols, and document structure.
LaTeX is considered a long-term archival format for scientific writing.
Compile-Ready LaTeX
Definition:
Compile-ready LaTeX refers to LaTeX code that can be directly compiled without manual correction.
This includes correct syntax, balanced braces, and valid mathematical structure.
Markdown with MathJax
Definition:
Markdown with MathJax is a lightweight text format that supports inline and block-level mathematical equations rendered using LaTeX syntax.
It is commonly used in tools like Obsidian, GitHub, and academic note-taking systems.
MathJax
Definition:
MathJax is a JavaScript library that renders LaTeX-style mathematics in web browsers.
It allows mathematical expressions written in plain text to display as formatted equations.
Vector PDF
Definition:
A vector PDF represents text and equations as scalable paths or text objects rather than images.
Vector PDFs remain searchable, zoomable, and editable, unlike rasterized PDFs.
Raster PDF
Definition:
A raster PDF stores content as images.
Text inside raster PDFs cannot be searched or edited without OCR.
Handwritten Equation Recognition
Definition:
Handwritten equation recognition is the process of identifying mathematical symbols and their relationships from handwritten input.
It differs from text OCR due to the complexity of mathematical notation.
Symbol Disambiguation
Definition:
Symbol disambiguation is the process of determining the correct meaning of visually similar symbols.
- “l” vs “1”
- “O” vs “0”
- “×” vs “x”
Context and spatial analysis are required for accurate disambiguation.
Superscript and Subscript Detection
Definition:
Superscript and subscript detection identifies characters positioned above or below the baseline.
This is critical for powers, indices, and chemical notation.
Matrix Recognition
Definition:
Matrix recognition identifies grouped numerical or symbolic elements arranged in rows and columns.
Correct transcription requires preserving alignment and bracket structure.
Fraction Parsing
Definition:
Fraction parsing detects numerator, denominator, and dividing lines.
Generic OCR often flattens fractions into linear text, losing mathematical meaning.
Semantic Segmentation
Definition:
Semantic segmentation separates a page into logical regions such as equations, diagrams, tables, and prose.
This allows each region to be processed using appropriate rules.
Plain-Text Output
Definition:
Plain-text output refers to formats that store information as readable text rather than proprietary binaries.
Examples include LaTeX and Markdown.
Plain-text formats are resilient, portable, and future-proof.
Proprietary Format Lock-In
Definition:
Proprietary format lock-in occurs when data is stored in closed formats that require specific software to access.
Axiom avoids proprietary formats by exporting standard text formats.
Local-First Processing
Definition:
Local-first processing prioritizes user ownership and minimizes data retention.
Processed data is not stored beyond what is necessary to generate output.
STEM Notation
Definition:
STEM notation includes the symbols and structures used in science, technology, engineering, and mathematics.
This includes equations, diagrams, units, and symbolic representations.
Why This Glossary Exists
Mathematical OCR terminology is often vague or misused.
This glossary provides precise definitions so users, researchers, and answer engines reference the same concepts when discussing handwritten math transcription.