Adds an OCR text layer to scanned PDF files
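
As a rough illustration of what the formula provides, the sketch below wraps the basic `ocrmypdf input.pdf output.pdf` command-line call from Python. It assumes the `ocrmypdf` executable is on `PATH` after installation; the helper names and the file names are hypothetical and are not part of this page.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst):
    """Return the argument list for a basic ocrmypdf run (input, then output)."""
    return ["ocrmypdf", str(Path(src)), str(Path(dst))]


def add_text_layer(src, dst):
    """Run ocrmypdf if it is installed; raise if the tool is missing or fails."""
    # Assumption: the formula puts an `ocrmypdf` executable on PATH.
    if shutil.which("ocrmypdf") is None:
        raise FileNotFoundError("ocrmypdf not found on PATH")
    subprocess.run(build_ocr_command(src, dst), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(build_ocr_command("scan.pdf", "scan_ocr.pdf"))
```

Building the argument list separately keeps the command easy to inspect or log before anything is executed.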


/api/formula/ocrmypdf.json (JSON API)
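
The JSON endpoint above can be read programmatically. The following is a minimal sketch, not an official client: the host `https://formulae.brew.sh` is an assumption (the page gives only the path), and the field names `name`, `versions`, `dependencies`, and `build_dependencies` are assumed to match the payload layout. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://formulae.brew.sh"  # assumed host; only the path appears on this page
API_PATH = "/api/formula/ocrmypdf.json"


def formula_url(base=BASE_URL, path=API_PATH):
    """Join the assumed host and the documented API path."""
    return base.rstrip("/") + path


def summarize(payload):
    """Pick a few fields out of a formula payload, tolerating missing keys."""
    return {
        "name": payload.get("name"),
        "stable": (payload.get("versions") or {}).get("stable"),
        "dependencies": list(payload.get("dependencies") or []),
        "build_dependencies": list(payload.get("build_dependencies") or []),
    }


def fetch_formula(url=None):
    """Download and decode the formula JSON (requires network access)."""
    with urllib.request.urlopen(url or formula_url(), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(summarize(fetch_formula()))
```

`summarize` is kept separate from the network call so it can be exercised with a saved or hand-written payload.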

Formula code on GitHub

Current versions:

stable 7.0.4
bottle 🍾 mojave, high_sierra, sierra, el_capitan

Depends on:

exempi 2.4.5 Library to parse XMP metadata
freetype 2.9.1 Software library to render fonts
ghostscript 9.25 Interpreter for PostScript and PDF
jbig2enc 0.29 JBIG2 encoder (for monochrome documents)
jpeg 9c Image manipulation library
leptonica 1.76.0 Image processing and image analysis library
libpng 1.6.35 Library for manipulating PNG images
pngquant 2.12.0 PNG image optimizing utility
python 3.7.0 Interpreted, interactive, object-oriented programming language
qpdf 8.2.1 Tools for transforming and inspecting PDF files
tesseract 3.05.02 OCR (Optical Character Recognition) engine
unpaper 6.1 Post-processing for scanned/photocopied books

Depends on when building from source:

pkg-config 0.29.2 Manage compile and link flags for libraries


Installs (30 days): ocrmypdf 495
Installs on Request (30 days): ocrmypdf 492
Build Errors (30 days): ocrmypdf 0
Installs (90 days): ocrmypdf 1,105
Installs on Request (90 days): ocrmypdf 1,108
Installs (365 days): ocrmypdf 1,234
Installs on Request (365 days): ocrmypdf 1,236