Generate sandwich OCR PDFs from scanned files


/api/formula-linux/pdfsandwich.json (JSON API)

Linux formula code on GitHub

Current versions:

stable 0.1.7
head ⚡️ HEAD
bottle 🍾 mojave, high_sierra, sierra, el_capitan, x86_64_linux

Depends on:

exact-image 1.0.2 Image processing library
ghostscript 9.52 Interpreter for PostScript and PDF
imagemagick 7.0.10-0 Tools and libraries to manipulate images in many formats
poppler 0.87.0 PDF rendering library (based on the xpdf-3.0 code base)
tesseract 4.1.1 OCR (Optical Character Recognition) engine
unpaper 6.1 Post-processing for scanned/photocopied books

Depends on when building from source:

gawk 5.0.1 GNU awk utility
ocaml 4.09.0 General purpose programming language in the ML family
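
The metadata listed above can also be read programmatically from the JSON API path given near the top of this page. The following is a minimal sketch rather than a verified client: the field names (`name`, `versions.stable`, `dependencies`, `build_dependencies`) are assumptions about the JSON layout, and the sample payload is hypothetical, built only from the values shown on this page so that the example runs without network access.

```python
import json

# Hypothetical sample payload. The values are copied from this page; the
# field names are an assumed layout, not a confirmed schema of the JSON API.
SAMPLE = """
{
  "name": "pdfsandwich",
  "versions": {"stable": "0.1.7"},
  "dependencies": ["exact-image", "ghostscript", "imagemagick",
                   "poppler", "tesseract", "unpaper"],
  "build_dependencies": ["gawk", "ocaml"]
}
"""


def summarize(payload: str) -> dict:
    """Extract the name, stable version, and dependency lists from a JSON string."""
    data = json.loads(payload)
    return {
        "name": data["name"],
        "stable": data["versions"]["stable"],
        # Missing keys are treated as empty lists rather than errors.
        "runtime": sorted(data.get("dependencies", [])),
        "build": sorted(data.get("build_dependencies", [])),
    }


summary = summarize(SAMPLE)
print(f"{summary['name']} {summary['stable']}")
print("runtime:", ", ".join(summary["runtime"]))
print("build:", ", ".join(summary["build"]))
```

To use this against real data, the text fetched from the JSON API path would be passed to `summarize` in place of `SAMPLE`, after checking that the actual response uses the same field names.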