textract

Install command:
brew install textract

Extract text from many different types of files

https://textract.readthedocs.io/
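
As a minimal sketch of the library in use: textract's documented Python entry point is `textract.process`, which takes a file path and returns the extracted text as bytes. The wrapper name `extract_text` and its `process` parameter are hypothetical, added so the sketch can be tried without the library. It assumes the default UTF-8 output encoding. It also assumes the `textract` package is importable in the interpreter you run, which a Homebrew install may not guarantee.

```python
from typing import Callable, Optional


def extract_text(path: str, process: Optional[Callable[[str], bytes]] = None) -> str:
    """Return the text content of the file at `path` as a str.

    `process` is a hypothetical hook for substituting the extractor; when
    omitted, the real textract.process is imported and used.
    """
    if process is None:
        import textract  # imported lazily so this module loads without textract

        process = textract.process
    raw = process(path)  # textract.process returns bytes
    # Assumption: the output uses textract's default UTF-8 encoding.
    return raw.decode("utf-8").strip()
```

With textract available, `extract_text("report.pdf")` would return the document's text, where `report.pdf` is a placeholder file name.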

License: MIT

Formula JSON API: /api/formula/textract.json

Bottle JSON API: /api/bottle/textract.json
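
The two API paths above are relative to the site hosting this page. The sketch below shows one way the metadata might be read from Python. It assumes the site root is https://formulae.brew.sh and that the formula document carries `name`, `versions.stable`, and `dependencies` fields; neither is stated on this page. The helper names `api_url`, `fetch_json`, and `summarize` are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: the relative API paths on this page resolve against this host.
BASE_URL = "https://formulae.brew.sh"


def api_url(kind: str, formula: str, base: str = BASE_URL) -> str:
    """Build the URL of a formula's JSON document; kind is 'formula' or 'bottle'."""
    if kind not in ("formula", "bottle"):
        raise ValueError(f"unknown API kind: {kind!r}")
    return f"{base}/api/{kind}/{formula}.json"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Download and decode one JSON document (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarize(data: dict) -> str:
    """Format the name, stable version, and dependencies of a formula document.

    Assumption: the document has 'name', 'versions' -> 'stable', and
    'dependencies' keys; missing keys fall back to placeholders.
    """
    name = data.get("name", "?")
    stable = data.get("versions", {}).get("stable", "?")
    deps = ", ".join(data.get("dependencies", [])) or "none"
    return f"{name} {stable} (depends on: {deps})"
```

With network access, `summarize(fetch_json(api_url("formula", "textract")))` would produce a one-line summary. The URL builder and the formatter are kept separate from the download so they can be exercised offline.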

Formula code: textract.rb on GitHub

Bottle (binary package) installation support provided for:

Intel: ventura, monterey, big sur, catalina, 64-bit linux
Apple Silicon: ventura, monterey, big sur

Current versions:

stable 1.6.5

Depends on:

antiword      0.37     Utility to read Word (.doc) files
flac          1.4.2    Free lossless audio codec
pillow        9.3.0    Friendly PIL fork (Python Imaging Library)
poppler       22.12.0  PDF rendering library (based on the xpdf-3.0 code base)
python@3.10   3.10.8   Interpreted, interactive, object-oriented programming language
six           1.16.0   Python 2 and 3 compatibility utilities
swig          4.1.1    Generate scripting interfaces to C/C++ code
tesseract     5.2.0    OCR (Optical Character Recognition) engine
unrtf         0.21.10  RTF to other formats converter

Analytics (macOS):

Installs (30 days): 20
Installs on Request (30 days): 20
Build Errors (30 days): 0
Installs (90 days): 73
Installs on Request (90 days): 73
Installs (365 days): 73
Installs on Request (365 days): 73

Analytics (Linux):

Installs (30 days): 1
Installs on Request (30 days): 1
Build Errors (30 days): 0
Installs (90 days): 2
Installs on Request (90 days): 2
Installs (365 days): 2
Installs on Request (365 days): 2