ocrmypdf-auto

cmccambridge

[p]This container monitors an input directory for PDF documents and automatically invokes [a href='https://github.com/jbarlow83/OCRmyPDF'][code][strong]OCRmyPDF[/strong][/code][/a] on each file.[/p] [p]It uses [code]inotify[/code] to monitor the input directory efficiently, and is fairly configurable.[/p] [h4]Configuration Details[/h4] [p]See the descriptions of the unRAID volumes and environment variables for highlights of what can be configured in [code]ocrmypdf-auto[/code]. For details, including how to specify custom command-line parameters for [code]ocrmypdf[/code] itself or custom [code]tesseract[/code] configuration files, see the full README at [a href='https://github.com/cmccambridge/ocrmypdf-auto/blob/master/README.md']https://github.com/cmccambridge/ocrmypdf-auto/blob/master/README.md[/a].[/p]

Other · Free · 335.0K · 15y ago

About

ocrmypdf-auto is a containerized automation tool that processes PDF documents by applying optical character recognition (OCR). It continuously monitors an input directory for new PDF files using inotify, processes them with OCRmyPDF and Tesseract-OCR, then outputs searchable PDFs…
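The watch-then-process flow described above can be illustrated with a minimal sketch. This is not the container's actual code: it swaps inotify for simple polling so that it needs only the Python standard library, and the function names (`pending_pdfs`, `build_command`, `process_once`, `watch`) and the `/input` and `/output` paths mentioned below are assumptions made up for illustration. The only part taken from OCRmyPDF itself is the basic `ocrmypdf <input> <output>` command form.

```python
import subprocess
import time
from pathlib import Path


def pending_pdfs(input_dir: Path, output_dir: Path) -> list[Path]:
    """Return input PDFs with no file at the same relative path in output_dir."""
    # Note: the "*.pdf" pattern is case-sensitive on Linux.
    return sorted(
        p for p in input_dir.rglob("*.pdf")
        if not (output_dir / p.relative_to(input_dir)).exists()
    )


def build_command(src: Path, dst: Path) -> list[str]:
    """Basic OCRmyPDF invocation: ocrmypdf <input> <output>."""
    return ["ocrmypdf", str(src), str(dst)]


def process_once(input_dir: Path, output_dir: Path, run=subprocess.run) -> list[Path]:
    """Process every pending PDF once and return the output paths written.

    `run` is injectable so the flow can be exercised without OCRmyPDF installed.
    """
    processed = []
    for src in pending_pdfs(input_dir, output_dir):
        dst = output_dir / src.relative_to(input_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        run(build_command(src, dst), check=True)
        processed.append(dst)
    return processed


def watch(input_dir: Path, output_dir: Path, interval: float = 5.0) -> None:
    """Poll forever (an assumption of this sketch; the container uses inotify)."""
    while True:
        process_once(input_dir, output_dir)
        time.sleep(interval)
```

Calling `watch(Path("/input"), Path("/output"))` would start the loop, with both paths being hypothetical. Polling trades some latency and idle work for portability; an inotify-based watcher, as the container uses, is notified by the kernel when files change instead of rescanning the directory.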

