
ocrmypdf-auto
[p]This container monitors an input file directory for PDF documents to process, and automatically invokes [a href='https://github.com/jbarlow83/OCRmyPDF'][code][strong]OCRmyPDF[/strong][/code][/a] on each file.[/p] [p]It uses [code]inotify[/code] to monitor the input directory efficiently, and is fairly configurable.[/p] [h4]Configuration Details[/h4] [p]The descriptions of the unRAID volumes and environment variables highlight the main configuration options of [code]ocrmypdf-auto[/code]. For details, including how to specify custom command-line parameters to [code]ocrmypdf[/code] itself or custom [code]tesseract[/code] configuration files, see the full README at [a href='https://github.com/cmccambridge/ocrmypdf-auto/blob/master/README.md']https://github.com/cmccambridge/ocrmypdf-auto/blob/master/README.md[/a].[/p]
About
ocrmypdf-auto is a containerized automation tool that processes PDF documents by applying optical character recognition (OCR). It continuously monitors an input directory for new PDF files using inotify, processes them with OCRmyPDF and Tesseract-OCR, then outputs searchable PDFs…
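The watch-and-process flow described above could be sketched roughly as follows. This is a minimal illustration, not the container's actual implementation. It scans the input directory on demand instead of using inotify, because the Python standard library has no inotify binding. The function names (`pending_pdfs`, `build_command`, `process_once`), the mirrored subdirectory layout, and the rule of skipping files whose output already exists are all assumptions made for the example. The only part taken from OCRmyPDF itself is the basic `ocrmypdf input.pdf output.pdf` command form.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple


def pending_pdfs(input_dir: Path, output_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each input PDF with an output path, skipping ones already processed.

    Assumption: output files mirror the input directory layout.
    """
    pairs = []
    for src in sorted(input_dir.rglob("*.pdf")):
        dst = output_dir / src.relative_to(input_dir)
        if not dst.exists():
            pairs.append((src, dst))
    return pairs


def build_command(src: Path, dst: Path) -> List[str]:
    # Basic OCRmyPDF invocation: ocrmypdf <input.pdf> <output.pdf>
    return ["ocrmypdf", str(src), str(dst)]


def process_once(input_dir: Path, output_dir: Path) -> int:
    """Run OCRmyPDF on every pending PDF and return how many succeeded."""
    done = 0
    for src, dst in pending_pdfs(input_dir, output_dir):
        dst.parent.mkdir(parents=True, exist_ok=True)
        result = subprocess.run(build_command(src, dst))
        if result.returncode == 0:
            done += 1
    return done
```

Calling `process_once(Path("/input"), Path("/output"))` in a loop with a short sleep would approximate the monitoring behaviour. Those two paths are placeholders, and an inotify-based watcher would react to new files without polling.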