
apache-tika-server
Apache Tika Server exposes the Apache Tika content analysis toolkit over HTTP, detecting and extracting metadata and text from over a thousand file types, now with enhanced language support including Japanese.
About
Apache Tika Server is a network-based document parsing service that extracts text and metadata from over 1,000 file formats through a RESTful API. It is designed for developers and data scientists who need scalable content extraction for applications such as search engines…
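
As a rough illustration of the REST interface described above, the sketch below uploads a document's bytes with HTTP PUT and reads back either plain text or JSON metadata. It assumes a server is already running and reachable at `http://localhost:9998` (9998 is Tika Server's default port; adjust it for your deployment). The helper names `build_tika_request`, `extract_text`, and `extract_metadata` are hypothetical and are not part of Tika itself; `/tika` and `/meta` are the server's text and metadata endpoints.

```python
import json
import urllib.request

# Assumption: a Tika Server instance is listening here (9998 is the default port).
TIKA_URL = "http://localhost:9998"


def build_tika_request(data: bytes, endpoint: str = "/tika",
                       accept: str = "text/plain",
                       base_url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that uploads raw document bytes to a Tika endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=data,
        method="PUT",
        headers={
            "Accept": accept,
            # Generic type so the server detects the real format from the bytes.
            "Content-Type": "application/octet-stream",
        },
    )


def extract_text(data: bytes, base_url: str = TIKA_URL) -> str:
    """Send the document to /tika and return the extracted plain text."""
    req = build_tika_request(data, "/tika", "text/plain", base_url)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read().decode("utf-8")


def extract_metadata(data: bytes, base_url: str = TIKA_URL) -> dict:
    """Send the document to /meta and return its metadata as a dict."""
    req = build_tika_request(data, "/meta", "application/json", base_url)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, calling `extract_text(open("report.pdf", "rb").read())` would return the document's text, where `report.pdf` stands in for any local file. Building the request separately from sending it keeps the HTTP details easy to inspect without a live server.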