Metadata-Version: 2.1
Name: pmworker
Version: 1.0.0
Summary: Papermerge worker - extracts text from documents using OCR
Home-page: https://github.com/ciur/papermerge-worker
Author: Eugen Ciur
Author-email: eugen@papermerge.com
License: Proprietary
Keywords: tesseract documentation tutorial
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown

Papermerge Worker
================

pmworker's main job is OCR processing. It extracts text from PDF, TIFF, JPEG and PNG files.

Requirements
=============

Python >= 3.6

pmworker.wrapper uses subprocess.run, which was added in Python 3.5.
It also passes the encoding argument, as in subprocess.run(encoding='utf-8').
That argument was added in Python 3.6.
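
As a rough illustration of why Python 3.6 is the minimum, the sketch below calls subprocess.run with the encoding argument so that the captured output comes back as str rather than bytes. The helper name `run_command` and the command being run are hypothetical and are not taken from pmworker.wrapper.

```python
import subprocess
import sys


def run_command(args):
    """Run a command and return its stdout decoded as UTF-8.

    Hypothetical helper for illustration; not part of pmworker.
    """
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding="utf-8",  # requires Python >= 3.6
        check=True,        # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Use the current interpreter as a stand-in for an external tool.
output = run_command([sys.executable, "-c", "print('hello')"])
```

Without the encoding argument, `result.stdout` would be a bytes object that has to be decoded by hand.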

Dependencies
=============

Depends on Celery, Tesseract and ImageMagick.

Usage:

    export CELERY_CONFIG_MODULE='pmworker.config'
    celery -A pmworker.celery worker -l info

Run Tests
=============
Run all tests:

    python3 run.py

Run specific test file:

    python3 run.py -p test_endpoint

Which is the same as:

    python3 run.py -p test_endpoint.py

