Metadata-Version: 1.1
Name: kraken
Version: 0.2.5
Summary: OCR engine compatible with ocropus
Home-page: UNKNOWN
Author: Benjamin Kiessling
Author-email: mittagessen@l.unchti.me
License: Apache
Description: ┌────────────────────────────┐
         │ Description                │
         └────────────────────────────┘
        
        kraken is a fork of ocropus intended to rectify a number of issues while
        preserving (mostly) functional equivalence. Its main goals are:
        
          • Explicit input/output handling
          • Clean public API
          • Word bounding boxes in hOCR
          • Tests
          • Removal of runtime dependency on gcc
          • Removal of unused spaghetti code
        
        Some of these have already been realized while others require some more work.
        
         ┌────────────────────────────┐
         │ Installation               │
         └────────────────────────────┘
        
        kraken currently still requires a working gcc at run time, so make sure
        build-essential or your distribution's equivalent is installed. Because the
        build behavior of pip versions older than 6.1.0 interferes with the scipy
        build process, numpy has to be installed before the actual install:
        
          # pip install numpy
        
        Install kraken either from PyPI:
        
          $ pip install kraken
        
        or by running pip in the git repository:
        
          $ pip install .
        
        Finally, you'll have to scrounge up an RNN to do the actual character
        recognition. To download ocropus' default RNN and place it in the kraken
        directory for the current user:
        
          $ kraken download
        
         ┌────────────────────────────┐
         │ Quickstart                 │
         └────────────────────────────┘
        
        To binarize a single image using the nlbin algorithm:
        
          $ kraken binarize grey.png bw.png
        
        To segment a binarized image into lines sorted in reading order:
        
          $ kraken segment bw.png lines.txt
        
        To OCR a binarized image using the default RNN and the previously generated
        page segmentation:
        
          $ kraken ocr --lines lines.txt bw.png
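        
        The three steps above can be chained in a small shell function. This is only
        a sketch: ocr_page and its WORK_DIR argument are hypothetical names that are
        not part of kraken, and only the invocations shown above are used. Where the
        recognized text ends up is not stated here, so the sketch leaves the output
        of the last step untouched.
        
        ```shell
        # Hypothetical helper, not part of kraken: chains the quickstart steps.
        # Usage: ocr_page INPUT_IMAGE [WORK_DIR]
        ocr_page() {
            input=$1
            workdir=${2:-.}   # assumption: intermediate files go to WORK_DIR
            # Step 1: binarize the input image.
            kraken binarize "$input" "$workdir/bw.png" || return 1
            # Step 2: segment the binarized image into lines.
            kraken segment "$workdir/bw.png" "$workdir/lines.txt" || return 1
            # Step 3: OCR the binarized image using the segmentation from step 2.
            kraken ocr --lines "$workdir/lines.txt" "$workdir/bw.png"
        }
        ```
        
        Calling ocr_page grey.png would then write bw.png and lines.txt to the
        current directory, stopping early if binarization or segmentation fails.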
        
         ┌────────────────────────────┐
         │ Documentation              │
         └────────────────────────────┘
        
        Have a look at http://mittagessen.github.io/kraken
        
        
Keywords: ocr ocropus
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 2 :: Only
