Metadata-Version: 2.1
Name: chunkyp
Version: 0.0.1
Summary: Ray-based preprocessing pipeline.
Home-page: https://github.com/neophocion/chunkyp
Author: Neo Phocion
Author-email: neophocion@protonmail.com
License: apache-2.0
Download-URL: https://github.com/neophocion/chunkyp/releases
Project-URL: Repo, https://github.com/neophocion/chunkyp
Description: # chunkyp
        
        A small and concise data preprocessing library inspired by common NLP preprocessing workflows. 
        
        Supports [ray](https://github.com/ray-project/ray).
        
        ## Installation
        chunkyp is available on PyPI.
        ```bash
        pip install chunkyp
        ```
        
        To install the development version, run the following.
        ```bash
        git clone https://github.com/neophocion/chunkyp
        cd chunkyp
        pip install -e .
        ```
        
        ## Usage
        
        The simplest way to get started is to look at the Jupyter notebooks in [`notebooks/`](https://github.com/neophocion/chunkyp/tree/master/notebooks).
        
        A small example:
        
        ```python
        from chunkyp import pipe, p
        
        res = pipe(
            records,  # a list of dicts, or an iterator over dicts
            p('field', lambda x: x.lower()),
            p('field', lambda x: x.upper(), 'new_field'),
            p(['field1', 'field2'], lambda x,y: len(x.split()) == y, 'new_field2'),
        )
        
        res = list(res)
        res
        ```
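        
        To make the example above easier to read, here is a rough plain-Python sketch of what each step appears to do to a single record. It does not use chunkyp or ray. The helper `apply_step`, the sample record, and the rule that a step without an output name overwrites its (first) input field are all assumptions inferred from the example, not chunkyp's actual implementation.
        
        ```python
        # Hypothetical stand-in for one pipeline step; NOT chunkyp's implementation.
        def apply_step(record, fields, fn, out=None):
            """Return a copy of `record` with `fn` applied to the named field(s)."""
            names = [fields] if isinstance(fields, str) else list(fields)
            # Assumption: with no output name, the (first) input field is overwritten.
            target = out if out is not None else names[0]
            updated = dict(record)
            updated[target] = fn(*(record[name] for name in names))
            return updated
        
        # Made-up sample data, for illustration only.
        records = [{"field": "Hello World", "field1": "a b c", "field2": 3}]
        
        # The same three steps as in the example above, as (fields, fn, out) tuples.
        steps = [
            ("field", lambda x: x.lower(), None),
            ("field", lambda x: x.upper(), "new_field"),
            (["field1", "field2"], lambda x, y: len(x.split()) == y, "new_field2"),
        ]
        
        result = []
        for record in records:
            for fields, fn, out in steps:
                record = apply_step(record, fields, fn, out)
            result.append(record)
        ```
        
        Under these assumptions, steps run in order on each record, so the second step sees the already lowercased `field`.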
        
Keywords: ray,preprocessing,nlp,cleaning,workflow
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Text Processing
Classifier: Topic :: Scientific/Engineering
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.5, <4
Description-Content-Type: text/markdown
Provides-Extra: dev
Provides-Extra: test
