Metadata-Version: 1.1
Name: streamcorpus-pipeline
Version: 0.7.17
Summary: Tools for building streamcorpus objects, such as those used in TREC.
Home-page: http://github.com/trec-kba/streamcorpus-pipeline
Author: Diffeo, Inc.
Author-email: support@diffeo.com
License: MIT/X11 license http://opensource.org/licenses/MIT
Description: StreamCorpus Pipeline
        =====================
        
        streamcorpus_pipeline is a document processing pipeline that assembles
        streamcorpus objects from raw data sets.
        
        The streamcorpus_pipeline Python module contains tools for processing
        streamcorpus.StreamItem objects stored in Chunks.  It includes
        transform functions that generate clean_html and clean_visible,
        transform functions that create labels from hyperlinks to
        particular sites (e.g. Wikipedia), and taggers such as LingPipe,
        Serif, and Factorie, which produce Tokens and Sentences.
        
        Read more at [streamcorpus.org](http://streamcorpus.org/)
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: MIT License
