Metadata-Version: 1.0
Name: discoursegraphs
Version: 0.1.2
Summary: graph-based processing of multi-level annotated corpora
Home-page: https://github.com/arne-cl/discoursegraphs
Author: Arne Neumann
Author-email: discoursegraphs.programming@arne.cl
License: 3-Clause BSD License
Description: DiscourseGraphs
        ===============
        
        .. image:: http://img.shields.io/pypi/dm/discoursegraphs.svg
           :alt: PyPI download counter
           :align: right
           :target: https://pypi.python.org/pypi/discoursegraphs#downloads
        .. image:: http://img.shields.io/pypi/v/discoursegraphs.svg
           :alt: Latest version
           :align: right
           :target: https://pypi.python.org/pypi/discoursegraphs
        .. image:: http://img.shields.io/badge/license-BSD-yellow.svg
           :alt: BSD License
           :align: right
           :target: http://opensource.org/licenses/BSD-3-Clause
        
        
        This library enables you to process linguistic corpora with multiple levels
        of annotations by:
        
        1. converting the different annotation formats into separate graphs and 
        2. merging these graphs into a single multidigraph (based on the common
           tokenization of the annotation layers)
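        
        As a rough illustration of these two steps (this is not the library's
        actual merging code), the sketch below builds two toy
        ``networkx.MultiDiGraph`` layers that share token node IDs and combines
        them with ``networkx.compose``. The node names (``tok_1``, ``S``,
        ``segment_1``) and the ``label`` values are made up for this example::
        
            import networkx as nx
        
            # Layer 1: a toy syntax graph over two tokens.
            syntax = nx.MultiDiGraph()
            syntax.add_node("tok_1", token="It")
            syntax.add_node("tok_2", token="rains")
            syntax.add_edge("S", "tok_1", label="SB")
            syntax.add_edge("S", "tok_2", label="HD")
        
            # Layer 2: a toy discourse graph over the same token IDs.
            rst = nx.MultiDiGraph()
            rst.add_node("tok_1", token="It")
            rst.add_node("tok_2", token="rains")
            rst.add_edge("segment_1", "tok_1", label="spans")
            rst.add_edge("segment_1", "tok_2", label="spans")
        
            # Nodes with identical IDs (the shared tokens) are unified;
            # all other nodes and all edges from both layers are kept.
            merged = nx.compose(syntax, rst)
            print(sorted(merged.nodes()))
            print(merged.number_of_edges())
        
        Because both layers use the same token IDs, each token node in the merged
        graph ends up with incoming edges from both annotation layers.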
        
        So far, the following formats can be imported and merged:
        
        * `TigerXML`_ (a format for representing tree-like syntax graphs with
          secondary edges)
        * RS3 (a format used by `RSTTool`_ to
          annotate documents with Rhetorical Structure Theory)
        * an ad-hoc plain text format for annotating expletives (which you're
          probably not interested in)
        
        .. _`TigerXML`: http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/TIGERSearch/doc/html/TigerXML.html
        .. _`RSTTool`: http://www.wagsoft.com/RSTTool/
        
        
        Installation
        ------------
        
        Install from PyPI
        ~~~~~~~~~~~~~~~~~
        
        ::
        
            pip install discoursegraphs # prepend 'sudo' if needed
        
        or, if you're oldschool:
        
        ::
        
            easy_install discoursegraphs # prepend 'sudo' if needed
        
        
        Install from source
        ~~~~~~~~~~~~~~~~~~~
        
        ::
        
            git clone https://github.com/arne-cl/discoursegraphs.git
            cd discoursegraphs
            python setup.py install # prepend 'sudo' if needed
        
        
        Usage
        -----
        
        Right now, there's only a primitive command line interface that
        merges the syntax, RST and expletive annotation layers into one
        graph and generates a dot file from it.
        
        ::
        
            discoursegraphs syntax/doc.xml rst/doc.rs3 expletives/doc.txt doc.dot
            dot -Tpdf doc.dot > discoursegraph.pdf # generates a PDF from the dot file
        
        If you're interested in working with just one of those layers, you'll
        have to call the code directly::
        
            from discoursegraphs import readwrite
            tiger_docgraph = readwrite.TigerDocumentGraph('syntax/doc.xml')
            rst_docgraph = readwrite.RSTGraph('rst/doc.rs3')
            expletives_docgraph = readwrite.AnaphoraDocumentGraph('expletives/doc.txt')
        
        All the document graphs generated in this example are derived from the
        `networkx.MultiDiGraph`_ class, so you should be able to use all of its
        methods.
        
        .. _`networkx.MultiDiGraph`: http://networkx.lanl.gov/reference/classes.multidigraph.html
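        
        A minimal sketch of what that means in practice, using a hand-built
        ``networkx.MultiDiGraph`` as a stand-in for a real document graph (the
        node names and ``label`` attributes below are hypothetical and do not
        reflect the attributes the importers actually set)::
        
            import networkx as nx
        
            # Stand-in for a document graph loaded from a corpus file.
            docgraph = nx.MultiDiGraph()
            docgraph.add_edge("S", "NP", label="SB")
            docgraph.add_edge("S", "VP", label="HD")
            docgraph.add_edge("NP", "tok_1", label="NK")
            # A multidigraph may hold several edges between the same two nodes.
            docgraph.add_edge("S", "NP", label="secondary")
        
            print(docgraph.number_of_nodes())
            print(docgraph.number_of_edges("S", "NP"))
            print(sorted(docgraph.successors("S")))
        
            # Iterate over all edges together with their attribute dicts.
            for source, target, attrs in docgraph.edges(data=True):
                print("{0} -> {1} ({2})".format(source, target, attrs["label"]))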
        
        
        Documentation
        -------------
        
        Source code documentation is available
        `here <https://pythonhosted.org/pypolibox/>`_, but you can always get an
        up-to-date local copy using `Sphinx`_.
        
        You can generate an HTML or PDF version by running these commands in
        the ``docs`` directory::
        
            make latexpdf
        
        to produce a PDF (``docs/_build/latex/discoursegraphs.pdf``) and ::
        
            make html
        
        to produce a set of HTML files (``docs/_build/html/index.html``).
        
        .. _`Sphinx`: http://sphinx-doc.org/
        
        
        Requirements
        ------------
        
        - `lxml <http://lxml.de/>`_
        - `networkx <http://networkx.github.io/>`_
        
        If you'd like to visualize your graphs, you will also need:
        
        - `graphviz <http://graphviz.org/>`_
        - `pygraphviz <http://pygraphviz.github.io/>`_
        
        
        License
        -------
        
        3-Clause BSD.
        
        Author
        ------
        Arne Neumann
        
        
        People who downloaded this also like
        ------------------------------------
        
        - `SaltNPepper`_ (a converter framework for various linguistic data formats)
        
        .. _`SaltNPepper`: https://korpling.german.hu-berlin.de/p/projects/saltnpepper/wiki/
        
        
        
        News
        ====
        
        0.1.2 (2014-05-13)
        ------------------
        
        *Release date: 13-May-2014*
        
        * added basic `Geoff`_ and `Neo4j`_ exporter (not yet available via the command
          line)
        * added sphinx-based documentation
        
        .. _`Geoff`: http://www.neo4j.org/develop/python/geoff
        .. _`Neo4j`: http://www.neo4j.org/
        
        0.1.1 (2014-04-25)
        ------------------
        
        *Release date: 25-Apr-2014*
        
        * small improvements
        * added usage examples to readme
        * discoursegraphs script now uses the commandline interface of the merging module
        
        0.1.0 (2014-04-24)
        ------------------
        
        *Release date: 24-Apr-2014*
        
        * first public release
        * imports: RS3, TigerXML and an ad-hoc format for expletive annotation
        * merge these formats/files into a single multidigraph
        * generates simple dot/graphviz-based visualization
        
        
Keywords: corpus linguistics nlp graph networkx annotation
Platform: UNKNOWN
