Metadata-Version: 1.1
Name: hepcrawl
Version: 0.1.0
Summary: Scrapy project for feeds into INSPIRE-HEP (http://inspirehep.net).
Home-page: https://github.com/inspirehep/hepcrawl
Author: CERN
Author-email: admin@inspirehep.net
License: Revised BSD License
Description: ..
            This file is part of hepcrawl.
            Copyright (C) 2015 CERN.
        
            hepcrawl is free software; you can redistribute it and/or modify it
            under the terms of the Revised BSD License; see LICENSE file for
            more details.
        
        
        ==========
         HEPcrawl
        ==========
        
        HEPcrawl is a harvesting library based on Scrapy (http://scrapy.org) for INSPIRE-HEP
        (http://inspirehep.net) that focuses on automatic and semi-automatic retrieval of
        new content from all the sources the site aggregates, in particular content from
        major and minor publishers in the field of High-Energy Physics.
        
        The project is currently in an early stage of development.
        
        Installation for developers
        ===========================
        
        We start by creating a virtual environment for our Python packages:
        
        .. code-block:: shell
        
            mkvirtualenv hepcrawl
            cdvirtualenv
            mkdir src && cd src
        
        
        Now we grab the code and install it in development mode:
        
        .. code-block:: shell
        
            git clone https://github.com/inspirehep/hepcrawl.git
            cd hepcrawl
            pip install -e .
        
        
        Development mode ensures that any changes you make to the sources are picked up
        automatically, so there is no need to reinstall after changing something.
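        
        One way to confirm that the development install points at our checkout is to
        ask Python where it would import the package from. The sketch below assumes
        Python 3.4 or later (for ``importlib.util.find_spec``); the helper name
        ``module_location`` is hypothetical:
        
        .. code-block:: python
        
            import importlib.util
        
        
            def module_location(name):
                """Return the file a top-level module would be imported from, or None."""
                # find_spec() returns None when a top-level module cannot be found.
                spec = importlib.util.find_spec(name)
                return spec.origin if spec is not None else None
        
        
            # With a development install, this is expected to print a path inside
            # the cloned ``src/hepcrawl`` directory rather than ``site-packages``.
            print(module_location("hepcrawl"))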
        
        Finally, run the tests to make sure everything is set up correctly:
        
        .. code-block:: shell
        
            python setup.py test
        
        
        Run example crawler
        ===================
        
        Thanks to the command line tools provided by Scrapy, we can easily test the
        spiders as we are developing them. Here is an example using the simple sample
        spider:
        
        .. code-block:: console
        
            cdvirtualenv src/hepcrawl
            scrapy crawl Sample -a source_file=file://`pwd`/tests/responses/world_scientific/sample_ws_record.xml
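        
        The ``source_file`` argument above is a ``file://`` URI built from the current
        directory. As a rough sketch, the same command line can be assembled in Python,
        which avoids shell quoting problems with unusual paths. The helper name
        ``build_crawl_command`` is an assumption for illustration, and the example
        assumes Python 3.6 or later so that ``resolve()`` accepts paths that do not
        exist yet:
        
        .. code-block:: python
        
            from pathlib import Path
        
        
            def build_crawl_command(spider, source_path):
                """Build the argument list for ``scrapy crawl`` with a local source file."""
                # as_uri() requires an absolute path, so resolve relative paths first.
                source_uri = Path(source_path).resolve().as_uri()
                return ["scrapy", "crawl", spider, "-a", "source_file=" + source_uri]
        
        
            command = build_crawl_command(
                "Sample", "tests/responses/world_scientific/sample_ws_record.xml"
            )
            print(" ".join(command))
        
        The resulting list could then be handed to ``subprocess.call`` from the
        ``src/hepcrawl`` directory, although running the command directly in the shell
        as shown earlier works just as well.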
        
        
        Thanks for contributing!
        
        
        Changes
        =======
        
        Version 0.1.0 (2015-10-26)
        
        - Initial commit
        
Platform: any
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
