Metadata-Version: 2.1
Name: data-extractor
Version: 0.6.0.dev1
Summary: Combine XPath, CSS Selectors and JSONPath for Web data extracting.
Home-page: https://github.com/linw1995/data_extractor
License: MIT
Author: linw1995
Author-email: linw1995@icloud.com
Requires-Python: >=3.7,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Provides-Extra: docs
Provides-Extra: jsonpath-extractor
Provides-Extra: jsonpath-rw
Provides-Extra: jsonpath-rw-ext
Provides-Extra: lint
Provides-Extra: test
Requires-Dist: black (>=19.3b0,<20.0); extra == "lint"
Requires-Dist: blacken-docs (>=1.3,<2.0); extra == "lint"
Requires-Dist: cssselect (>=1.0.3,<2.0.0)
Requires-Dist: doc8 (>=0.8.0,<0.9.0); extra == "lint"
Requires-Dist: flake8 (>=3.7.8,<4.0.0); extra == "lint"
Requires-Dist: flake8-bugbear (>=19.8,<20.0); extra == "lint"
Requires-Dist: isort (>=4.3.21,<5.0.0); extra == "lint"
Requires-Dist: jsonpath-extractor (>=0.1.1,<0.2.0); extra == "lint" or extra == "docs" or extra == "jsonpath-extractor"
Requires-Dist: jsonpath-rw (>=1.4.0,<2.0.0); extra == "lint" or extra == "docs" or extra == "jsonpath-rw" or extra == "jsonpath-rw-ext"
Requires-Dist: jsonpath-rw-ext (>=1.2,<2.0); extra == "lint" or extra == "docs" or extra == "jsonpath-rw-ext"
Requires-Dist: lxml (>=4.3.0,<5.0.0)
Requires-Dist: mypy (>=0.730,<0.731); extra == "lint"
Requires-Dist: pygments (>=2.4,<3.0); extra == "lint"
Requires-Dist: pytest (>=5.2.0,<6.0.0); extra == "lint" or extra == "test"
Requires-Dist: pytest-cov (>=2.7.1,<3.0.0); extra == "test"
Requires-Dist: sphinx (>=2.2,<3.0); extra == "docs"
Project-URL: Documentation, https://data-extractor.readthedocs.io/en/latest/
Project-URL: Repository, https://github.com/linw1995/data_extractor
Description-Content-Type: text/x-rst

==============
Data Extractor
==============

|license| |Pypi Status| |Python version| |Package version| |PyPI - Downloads|
|GitHub last commit| |Code style: black| |Build Status| |codecov|
|Documentation Status|

Combine **XPath**, **CSS Selectors** and **JSONPath** for web data extraction.

Quickstart
<<<<<<<<<<

Installation
~~~~~~~~~~~~

Install the stable version from PyPI.

.. code-block:: shell

    pip install data-extractor

Or install the latest version from GitHub.

.. code-block:: shell

    pip install git+https://github.com/linw1995/data_extractor.git@master
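
According to the changelog, the JSONPath backends behind ``JSONExtractor`` are optional as of v0.6.0.dev1, and the package metadata declares the extras ``jsonpath-extractor``, ``jsonpath-rw`` and ``jsonpath-rw-ext``.
A minimal sketch, assuming pip's standard extras syntax, that installs the package together with one backend:

.. code-block:: shell

    # Pick one backend extra; jsonpath-extractor is only an example choice.
    pip install "data-extractor[jsonpath-extractor]"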

Usage
~~~~~

.. code-block:: python3

    from data_extractor import Field, Item, JSONExtractor


    class Count(Item):
        followings = Field(JSONExtractor("countFollowings"))
        fans = Field(JSONExtractor("countFans"))


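    class User(Item):
        # ``name=`` sets the key used in the result, so the attribute
        # itself can be called ``name_``.
        name_ = Field(JSONExtractor("name"), name="name")
        # ``default`` is used when the path matches nothing.
        age = Field(JSONExtractor("age"), default=17)
        # Items can be nested; ``count`` reads from the same user object.
        count = Count()


    # ``is_many=True`` yields a list with one result per matched user.
    assert User(JSONExtractor("data.users[*]"), is_many=True).extract(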
        {
            "data": {
                "users": [
                    {
                        "name": "john",
                        "age": 19,
                        "countFollowings": 14,
                        "countFans": 212,
                    },
                    {
                        "name": "jack",
                        "description": "",
                        "countFollowings": 54,
                        "countFans": 312,
                    },
                ]
            }
        }
    ) == [
        {"name": "john", "age": 19, "count": {"followings": 14, "fans": 212}},
        {"name": "jack", "age": 17, "count": {"followings": 54, "fans": 312}},
    ]

Changelog
<<<<<<<<<

v0.6.0.dev1
~~~~~~~~~~~

- 2459f7d Dev,New:Add GitHub Actions for CI
- a151a91 Dev,New:Add scripts/export_requirements_txt.sh
- f7cdaa3 Dev,Chg:Remove travis-ci
- f1d21fe New:Make different implementations of JSONExtractor optional
- 9f74619 Fix:Use __getattr__ on the module in the wrong way
- 25a8bf8 Dev,Fix:Cannot use pytest.mark.usefixtures() in pytest.param
- 8f51603 Dev,Chg:Upgrade poetry version in Makefile
- 21aa08e Dev,Chg:Test in two ways
- 4cb4678 Chg:Upgrade dependencies
- 4177b98 Dev,Fix:remove the venv before pretest installation
- 0175cde New:Add jsonpath-extractor as optional json extractor backend


.. |license| image:: https://img.shields.io/github/license/linw1995/data_extractor.svg
    :target: https://github.com/linw1995/data_extractor/blob/master/LICENSE

.. |Pypi Status| image:: https://img.shields.io/pypi/status/data_extractor.svg
    :target: https://pypi.org/project/data_extractor

.. |Python version| image:: https://img.shields.io/pypi/pyversions/data_extractor.svg
    :target: https://pypi.org/project/data_extractor

.. |Package version| image:: https://img.shields.io/pypi/v/data_extractor.svg
    :target: https://pypi.org/project/data_extractor

.. |PyPI - Downloads| image:: https://img.shields.io/pypi/dm/data-extractor.svg
    :target: https://pypi.org/project/data_extractor

.. |GitHub last commit| image:: https://img.shields.io/github/last-commit/linw1995/data_extractor.svg
    :target: https://github.com/linw1995/data_extractor

.. |Code style: black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/ambv/black

.. |Build Status| image:: https://img.shields.io/github/workflow/status/linw1995/data_extractor/Python%20package
    :target: https://github.com/linw1995/data_extractor/actions?query=workflow%3A%22Python+package%22

.. |codecov| image:: https://codecov.io/gh/linw1995/data_extractor/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/linw1995/data_extractor

.. |Documentation Status| image:: https://readthedocs.org/projects/data-extractor/badge/?version=latest
    :target: https://data-extractor.readthedocs.io/en/latest/?badge=latest

