Metadata-Version: 2.1
Name: dbb-ranking-parser
Version: 0.4.2
Summary: Extract league rankings from the DBB (Deutscher Basketball Bund e.V.) website.
Home-page: http://homework.nwsnet.de/releases/4a51/#dbb-ranking-parser
Author: Jochen Kupperschmidt
Author-email: homework@nwsnet.de
License: MIT
Keywords: basketball,rankings,scrape
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Other Audience
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Other/Nonlisted Topic
Requires-Python: >=3.6
Requires-Dist: lxml (>=4.6.2)

DBB Ranking Parser
==================

Extract league rankings from the DBB_ (Deutscher Basketball Bund e.V.)
website.

This library has been extracted from the web application behind the
website of the `BTB Royals Oldenburg`_ (a basketball team from
Oldenburg, Germany) where it has proven itself for many, many years.


Requirements
------------

- Python_ 3.6+
- lxml_


Installation
------------

Install this package via pip_:

.. code:: sh

    $ pip install dbb-ranking-parser

Because of the dependency on lxml_, this will also require the header
files for the targeted Python_ version as well as those for libxml2_ and
libxslt_.

On `Debian Linux`_, one should be able to install these from the
distribution's repositories (as the 'root' user):

.. code:: sh

    # aptitude update
    # aptitude install python3.7-dev libxml2-dev libxslt1-dev

Alternatively (for example, if those packages are not yet installed),
it might be easier to install Debian's pre-built binary package for
lxml_ instead:

.. code:: sh

    # aptitude update
    # aptitude install python3-lxml


Usage
-----

To fetch and parse a league ranking, the appropriate URL is required.

It can be obtained on the DBB_ website. On every league's ranking page
there should be a link to a (non-"XL") HTML print version.

Its target URL should look like this (assuming the league's ID is
12345):
``http://www.basketball-bund.net/public/tabelle.jsp?print=1&viewDescKey=sport.dbb.views.TabellePublicView/index.jsp_&liga_id=12345``

The league ID has to be identified manually in any of the URLs specific
to that league (ranking, schedule, stats).

For convenience, specifying only the league ID is sufficient; the URL
will be assembled automatically. (Obviously, this might break when the
URL structure changes on the DBB website.)
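
As a rough illustration of that assembly step, the following sketch
fills a league ID into the URL pattern shown above. The name
``assemble_url`` and the template constant are hypothetical; the
library's internal implementation may differ.

.. code:: python

    # URL pattern copied from the example above; only the league ID varies.
    URL_TEMPLATE = (
        'http://www.basketball-bund.net/public/tabelle.jsp'
        '?print=1'
        '&viewDescKey=sport.dbb.views.TabellePublicView/index.jsp_'
        '&liga_id={:d}'
    )


    def assemble_url(league_id: int) -> str:
        """Return the print version URL for a league (hypothetical helper)."""
        return URL_TEMPLATE.format(league_id)


    print(assemble_url(12345))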


Programmatically
~~~~~~~~~~~~~~~~

.. code:: python

    from dbbrankingparser import load_ranking_for_league


    league_id = 12345

    ranking = list(load_ranking_for_league(league_id))

    top_team = ranking[0]
    print('Top team:', top_team['name'])

The URL can be specified explicitly, too:

.. code:: python

    from dbbrankingparser import load_ranking_from_url


    URL = '<see example above>'

    ranking = list(load_ranking_from_url(URL))

Note that a call to a ``load_ranking_*`` function returns a generator,
which can be iterated over only once. To keep its elements around, and
also to access them by index, they can be fed into a list (as shown
above).
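
The difference can be seen without any network access. The sketch below
uses a hypothetical stand-in generator, ``fake_ranking``, which is not
part of the library and merely yields dictionaries with the ``rank``
and ``name`` keys that appear in this document's examples.

.. code:: python

    from typing import Dict, Iterator, Union


    def fake_ranking() -> Iterator[Dict[str, Union[int, str]]]:
        """Hypothetical stand-in for a ``load_ranking_*`` call."""
        yield {'rank': 1, 'name': 'Team ACME'}
        yield {'rank': 2, 'name': 'Team Example'}


    rows = fake_ranking()
    first_pass = [row['name'] for row in rows]
    second_pass = [row['name'] for row in rows]  # generator is exhausted

    # A list keeps the elements and supports access by index.
    ranking = list(fake_ranking())
    top_team = ranking[0]

Here, ``second_pass`` ends up empty because the generator has already
been consumed, while ``ranking`` can be iterated and indexed repeatedly.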


On the Command Line
~~~~~~~~~~~~~~~~~~~

The package includes a command line script to retrieve a league's
rankings non-programmatically, as JSON. It requires a league ID as its
sole argument:

.. code:: sh

    $ dbb-ranking-parser get 12345
    [{"name": "Team ACME", "rank": 1, …}]
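
The JSON output can be consumed by other programs. As a minimal sketch,
assuming the ``dbb-ranking-parser`` script is installed and on the
``PATH``, a Python program could invoke it and decode the result. The
helper names ``parse_ranking_json`` and ``get_ranking_via_cli`` are
hypothetical.

.. code:: python

    import json
    import subprocess
    from typing import Any, Dict, List


    def parse_ranking_json(text: str) -> List[Dict[str, Any]]:
        """Decode the script's JSON output into a list of dictionaries."""
        return json.loads(text)


    def get_ranking_via_cli(league_id: int) -> List[Dict[str, Any]]:
        """Run ``dbb-ranking-parser get`` and return the decoded ranking."""
        completed = subprocess.run(
            ['dbb-ranking-parser', 'get', str(league_id)],
            stdout=subprocess.PIPE,
            universal_newlines=True,  # decode the output as text
            check=True,  # raise if the script exits with an error
        )
        return parse_ranking_json(completed.stdout)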


Via HTTP
~~~~~~~~

Also included is an HTTP wrapper around the parser.

To spin up the server:

.. code:: sh

    $ dbb-ranking-parser serve
    Listening for HTTP requests on 127.0.0.1:8080 ...

The server will attempt to look up a ranking for requests with a URL
path of the form ``/<league id>``:

.. code:: sh

    $ curl http://localhost:8080/12345
    [{"name": "Team ACME", "rank": 1, …}]
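
The same request can be made from Python using only the standard
library. This is a sketch under the assumption that the server runs on
the default address shown above and responds with UTF-8 encoded JSON;
the helper names ``ranking_url`` and ``fetch_ranking`` are hypothetical.

.. code:: python

    import json
    from typing import Any, Dict, List
    from urllib.request import urlopen


    def ranking_url(
        league_id: int, host: str = 'localhost', port: int = 8080
    ) -> str:
        """Build the request URL of the form ``/<league id>``."""
        return 'http://{}:{:d}/{:d}'.format(host, port, league_id)


    def fetch_ranking(
        league_id: int, host: str = 'localhost', port: int = 8080
    ) -> List[Dict[str, Any]]:
        """Fetch a ranking from a running server and decode the JSON."""
        with urlopen(ranking_url(league_id, host, port)) as response:
            # Assumption: the response body is UTF-8 encoded JSON.
            return json.loads(response.read().decode('utf-8'))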


Docker
------

DBB Ranking Parser can also be run in a Docker_ container. This avoids
the local creation of a virtual environment and the installation of the
packages, and can be useful in a deployment where containers are used.

Building a Docker_ image requires:

- Docker_ being installed
- a source copy of the ``dbb-ranking-parser`` package

In the package path:

.. code:: sh

    $ docker build -t dbb-ranking-parser .

This should build a Docker_ image that is based upon `Alpine Linux`_ and
includes Python_ 3, lxml_ and the DBB Ranking Parser itself. It should
be roughly 70 MB in size.

The Docker container accepts the same arguments as the command line
script.

To fetch a single ranking:

.. code:: sh

    $ docker run --rm dbb-ranking-parser get 12345
    [{"name": "Team ACME", "rank": 1, …}]

To spin up the HTTP server on port 7000 of the host machine:

.. code:: sh

    $ docker run -p 7000:8080 --rm dbb-ranking-parser serve --host 0.0.0.0 --port 8080

The ``--rm`` option causes a container (but not the image) to be removed
after it exits.


.. _DBB:                  https://www.basketball-bund.net/
.. _BTB Royals Oldenburg: http://www.btbroyals.de/
.. _Python:               https://www.python.org/
.. _pip:                  http://www.pip-installer.org/
.. _lxml:                 https://lxml.de/
.. _libxml2:              http://xmlsoft.org/
.. _libxslt:              http://xmlsoft.org/XSLT/
.. _Debian Linux:         https://www.debian.org/
.. _Docker:               https://www.docker.com/
.. _Alpine Linux:         https://alpinelinux.org/


:Copyright: 2006-2021 Jochen Kupperschmidt
:License: MIT, see LICENSE for details.
:Website: http://homework.nwsnet.de/releases/4a51/#dbb-ranking-parser

DBB Ranking Parser Changelog
============================


Version 0.4.2
-------------

Released on February 20, 2021

- Fixed description of how to run the HTTP server in a Docker container.


Version 0.4.1
-------------

Released on February 13, 2021

- Fixed reStructuredText issues in changelog which prevented a release
  on PyPI.


Version 0.4
-----------

Released on February 13, 2021

- Added support for Python 3.6, 3.7, 3.8, and 3.9.
- Dropped support for Python 3.4 and 3.5 (which are end-of-life).
- Updated lxml to at least version 4.6.2.
- Moved package metadata from ``setup.py`` to ``setup.cfg``.
- Switched to a ``src/`` project layout.
- Added type hints (PEP 484).
- Ported tests from ``unittest`` to pytest.
- Merged basic and HTTP server command line interfaces into a single
  argument parser with subcommands ``get`` and ``serve``. Removed
  ``dbb-ranking-server`` entrypoint.
- Renamed command line entrypoint to ``dbb-ranking-parser``.
- Added command line option ``--version`` to show the application's
  version.
- Merged the previous three ``Dockerfile`` files into a single one.
- Updated and simplified Docker image and build process by upgrading
  Alpine Linux to 3.13 and installing lxml as a binary package,
  removing the need for local compilation.


Version 0.3.1
-------------

Released on March 10, 2016

- Added the ability to specify the HTTP server's host and port on the
  command line.
- Fixed ``Dockerfile`` for the HTTP server to bind it to a public address
  instead of localhost so that exposing the port actually works.


Version 0.3
-----------

Released on March 8, 2016

- Added HTTP server that wraps the parser and responds with rankings as
  JSON.
- Added ``Dockerfile`` files for the command line script and the HTTP
  server.


Version 0.2
-----------

Released on March 6, 2016

- It is now sufficient to specify just the league ID instead of the full
  URL. The latter is still possible, though.
- Added a command line script to retrieve a league's ranking as JSON.
- Return nothing when parsing irrelevant HTML table rows.
- Return extracted ranks as a generator instead of a list.
- Split code over several modules.


Version 0.1
-----------

Released on March 5, 2016

- First official release.


