Metadata-Version: 2.1
Name: docop-tasks-restricted
Version: 0.3.3
Summary: Tasks for docop that have more restrictive open source licensing
Author-email: Petri Savolainen <petri@koodaamo.fi>
License: GPLv3+
Project-URL: Homepage, https://github.com/koodaamo/docop-tasks-restricted
Project-URL: Repository, https://github.com/koodaamo/docop-tasks-restricted
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: trafilatura>=1.6.4

This package contains packaged tasks for Docop.

The tasks depend on code whose licenses are fairly restrictive for commercial use.

In particular, this package may contain code licensed under AGPL.

Be sure to respect the licenses.

For packaged tasks with more permissive licenses, see: https://github.com/koodaamo/docop-tasks-restricted

## html2text task

Extracts the plain-text content of an HTML string.

Expects HTML text in the `html` field of the document.
The output document will have the following fields set:
  - `text` field containing the plain-text content of the HTML
  - `fingerprint` field generated from the `text` field, e.g. for detecting changes
  - `modified` field indicating when the text last changed, as determined by the fingerprint, in HTTP Last-Modified format
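
The field contract above can be illustrated with a minimal sketch. It is not the package's actual implementation: the function signature, the `extract_text` and `now` parameters, the use of a plain `dict` as the document, and the choice of SHA-256 for the fingerprint are all assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Callable, Optional


def html2text(doc: dict,
              extract_text: Callable[[str], Optional[str]],
              now: Optional[datetime] = None) -> dict:
    """Hypothetical sketch: set `text`, `fingerprint` and `modified` on doc."""
    # The extractor may return None when nothing usable is found.
    text = extract_text(doc["html"]) or ""

    # Assumption: a SHA-256 hex digest serves as the fingerprint.
    fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()

    if doc.get("fingerprint") != fingerprint:
        # Text is new or has changed: record the time as an HTTP-date,
        # the format used by the Last-Modified header.
        when = now or datetime.now(timezone.utc)
        doc["modified"] = format_datetime(when, usegmt=True)

    doc["text"] = text
    doc["fingerprint"] = fingerprint
    return doc
```

Since the package depends on trafilatura, a call such as `html2text(doc, trafilatura.extract)` would be one plausible way to supply the extractor. Because `modified` is only updated when the fingerprint differs, re-running the sketch on unchanged HTML leaves the timestamp untouched.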
