Metadata-Version: 2.1
Name: pii-extract-base
Version: 0.7.0
Summary: Extraction of PII from text chunks
Home-page: https://github.com/piisa/pii-extract-base
Download-URL: https://github.com/piisa/pii-extract-base/tarball/v0.7.0
Author: Paulo Villegas
Author-email: paulo.vllgs@gmail.com
License: Apache
Keywords: PIISA, PII
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Development Status :: 4 - Beta
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Provides-Extra: test
License-File: LICENSE

# Pii Extract Base


This repository builds a Python package that provides a base library for PII
detection in Source Documents, i.e. the extraction of PII (Personally
Identifiable Information, also known as Personal Data) items present in a
document.

The package itself does **not** implement any PII detection tasks; it only
provides the base infrastructure for the process. Detection tasks must be
supplied externally.


## Requirements

The package needs:
 * at least Python 3.8
 * the pii-data base package
 * one or more pii-extract plugins (to perform the actual detection work)

## Usage

The package can be used:
 * As an API, in two flavors: function-based API and object-based API
 * As a command-line tool

For details, see the usage document.


## Building

The provided Makefile can be used to build, test and install the package:
 * `make pkg` will build the Python package, creating a file that can be
   installed with `pip`
 * `make unit` will launch all unit tests (using pytest, so pytest must be
   available)
 * `make install` will install the package in a Python virtualenv. The
   virtualenv is chosen in this order:
     - the one defined in the `VENV` environment variable, if it is defined
     - the virtualenv currently activated in the shell, if there is one
     - otherwise, the default `/opt/venv/pii` (it will be created if it does
       not exist)
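
The virtualenv selection order described above can be illustrated with a
minimal Python sketch. This is not the Makefile's actual code: the function
name `resolve_venv` is hypothetical, and the sketch assumes that an activated
virtualenv is detected through the `VIRTUAL_ENV` environment variable, which
the standard activation scripts set.

```python
import os
from typing import Mapping, Optional

# Default location named in this README
DEFAULT_VENV = "/opt/venv/pii"


def resolve_venv(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the virtualenv path, following the precedence described above.

    `resolve_venv` is a hypothetical helper written for illustration only.
    """
    if env is None:
        env = os.environ

    # 1. An explicit VENV environment variable takes priority
    explicit = env.get("VENV")
    if explicit:
        return explicit

    # 2. Otherwise use the currently activated virtualenv, if any
    #    (assumption: detected via VIRTUAL_ENV, set by the activate scripts)
    active = env.get("VIRTUAL_ENV")
    if active:
        return active

    # 3. Fall back to the default location
    return DEFAULT_VENV


if __name__ == "__main__":
    print(resolve_venv())
```

With this precedence, setting `VENV` before calling `make install` selects the
target virtualenv even when another one is activated in the shell.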



