Metadata-Version: 2.1
Name: d3m-common-primitives
Version: 2022.2.17
Summary: D3M common primitives
Home-page: https://gitlab.com/datadrivendiscovery/common-primitives
Author: common-primitives
License: Apache-2.0
Platform: UNKNOWN
Classifier: License :: OSI Approved :: Apache Software License
Description-Content-Type: text/markdown
Requires-Dist: d3m (==2021.12.19)
Requires-Dist: datamart-isi (<2.2,>=2.1)
Requires-Dist: datamart-rest (==0.2.7)
Requires-Dist: datamart (==2021.3.17)
Requires-Dist: imageio (<=2.6.0,>=2.3.0)
Requires-Dist: lightgbm (<=2.3.0,>=2.2.2)
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: pillow (==7.1.2)
Requires-Dist: pyprctl (<0.2,>=0.1)
Requires-Dist: scikit-learn
Requires-Dist: shap (<=0.40.0,>=0.37.0)
Requires-Dist: xgboost (<=0.90,>=0.81)
Provides-Extra: opencv
Requires-Dist: opencv-python (<4.5.58,>=4.1) ; extra == 'opencv'
Provides-Extra: opencv-headless
Requires-Dist: opencv-python-headless (<=4.5.4.58,>=4.1) ; extra == 'opencv-headless'

# Common D3M primitives

A common set of primitives for the D3M project, maintained together.
It contains example primitives, various glue primitives, and other primitives contributed
by performers.

## Installation

This package works on Python 3.6+ and pip 19+.
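
As a quick sanity check before installing, the version requirements above can be expressed as a small sketch. The function names `python_is_supported` and `pip_is_supported` are hypothetical helpers written for this illustration and are not part of this package:

```python
import sys
from typing import Tuple


def python_is_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if the (major, minor) interpreter version is 3.6 or newer."""
    return version >= (3, 6)


def pip_is_supported(pip_version: str) -> bool:
    """Return True if a pip version string such as "19.0.3" has major version 19 or newer."""
    major = int(pip_version.split(".")[0])
    return major >= 19


if __name__ == "__main__":
    # The pip version string can be read from the output of `pip3 --version`.
    print("Python supported:", python_is_supported())
```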

This package has additional dependencies, which are specified in the primitives' metadata,
but if you are installing the package manually, you first have to run the following (on Ubuntu):

```
$ apt-get install libopenblas-dev ffmpeg
$ pip3 install python-prctl
```

To install common primitives from inside a cloned repository, run:

```
$ pip3 install -e .
```

When cloning the repository, clone it recursively to also get its git submodules:

```
$ git clone --recursive https://gitlab.com/datadrivendiscovery/common-primitives.git
```

## Changelog

See [HISTORY.md](./HISTORY.md) for a summary of changes to this package.

## Repository structure

The `master` branch contains the latest code of common primitives made against the latest stable
release of the [`d3m` core package](https://gitlab.com/datadrivendiscovery/d3m) (its `master` branch).
The `devel` branch contains the latest code of common primitives made against the
future release of the `d3m` core package (its `devel` branch).

Releases are [tagged](https://gitlab.com/datadrivendiscovery/d3m/tags) but they are not made
regularly. Each primitive also has its own version, which is not related to the package version.
Generally, it is best to just use the latest code available in the `master` or `devel`
branch (depending on which version of the core package you are using).

## Testing locally

For each commit to this repository, tests run automatically in the
[GitLab CI](https://gitlab.com/datadrivendiscovery/common-primitives/pipelines). 

If you don't want to wait for the GitLab CI test results and would rather run the tests locally,
you can install and use the [GitLab runner](https://docs.gitlab.com/runner/install/) on your system.

With the local GitLab runner, you can run the tests defined in the [.gitlab-ci.yml](.gitlab-ci.yml)
file of this repository, such as:

```
$ gitlab-runner exec docker style_check
$ gitlab-runner exec docker type_check
```

You can also just try to run tests available under `/tests` by running:

```
$ python3 run_tests.py
```

## Contribute

Feel free to contribute more primitives to this repository. The idea is to build
a common set of primitives which can both serve as examples and allow shared
maintenance of some primitives, especially glue primitives.

All primitives are written in Python 3 and are type checked using
[mypy](http://www.mypy-lang.org/), so typing annotations are required.
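
To illustrate what fully annotated code looks like, here is a minimal sketch of a type-annotated helper. It is not a D3M primitive and does not use the `d3m` API; the function `select_columns` and its parameters are hypothetical and exist only to show annotations that mypy can check:

```python
from typing import List, Optional, Sequence


def select_columns(row: Sequence[float], indices: Optional[List[int]] = None) -> List[float]:
    """Return the values of `row` at `indices`, or all values when `indices` is None."""
    if indices is None:
        return list(row)
    return [row[i] for i in indices]


if __name__ == "__main__":
    print(select_columns([1.0, 2.0, 3.0], [0, 2]))
```

Saving this to a file and running `mypy` on it should report no errors, whereas a call such as `select_columns("abc")` would be flagged because a string is not a sequence of floats.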

## About Data Driven Discovery Program

The DARPA Data Driven Discovery (D3M) Program is researching ways to get machines to build
machine learning pipelines automatically. It is split into three layers:
TA1 (primitives), TA2 (systems which combine primitives automatically into pipelines
and execute them), and TA3 (end-user interfaces).


