Metadata-Version: 2.1
Name: odd-dbt
Version: 0.1.16
Summary: OpenDataDiscovery Action for dbt
License: Apache-2.0
Keywords: Open Data Discovery,dbt,Metadata,Data Discovery,Data Observability
Author: Mateusz Kulas
Author-email: mkulas@provectus.com
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: dbt-core (>=1.4.5,<2.0.0)
Requires-Dist: dbt-postgres (==1.4.5)
Requires-Dist: dbt-redshift (==1.4.0)
Requires-Dist: dbt-snowflake (>=1.4.1,<2.0.0)
Requires-Dist: funcy (>=1.17,<2.0)
Requires-Dist: loguru (>=0.6.0,<0.7.0)
Requires-Dist: odd-models (>=2.0.24,<3.0.0)
Requires-Dist: oddrn-generator (>=0.1.70,<0.2.0)
Requires-Dist: psycopg2-binary (>=2.9.6,<3.0.0)
Requires-Dist: sqlalchemy (>=1.4.46,<2.0.0)
Requires-Dist: typer[all] (>=0.7.0,<0.8.0)
Description-Content-Type: text/markdown

# OpenDataDiscovery dbt test metadata collector
[![PyPI version](https://badge.fury.io/py/odd-dbt.svg)](https://badge.fury.io/py/odd-dbt)

A CLI tool that automatically parses dbt test results and ingests them into the OpenDataDiscovery Platform.
It can be used as a standalone CLI tool or within the [ODD CLI](https://github.com/opendatadiscovery/odd-cli) package, which provides some useful additional features.

## Installation
```bash
pip install odd-dbt
```

## Command options
```
╭─ Options ─────────────────────────────────────────────────────────────╮
│    --project-dir                 PATH  [default: Path().cwd()odd-dbt] │
│    --target                      TEXT  [default:None]                 │
│    --profile-name                TEXT  [default:None]                 │
│ *  --host    -h                  TEXT  [env var: ODD_PLATFORM_HOST]   │
│ *  --token   -t                  TEXT  [env var: ODD_PLATFORM_TOKEN]  │
│    --dbt-host                    TEXT  [default: localhost]           │
│    --help                              Show this message and exit.    │
╰───────────────────────────────────────────────────────────────────────╯
```


## Command run example
The command needs a collector token. See how to create a [collector token](https://docs.opendatadiscovery.org/configuration-and-deployment/trylocally#create-collector-entity) in the platform documentation.
```bash
odd_dbt_test --host http://localhost:8080 --token <COLLECTOR_TOKEN>
```
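
When the command is launched from a Python script (for example in a CI job), the host and token can be supplied through the `ODD_PLATFORM_HOST` and `ODD_PLATFORM_TOKEN` environment variables listed in the options above instead of flags. The following is a minimal sketch, assuming `odd_dbt_test` is available on `PATH`; the helper names `build_invocation` and `run_odd_dbt_test` are hypothetical and not part of the package.

```python
import os
import subprocess
from typing import Optional


def build_invocation(host: str, token: str, project_dir: Optional[str] = None):
    """Return (argv, env) for calling odd_dbt_test (hypothetical helper)."""
    argv = ["odd_dbt_test"]
    if project_dir is not None:
        # --project-dir is one of the documented command options
        argv += ["--project-dir", project_dir]
    # Pass the platform host and token via the documented environment
    # variables so the token does not appear on the command line.
    env = dict(os.environ)
    env["ODD_PLATFORM_HOST"] = host
    env["ODD_PLATFORM_TOKEN"] = token
    return argv, env


def run_odd_dbt_test(host: str, token: str, project_dir: Optional[str] = None) -> int:
    """Run the CLI and return its exit code (assumes odd_dbt_test is on PATH)."""
    argv, env = build_invocation(host, token, project_dir)
    return subprocess.run(argv, env=env, check=False).returncode
```

Calling `run_odd_dbt_test("http://localhost:8080", "<COLLECTOR_TOKEN>")` would then be equivalent to the shell command above.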



## Supported data sources
| Source    | dbt adapter version |
| --------- | ------------------- |
| Snowflake | 1.4.1               |
| Redshift  | 1.4.0               |
| Postgres  | 1.4.5               |
| MSSQL     |                     |

## Requirements
To ingest Quality Test entities, the library requires the corresponding dataset entities to already exist in the platform.
For example, to ingest a data quality test for a Snowflake table, the entity for that table must already be present in the platform.

## Supported tests
The library supports the basic tests provided by dbt:
- `unique`: values in the column should be unique
- `not_null`: values in the column should not contain null values
- `accepted_values`: column should only contain values from list specified in the test config
- `relationships`: each value in the selected column of the model must exist in the specified field of the referenced table (also known as referential integrity)
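
The semantics of the four tests can be illustrated on plain Python lists. This is only a sketch of what each test checks, not how dbt or odd-dbt implements it (dbt compiles these tests to SQL). The function names mirror the test names for readability, and ignoring `None` in every check except `not_null` is an assumption modelled on SQL NULL-comparison behaviour.

```python
def unique(values):
    """True when no non-null value occurs more than once."""
    non_null = [v for v in values if v is not None]
    return len(non_null) == len(set(non_null))


def not_null(values):
    """True when the column contains no null (None) values."""
    return all(v is not None for v in values)


def accepted_values(values, allowed):
    """True when every non-null value is in the allowed list."""
    allowed_set = set(allowed)
    return all(v in allowed_set for v in values if v is not None)


def relationships(child_values, parent_values):
    """True when every non-null child value exists in the parent column."""
    parent_set = set(parent_values)
    return all(v in parent_set for v in child_values if v is not None)


# Illustrative data: order rows referencing customer ids
customer_ids = [1, 2, 3]
order_customer_ids = [1, 1, 3]
order_statuses = ["placed", "shipped", "returned"]

print(unique(customer_ids))                              # True
print(not_null(order_customer_ids))                      # True
print(accepted_values(order_statuses, ["placed", "shipped"]))  # False
print(relationships(order_customer_ids, customer_ids))   # True
```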

## ODDRN generation for datasets
The `host_settings` of the ODDRN generators required for source datasets are loaded from `.dbt/profiles.yml`.

Profiles inside the file look different for each type of data source.

**Snowflake**: `host_settings` is created from the `account` field, whose value should be `<account_identifier>`.
For example, the URL for an account uses the format `<account_identifier>.snowflakecomputing.com`.
Example Snowflake account identifier: `hj1234.eu-central-1`.

**Redshift** and **Postgres**: `host_settings` is loaded from the `host` field.

Example Redshift host: `redshift-cluster-example.123456789.eu-central-1.redshift.amazonaws.com`
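
The field lookup described in this section can be sketched as follows. This is an illustration under stated assumptions, not the library's actual code: `host_settings_from_target` is a hypothetical helper, the target is represented as a plain dict as it would look after parsing one output of `profiles.yml`, and appending `.snowflakecomputing.com` to the Snowflake account identifier is an assumption based on the URL format mentioned above.

```python
def host_settings_from_target(target: dict) -> str:
    """Pick the host_settings value from a parsed profile target (hypothetical)."""
    adapter = target["type"]  # dbt adapter name, e.g. "snowflake"
    if adapter == "snowflake":
        # Assumption: the host is built from the account identifier
        return f"{target['account']}.snowflakecomputing.com"
    if adapter in ("redshift", "postgres"):
        # Redshift and Postgres profiles carry the host directly
        return target["host"]
    raise ValueError(f"unsupported adapter type: {adapter}")


snowflake_target = {"type": "snowflake", "account": "hj1234.eu-central-1"}
redshift_target = {
    "type": "redshift",
    "host": "redshift-cluster-example.123456789.eu-central-1.redshift.amazonaws.com",
}

print(host_settings_from_target(snowflake_target))
# hj1234.eu-central-1.snowflakecomputing.com
print(host_settings_from_target(redshift_target))
# redshift-cluster-example.123456789.eu-central-1.redshift.amazonaws.com
```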

