Metadata-Version: 2.1
Name: PydeequDynamicParser
Version: 0.2
Summary: Python library which makes it possible to use validation rules in pydeequ based on json structures.
Home-page: https://github.com/wesleywilian/pydeequ-dynamic-parser
Author: wesleywilian
License: apache-2.0
Download-URL: https://github.com/wesleywilian/pydeequ-dynamic-parser/archive/v0.2.tar.gz
Keywords: pydeequ,json,data,quality,rules
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Description-Content-Type: text/markdown
License-File: LICENSE


# pydeequ-dynamic-parser

A Python library that lets you define [pydeequ](https://github.com/awslabs/python-deequ) validation rules as JSON/dict structures.

# Installing

```shell
pip install PydeequDynamicParser
```

# Usage

```python
# User-defined checks: each entry names a Check method and its keyword parameters
all_checks = [{"name": "isUnique", "parameters": {"column": "COLUMN_NAME", "hint": "Hint here"}},
              {"name": "satisfies", "parameters": {"columnCondition": "(LENGTH(COLUMN_NAME) = 11 OR LENGTH(COLUMN_NAME) = 14) ", "constraintName": "COLUMN_NAME length validate", "assertion": "lambda x: x == 1.0", "hint": None}},
              {"name": "containsEmail", "parameters": {"column": "COLUMN_NAME", "assertion": None, "hint": None}},
              {"name": "isComplete", "parameters": {"column": "COLUMN_NAME", "hint": None}}]

# Build the PyDeequ Check dynamically from "all_checks"
from pydeequ.checks import Check
from pydeequ.checks import CheckLevel
from pydeequ.verification import VerificationSuite
from pydeequ.verification import VerificationResult
import PydeequDynamicParser

# `spark` is an existing SparkSession and `df` is the DataFrame to validate
check = Check(spark, CheckLevel.Error, "Check Name")
check_instance_parsed = PydeequDynamicParser.Parser(check, all_checks).parse()
checkResult = VerificationSuite(spark).onData(df).addCheck(check_instance_parsed).run()
checkResult_df = VerificationResult.checkResultsAsDataFrame(spark, checkResult)

checkResult_df.toPandas()
```

As shown above, the `parse()` call translates the user's JSON/dict into a PyDeequ `Check` instance:

```python
import PydeequDynamicParser
check_instance_parsed = PydeequDynamicParser.Parser(check, all_checks).parse()
```
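
To make the translation step more concrete, here is a minimal sketch of how name-based dispatch of this kind could work. It is not the library's actual implementation: `StubCheck` and `apply_checks` are hypothetical names, the stub stands in for `pydeequ.checks.Check` so the example runs without Spark, and turning the `assertion` string into a callable with `eval` is an assumption about one possible approach.

```python
class StubCheck:
    """Hypothetical stand-in for pydeequ.checks.Check; it only records calls."""

    def __init__(self):
        self.calls = []

    def _record(self, name, **kwargs):
        self.calls.append((name, kwargs))
        return self  # return self so that calls can be chained

    def isComplete(self, column, hint=None):
        return self._record("isComplete", column=column, hint=hint)

    def satisfies(self, columnCondition, constraintName, assertion=None, hint=None):
        return self._record(
            "satisfies",
            columnCondition=columnCondition,
            constraintName=constraintName,
            assertion=assertion,
            hint=hint,
        )


def apply_checks(check, all_checks):
    """Call the method named by each entry, passing its parameters as keywords."""
    for entry in all_checks:
        params = dict(entry["parameters"])  # copy so the input is not mutated
        assertion = params.get("assertion")
        if isinstance(assertion, str):
            # Assumption: the string holds a lambda expression.
            # eval executes arbitrary code, so only use trusted rule definitions.
            params["assertion"] = eval(assertion)
        check = getattr(check, entry["name"])(**params)
    return check


example_checks = [
    {"name": "isComplete", "parameters": {"column": "COLUMN_NAME", "hint": None}},
    {"name": "satisfies", "parameters": {"columnCondition": "LENGTH(COLUMN_NAME) = 11",
                                         "constraintName": "COLUMN_NAME length validate",
                                         "assertion": "lambda x: x == 1.0",
                                         "hint": None}},
]
stub = apply_checks(StubCheck(), example_checks)
```

Because each entry's `parameters` keys are passed as keyword arguments, they need to match the parameter names of the target `Check` method, as they do in the `all_checks` example above.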

# Currently supported validations
- Constraints
  - isUnique
  - satisfies
  - containsEmail
  - isComplete
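
If the rules are stored as JSON text, it may help to load them and reject unsupported names before handing them to the parser. The following is a sketch under stated assumptions: `SUPPORTED_CHECKS` and `load_checks` are hypothetical helpers that are not part of this library, and how the parser itself reacts to an unsupported name is not documented here.

```python
import json

# Names taken from the list of currently supported validations above.
SUPPORTED_CHECKS = {"isUnique", "satisfies", "containsEmail", "isComplete"}


def load_checks(json_text):
    """Parse a JSON array of checks and fail early on unsupported names."""
    checks = json.loads(json_text)  # JSON null becomes Python None
    unsupported = sorted({c["name"] for c in checks} - SUPPORTED_CHECKS)
    if unsupported:
        raise ValueError("Unsupported checks: " + ", ".join(unsupported))
    return checks


rules_json = """
[
  {"name": "isUnique", "parameters": {"column": "COLUMN_NAME", "hint": "Hint here"}},
  {"name": "isComplete", "parameters": {"column": "COLUMN_NAME", "hint": null}}
]
"""
all_checks = load_checks(rules_json)
```

The resulting `all_checks` list has the same shape as the one in the Usage section, so it can be passed to `PydeequDynamicParser.Parser(check, all_checks)` unchanged.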


