Metadata-Version: 2.1
Name: modern-data-integration-tool
Version: 0.1.0
Summary: A multiheaded modern data bridging package based on pipeline manifests to integrate between any modern (and old) data stack tools
Home-page: https://github.com/Broomva/modern_data_integration_tool
Author: Carlos D. Escobar-Valbuena
Author-email: carlosdavidescobar@gmail.com
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: databricks-session
Provides-Extra: protobuf
Requires-Dist: protobuf (<5.0.0dev) ; extra == 'protobuf'

# Modern Data Integration Tool
**A multiheaded modern data bridging package based on pipeline manifests to integrate between any modern (and old) data stack tools**


## Setup

### Quick Install

```shell
python -m pip install mdit
```

### Build from source

Clone the repository

```shell
git clone https://github.com/Broomva/mdit.git
```

Install the package

```shell
cd mdit && make install
```

### Build manually

After cloning, create a virtual environment

```shell
conda create -n mdit python=3.10
conda activate mdit
```

Install the requirements

```shell
pip install -r requirements.txt
```

Run the Python installation

```shell
python setup.py install
```

## Usage

The deployment requires a `.env` file in the local working directory:

```shell
touch .env
```

It should define the following keys:

```toml
databricks_experiment_name=''
databricks_experiment_id=''
databricks_host=''
databricks_token=''
databricks_username=''
databricks_password=''
databricks_cluster_id=''
```
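
Before creating a session, it can help to confirm that the `.env` file defines every key listed above. The following is a minimal sketch using only the standard library. It does not reflect how `mdit` itself reads the file. The helper names `parse_env_text` and `missing_keys` are hypothetical, and treating every key as required is an assumption (a given setup may only need a token, or only a username and password).

```python
from __future__ import annotations

from pathlib import Path

# Keys copied from the schema above. Assumption: all of them are required.
REQUIRED_KEYS = (
    "databricks_experiment_name",
    "databricks_experiment_id",
    "databricks_host",
    "databricks_token",
    "databricks_username",
    "databricks_password",
    "databricks_cluster_id",
)


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY='value' lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    print("Missing or empty keys:", missing_keys(parse_env_text(text)))
```

Running the script from the folder that contains `.env` prints the keys that still need a value. An empty list means every key in the schema is filled in.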

```python
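# Assumption: both session classes are exported from the top-level `mdit` package.
from mdit import DatabricksMLFlowSession, DatabricksSparkSession

# Create a Databricks Spark session
spark = DatabricksSparkSession().get_session()

# Connect to the MLflow artifact server
mlflow_session = DatabricksMLFlowSession().get_session()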
```
