Metadata-Version: 2.1
Name: pyplatform
Version: 0.1.0
Summary: Pyplatform is a data analytics platform architecture built around Google BigQuery in a hybrid cloud environment.
Home-page: https://github.com/mhadi813/pyplatform
Author: Muhammad Hadi
Author-email: mhadi813@gmail.com
License: BSD
Keywords: google bigquery cloud functions storage jupyterlab python SQL
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Office/Business
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
Requires-Dist: pandas (>=1.0.0)
Requires-Dist: google-cloud-bigquery (>=1.24.0)
Requires-Dist: google-cloud-storage (>=1.24.0)
Requires-Dist: gcsfs (==0.6.0)
Requires-Dist: azure-storage-blob (==1.5.0)
Requires-Dist: azure-functions (==1.2.0)
Requires-Dist: pyarrow (==0.16.0)
Requires-Dist: requests (==2.23.0)
Requires-Dist: requests-ntlm (==1.1.0)
Requires-Dist: ntlm-auth (==1.4.0)
Requires-Dist: xlrd (==1.2.0)
Requires-Dist: XlsxWriter (==1.2.7)
Requires-Dist: openpyxl (==2.6.2)
Requires-Dist: pyodbc (==4.0.27)
Requires-Dist: sqlalchemy
Requires-Dist: tableauhyperapi (>=0.0.10622)
Requires-Dist: tableauserverclient (>=0.10)
Requires-Dist: pantab (>=1.1.0)

### Pyplatform is a data analytics platform architecture built around Google BigQuery in a hybrid cloud environment.

The platform:
-  provides a fast, scalable, and reliable database solution
-  abstracts away the infrastructure by building data pipelines with serverless compute solutions in Python runtime environments
-  simplifies the development environment by using JupyterLab as the main tool

<img align="left" style="width: 1200px;" src="samples/pyplatform image/pyplatform.png">

## Installation


```
pip install pyplatform
```

## Setting up the development environment
```
git clone https://github.com/mhadi813/pyplatform
cd pyplatform
conda env create -f pyplatform_dev.yml
```


### [Environment variables](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#saving-environment-variables)
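
The variables below name the default service account key file, BigQuery dataset, and storage bucket. They can be set from Python as shown here, or saved in the conda environment as described in the linked guide.
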
```python
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/default_service_account.json'
os.environ['DATASET'] = 'default_bigquery_dataset_name'
os.environ['STORAGE_BUCKET'] = 'default_storage_bucket_id'
```
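
Because a missing variable tends to surface only later as an opaque authentication or lookup error, it can help to check all three up front. The following is a minimal sketch, not part of pyplatform: the helper name `load_settings` and the `REQUIRED_VARS` tuple are hypothetical, and only the three variable names come from the snippet above.

```python
import os

# Variable names taken from the README; everything else here is illustrative.
REQUIRED_VARS = ("GOOGLE_APPLICATION_CREDENTIALS", "DATASET", "STORAGE_BUCKET")


def load_settings(environ=None):
    """Return the three settings as a dict, or raise if any is missing or empty.

    `environ` defaults to os.environ; passing a plain dict makes testing easier.
    """
    env = os.environ if environ is None else environ
    # Treat unset and empty-string values the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` at the top of a notebook or function entry point would fail fast with the names of the unset variables, rather than at the first BigQuery or Cloud Storage call.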

## Usage
## Common data pipeline architectures

### - HTTP sources

<img align="left" style="width: 740px;" src="samples/pyplatform image/http_sources.png">

### - On-prem servers

<img align="left" style="width: 740px;" src="samples/pyplatform image/on-prem_sources.png">

### - BigQuery integration with Azure Logic Apps

<img align="left" style="width: 740px;" src="samples/pyplatform image/logic_apps_integration.png">

### - Event-driven ETL process

<img align="left" style="width: 740px;" src="samples/pyplatform image/event_driven.png">

### - Streaming pipelines

<img align="left" style="width: 740px;" src="samples/pyplatform image/streaming.png">

## Exploring modules


```python
import pyplatform as pyp
pyp.show_me()
```
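
For a package-agnostic view of what is importable, the standard library can list a package's direct submodules. This is a generic sketch and makes no claim about what `pyp.show_me()` prints; the helper name `list_submodules` is hypothetical, and the example runs against the standard-library `json` package so that it works without pyplatform installed.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of the direct submodules of an importable package."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; plain modules have no submodules to list.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


# Demonstrated on a standard-library package.
print(list_submodules("json"))
```

With pyplatform installed, `list_submodules("pyplatform")` should list its top-level modules in the same way.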


