Metadata-Version: 2.1
Name: dataverk
Version: 0.0.15
Summary: NAV Dataverk
Home-page: https://github.com/navikt
Author: NAV IKT
Author-email: paul.bencze@nav.no
License: MIT
Project-URL: Bug Tracker, https://github.com/navikt
Project-URL: Documentation, https://github.com/navikt
Project-URL: Source Code, https://github.com/navikt
Keywords: datapackage datasett etl open-data
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: cryptography (==2.3)
Requires-Dist: requests (==2.21.0)
Requires-Dist: prometheus-client (==0.4.0)
Requires-Dist: SQLAlchemy (==1.2.10)
Requires-Dist: pyjstat (==1.0.1)
Requires-Dist: setuptools (>=39.0.1)
Requires-Dist: pandas (==0.23.3)
Requires-Dist: importlib-resources (==1.0.2)
Requires-Dist: boto3 (==1.9.11)
Requires-Dist: numpy (==1.15.2)
Requires-Dist: fire (==0.1.3)
Requires-Dist: GitPython (==2.1.11)
Requires-Dist: cx-Oracle (==7.0.0)
Requires-Dist: protobuf (==3.6.1)
Requires-Dist: pyarrow (>=0.10.0)
Requires-Dist: pycryptodomex (==3.7.3)
Requires-Dist: python-jenkins (==1.3.0)
Requires-Dist: pyyaml (==4.2b1)
Requires-Dist: elasticsearch (==6.3.0)
Requires-Dist: google-api-core (==0.1.4)
Requires-Dist: google-auth (==1.5.0)
Requires-Dist: google-cloud-core (==0.28.1)
Requires-Dist: google-cloud-storage (==1.10.0)
Requires-Dist: google-resumable-media (==0.3.1)
Requires-Dist: googleapis-common-protos (==1.5.3)

[![CircleCI](https://circleci.com/gh/navikt/dataverk.svg?style=svg&circle-token=3e5fd8de41d8dd24ce2546d0e5800ce06926add0)](https://circleci.com/gh/navikt/dataverk)

# Dataverk 

### Get started

#### From scratch - new dataverk project
 1. Create a repository on GitHub
 2. Clone the GitHub repository to your local machine
 3. ```pip3 install dataverk```
 4. ```dataverk create_settings```
 5. Fill in the generated settings.json file with your data

#### Create a new data package in an existing repository
 1. If you do not have a .env file, run: ```dataverk create_env_file```
 2. ```dataverk create```
 3. ```jupyter notebook```
 4. Open datapackage-name/scripts/etl.ipynb
 5. Implement the data processing
 6. Push the project changes to GitHub (```git push origin master```)
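
As an illustration of the data processing step above, the following is a minimal sketch of an extract-transform-load flow written with plain pandas. It does not use the dataverk API; the function names, the column names and the sample data are hypothetical and exist only for this example.

```python
import io

import pandas as pd

# Hypothetical raw data. In a real notebook this would be read from one of
# the supported sources instead of an inline string.
RAW_CSV = """region,year,count
Oslo,2018,10
Oslo,2019,12
Bergen,2018,7
Bergen,2019,
"""


def extract(raw: str) -> pd.DataFrame:
    """Read the raw CSV text into a DataFrame."""
    return pd.read_csv(io.StringIO(raw))


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a count, then sum the counts per region."""
    cleaned = df.dropna(subset=["count"]).astype({"count": "int64"})
    return cleaned.groupby("region", as_index=False)["count"].sum()


def load(df: pd.DataFrame) -> str:
    """Serialise the result as CSV text (a stand-in for writing to a sink)."""
    return df.to_csv(index=False)


result = transform(extract(RAW_CSV))
csv_text = load(result)
print(csv_text)
```

Keeping the three steps as separate functions makes each one easy to run and inspect in its own notebook cell.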




## Methods for accessing datasets

### Connections (source & sink with encryption)
* File
* JsonStat
* Oracle
* Google Cloud Storage
* ...

### Formats
* Pandas
* Arrow
* CSV
* Excel
* JsonStat
* Vega & Vega Lite
* Semiotic
* ...


### Dashboards

## Related projects

url | description
----| -----------
[frictionlessdata.io](https://frictionlessdata.io/) | Frictionless data
[git lfs](https://git-lfs.github.com/) | Git Large File Storage
[dvc.org](https://dvc.org) | Data Science Version Control 
[quilt (github)](https://github.com/quiltdata) | Quilt - Version and deploy data
[Python package in S3](https://github.com/novemberfiveco/s3pypi) | CLI tool for creating a Python package repository in S3
[pypiserver based on bottle](https://github.com/pypiserver/pypiserver) | Minimal PyPI server


