Metadata-Version: 2.1
Name: target-snowflake
Version: 0.0.2
Summary: Singer.io target for loading data into Snowflake
Home-page: https://github.com/datamill-co/target-snowflake
Author: datamill
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3 :: Only
Description-Content-Type: text/markdown
Requires-Dist: singer-python (==5.6.1)
Requires-Dist: singer-target-postgres (==0.1.11)
Requires-Dist: snowflake-connector-python (==1.9.1)
Requires-Dist: target-redshift (==0.0.10)
Provides-Extra: tests
Requires-Dist: chance (==0.110) ; extra == 'tests'
Requires-Dist: Faker (==1.0.8) ; extra == 'tests'
Requires-Dist: pytest (==4.5.0) ; extra == 'tests'

# Target Snowflake

[![CircleCI](https://circleci.com/gh/datamill-co/target-snowflake.svg?style=svg)](https://circleci.com/gh/datamill-co/target-snowflake)

[![PyPI version](https://badge.fury.io/py/target-snowflake.svg)](https://pypi.org/project/target-snowflake/)

[![](https://img.shields.io/librariesio/github/datamill-co/target-snowflake.svg)](https://libraries.io/github/datamill-co/target-snowflake)

A [Singer](https://singer.io/) Snowflake target, for use with Singer streams generated by Singer taps.

## Snowflake Connector

[Docs](https://docs.snowflake.net/manuals/user-guide/python-connector.html)

## Install

```sh
pip install target-snowflake
```

## Usage

1. Follow the
   [Singer.io Best Practices](https://github.com/singer-io/getting-started/blob/master/docs/RUNNING_AND_DEVELOPING.md#running-a-singer-tap-with-a-singer-target)
   for setting up separate `tap` and `target` virtualenvs to avoid version
   conflicts.

1. Create a [config file](#configjson) at
   `~/singer.io/target_snowflake_config.json` with Snowflake connection
   information and target Snowflake schema and warehouse.

   ```json
   {
     "snowflake_account": "https://XXXXX.snowflakecomputing.com",
     "snowflake_username": "myuser",
     "snowflake_password": "1234",
     "snowflake_database": "my_analytics",
     "snowflake_schema": "mytapname",
     "snowflake_warehouse": "dw"
   }
   ```

1. Run `target-snowflake` against a [Singer](https://singer.io) tap.

   ```bash
   ~/.virtualenvs/tap-something/bin/tap-something \
     | ~/.virtualenvs/target-snowflake/bin/target-snowflake \
       --config ~/singer.io/target_snowflake_config.json >> state.json
   ```

If you are running Windows, the following is equivalent:

```
venvs\tap-exchangeratesapi\Scripts\tap-exchangeratesapi.exe | ^
venvs\target-snowflake\Scripts\target-snowflake.exe ^
--config target_snowflake_config.json
```
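
The config file from the steps above can also be generated programmatically, for example from environment variables so that the password stays out of version control. The following is a minimal sketch only: the environment variable names (`SNOWFLAKE_ACCOUNT` and so on) and the helpers `build_config` and `write_config` are assumptions made for illustration and are not part of `target-snowflake`. Only the JSON keys come from this README.

```python
import json
import os


def build_config(env):
    """Map hypothetical environment variables onto the documented config keys."""
    return {
        "snowflake_account": env["SNOWFLAKE_ACCOUNT"],
        "snowflake_username": env["SNOWFLAKE_USERNAME"],
        # The password is nullable in the config table, so a missing value becomes None.
        "snowflake_password": env.get("SNOWFLAKE_PASSWORD"),
        "snowflake_database": env["SNOWFLAKE_DATABASE"],
        # "PUBLIC" is the documented default schema.
        "snowflake_schema": env.get("SNOWFLAKE_SCHEMA", "PUBLIC"),
        "snowflake_warehouse": env["SNOWFLAKE_WAREHOUSE"],
    }


def write_config(path, env=os.environ):
    """Write the config as JSON to `path` and return the dict that was written."""
    config = build_config(env)
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)
    return config
```

Calling `write_config("target_snowflake_config.json")` would produce a file that can be passed to `--config`.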

### Config.json

The config file supports the following fields.

| Field                       | Type                  | Default    | Details                                                                                                                                                                                                                                                                                                                                   |
| --------------------------- | --------------------- | ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `snowflake_account`         | `["string"]`          | `N/A`      | `ACCOUNT` might require the `region` and `cloud` platform where your account is located, in the form of: `<your_account_name>.<region_id>.<cloud>` (e.g. `xy12345.east-us-2.azure`) [Refer to Snowflake's documentation about Account](https://docs.snowflake.net/manuals/user-guide/connecting.html#your-snowflake-account-name-and-url) |
| `snowflake_username`        | `["string"]`          | `N/A`      |                                                                                                                                                                                                                                                                                                                                           |
| `snowflake_password`        | `["string", "null"]`  | `null`     |                                                                                                                                                                                                                                                                                                                                           |
| `snowflake_database`        | `["string"]`          | `N/A`      |                                                                                                                                                                                                                                                                                                                                           |
| `snowflake_schema`          | `["string", "null"]`  | `"PUBLIC"` |                                                                                                                                                                                                                                                                                                                                           |
| `snowflake_warehouse`       | `["string"]`          | `N/A`      |                                                                                                                                                                                                                                                                                                                                           |
| `invalid_records_detect`    | `["boolean", "null"]` | `true`     | Include `false` in your config to disable crashing on invalid records                                                                                                                                                                                                                                                                     |
| `invalid_records_threshold` | `["integer", "null"]` | `0`        | Include a positive value `n` in your config to allow at most `n` invalid records per stream before giving up.                                                                                                                                                                                                                             |
| `disable_collection`        | `["string", "null"]`  | `false`    | Include `true` in your config to disable [Singer Usage Logging](#usage-logging).                                                                                                                                                                                                                                                          |
| `logging_level`             | `["string", "null"]`  | `"INFO"`   | The level for logging. Set to `DEBUG` to get things like queries executed, timing of those queries, etc. See [Python's Logger Levels](https://docs.python.org/3/library/logging.html#levels) for information about valid values.                                                                                                          |
| `persist_empty_tables`      | `["boolean", "null"]` | `False`    | Whether the Target should create tables which have no records present in Remote.                                                                                                                                                                                                                                                          |
| `state_support`             | `["boolean", "null"]` | `True`     | Whether the Target should emit `STATE` messages to stdout for further consumption. In this mode, which is on by default, STATE messages are buffered in memory until all the records that occurred before them are flushed according to the batch flushing schedule the target is configured with.                                        |
| `target_s3`                 | `["object", "null"]`  | `N/A`      | When included, use `S3` to stage files. See `S3` below                                                                                                                                                                                                                                                                                    |
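
To illustrate how the defaults in the table combine with a user-supplied config, here is a small sketch. It is not the target's own validation logic. It assumes that non-nullable fields with an `N/A` default are required (the table does not state this explicitly), and `REQUIRED_FIELDS`, `DEFAULTS`, and `resolve_config` are hypothetical names.

```python
# Assumption: non-nullable fields whose default is N/A must be supplied by the user.
REQUIRED_FIELDS = (
    "snowflake_account",
    "snowflake_username",
    "snowflake_database",
    "snowflake_warehouse",
)

# Defaults copied from the table above; `target_s3` is optional and has no default.
DEFAULTS = {
    "snowflake_password": None,
    "snowflake_schema": "PUBLIC",
    "invalid_records_detect": True,
    "invalid_records_threshold": 0,
    "disable_collection": False,
    "logging_level": "INFO",
    "persist_empty_tables": False,
    "state_support": True,
}


def resolve_config(user_config):
    """Return the user config with documented defaults filled in."""
    missing = [field for field in REQUIRED_FIELDS if field not in user_config]
    if missing:
        raise ValueError("missing required config fields: " + ", ".join(missing))
    resolved = dict(DEFAULTS)
    resolved.update(user_config)  # user-supplied values win over defaults
    return resolved
```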

#### S3 Config.json

| Field                   | Type                 | Default | Details                                                                      |
| ----------------------- | -------------------- | ------- | ---------------------------------------------------------------------------- |
| `aws_access_key_id`     | `["string"]`         | `N/A`   |                                                                              |
| `aws_secret_access_key` | `["string"]`         | `N/A`   |                                                                              |
| `bucket`                | `["string"]`         | `N/A`   | Bucket where staging files should be uploaded to.                            |
| `key_prefix`            | `["string", "null"]` | `""`    | Prefix for staging file uploads to allow for better delineation of tmp files |
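
As a rough sketch of how the S3 fields fit into the main config, the helper below nests them under the `target_s3` key. The nesting is inferred from `target_s3` being an `object` field in the main table, and `with_s3_staging` is a hypothetical helper, not something `target-snowflake` provides.

```python
def with_s3_staging(config, bucket, access_key_id, secret_access_key, key_prefix=""):
    """Return a copy of `config` with a `target_s3` object added (assumed layout)."""
    staged = dict(config)
    staged["target_s3"] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "bucket": bucket,
        # The empty string is the documented default prefix.
        "key_prefix": key_prefix,
    }
    return staged
```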

## Limitations

- [Snowflake SQL Identifiers](https://docs.snowflake.net/manuals/sql-reference/identifiers-syntax.html):
  - Although Snowflake allows quoted identifiers to contain non-alphanumeric characters, `target-snowflake` limits
    identifiers to uppercase alphanumerics and underscores
  - This keeps data easy to query and use in Snowflake, so that users do not _have_ to use
    sometimes cumbersome quotes to query their data
- Requires a [JSON Schema](https://json-schema.org/) for every stream.
- Only string, string with date-time format, integer, number, boolean,
  object, and array types with or without null are supported. Arrays can
  have any of the other types listed, including objects as types within
  items.
  - Example of JSON Schema types that work
    - `['number']`
    - `['string']`
    - `['string', 'null']`
    - `['string', 'integer']`
    - `['integer', 'number']`
  - Example of JSON Schema types that **DO NOT** work
    - `['any']`
    - `['null']`
- JSON Schema combinations such as `anyOf` and `allOf` are not supported.
- JSON Schema `$ref` is partially supported:
  - **_NOTE:_** The following limitations are known to **NOT** fail gracefully
  - Presently you cannot have any circular or recursive `$ref`s
  - `$ref`s must be present within the schema:
    - URIs do not work
    - if the `$ref` is broken, the behaviour is considered unexpected
- Any values which are the `string` `\\N` will be streamed to Snowflake as the literal `null`
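
Two of the limitations above can be checked before running a tap. The sketch below is illustrative only and does not reproduce the target's internal logic: `is_supported_type` tests a JSON Schema `type` value against the supported list, and `to_identifier` shows one possible way to reduce a name to uppercase alphanumerics and underscores. Both function names are hypothetical, and the actual name mapping used by `target-snowflake` may differ.

```python
import re

# Types listed as supported above; "null" is only valid alongside one of these.
SUPPORTED_TYPES = {"string", "integer", "number", "boolean", "object", "array"}


def is_supported_type(json_schema_type):
    """Return True if a JSON Schema `type` (string or list) fits the limitations."""
    if isinstance(json_schema_type, str):
        types = [json_schema_type]
    else:
        types = list(json_schema_type)
    non_null = [t for t in types if t != "null"]
    # At least one non-null type is needed, and every one must be supported.
    return bool(non_null) and all(t in SUPPORTED_TYPES for t in non_null)


def to_identifier(name):
    """Uppercase `name` and replace every character outside A-Z, 0-9, _ with '_'."""
    return re.sub(r"[^A-Z0-9_]", "_", name.upper())
```

For instance, `is_supported_type(["string", "null"])` is true, while `is_supported_type(["null"])` and `is_supported_type(["any"])` are false.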

## Sponsorship

Target Snowflake is sponsored by Data Mill (Data Mill Services, LLC) [datamill.co](https://datamill.co/).

Data Mill helps organizations utilize modern data infrastructure and data science to power analytics, products, and services.

---

Copyright Data Mill Services, LLC 2018


