Metadata-Version: 1.1
Name: spectrify
Version: 3.0.0
Summary: Tools for working with Redshift Spectrum.
Home-page: https://github.com/hellonarrativ/spectrify
Author: The Narrativ Company, Inc.
Author-email: engineering@narrativ.com
License: MIT license
Description: =========
        Spectrify
        =========
        
        
        .. image:: https://img.shields.io/pypi/v/spectrify.svg
            :target: https://pypi.python.org/pypi/spectrify
        
        .. image:: https://img.shields.io/travis/hellonarrativ/spectrify.svg
            :target: https://travis-ci.org/hellonarrativ/spectrify
        
        .. image:: https://readthedocs.org/projects/spectrify/badge/?version=latest
            :target: https://spectrify.readthedocs.io/en/latest/?badge=latest
            :alt: Documentation Status
        
        
        A simple yet powerful tool to move your data from Redshift to Redshift Spectrum.
        
        
        * Free software: MIT license
        * Documentation: https://spectrify.readthedocs.io.
        
        
        Features
        --------
        
        One-liners to:
        
        * Export a Redshift table to S3 (CSV)
        * Convert exported CSVs to Parquet files in parallel
        * Create the Spectrum table on your Redshift cluster
        * **Perform all 3 steps in sequence**, essentially "copying" a Redshift table to Spectrum in one command.
        
        S3 credentials are resolved by boto3 through its standard configuration (for example, environment variables or a shared credentials file). See http://boto3.readthedocs.io/en/latest/guide/configuration.html
        
        Redshift credentials are supplied via environment variables, command-line parameters, or interactive prompt.
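        
        As a rough illustration of how a setting can fall back across those three sources, here is a minimal Python 3 sketch. The precedence shown (command-line value, then environment variable, then prompt) and the variable name ``EXAMPLE_REDSHIFT_USER`` are assumptions made for this example only; they are not taken from Spectrify's implementation.
        
        .. code-block:: python
        
            import os
            from getpass import getpass
        
        
            def resolve_setting(cli_value, env_name, prompt_text, environ=None, ask=getpass):
                """Return the first value found: CLI argument, environment, then prompt."""
                environ = os.environ if environ is None else environ
                if cli_value:                 # 1. explicit command-line parameter
                    return cli_value
                if environ.get(env_name):     # 2. environment variable
                    return environ[env_name]
                return ask(prompt_text)       # 3. interactive prompt as a last resort
        
        
            # Hypothetical variable name, used only for this example.
            user = resolve_setting(None, 'EXAMPLE_REDSHIFT_USER', 'Redshift user: ',
                                   environ={'EXAMPLE_REDSHIFT_USER': 'myuser'})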
        
        Install
        --------
        
        .. code-block:: bash
        
            $ pip install spectrify
        
        
        Command-line Usage
        ------------------
        
        Export Redshift table ``my_table`` to a folder of CSV files on S3:
        
        .. code-block:: bash
        
            $ spectrify --host=example-url.redshift.aws.com --user=myuser --db=mydb export my_table \
                's3://example-bucket/my_table'
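        
        The S3 destination in these commands is a URL of the form ``s3://bucket/prefix``. The following Python 3 sketch shows how such a path splits into a bucket name and a key prefix using only the standard library. The helper ``split_s3_path`` is hypothetical and is not part of Spectrify, which may parse paths differently.
        
        .. code-block:: python
        
            from urllib.parse import urlparse
        
        
            def split_s3_path(s3_path):
                """Split 's3://bucket/prefix' into (bucket, prefix)."""
                parsed = urlparse(s3_path)
                if parsed.scheme != 's3' or not parsed.netloc:
                    raise ValueError('expected an s3://bucket/prefix path, got %r' % s3_path)
                # netloc holds the bucket; the path keeps a leading slash that we drop.
                return parsed.netloc, parsed.path.lstrip('/')
        
        
            bucket, prefix = split_s3_path('s3://example-bucket/my_table')
            # bucket == 'example-bucket', prefix == 'my_table'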
        
        Convert exported CSVs to Parquet:
        
        .. code-block:: bash
        
            $ spectrify --host=example-url.redshift.aws.com --user=myuser --db=mydb convert my_table \
                's3://example-bucket/my_table'
        
        Create Spectrum table from S3 folder:
        
        .. code-block:: bash
        
            $ spectrify --host=example-url.redshift.aws.com --user=myuser --db=mydb create_table \
                's3://example-bucket/my_table' my_table my_spectrum_table
        
        Transform Redshift table by performing all 3 steps in sequence:
        
        .. code-block:: bash
        
            $ spectrify --host=example-url.redshift.aws.com --user=myuser --db=mydb transform my_table \
                's3://example-bucket/my_table'
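        
        To drive the same commands from a script, one option is to assemble the argument list in Python and hand it to ``subprocess``. This is only a sketch: the helpers ``build_spectrify_command`` and ``run_spectrify`` are hypothetical, and they use just the options and subcommands shown above. If credentials are not supplied another way, the command may still prompt interactively.
        
        .. code-block:: python
        
            import subprocess
        
        
            def build_spectrify_command(host, user, db, subcommand, *args):
                """Build the argument list for one spectrify invocation."""
                return ['spectrify',
                        '--host={}'.format(host),
                        '--user={}'.format(user),
                        '--db={}'.format(db),
                        subcommand] + list(args)
        
        
            def run_spectrify(host, user, db, subcommand, *args):
                """Run the command and raise CalledProcessError on a non-zero exit."""
                subprocess.check_call(build_spectrify_command(host, user, db, subcommand, *args))
        
        
            cmd = build_spectrify_command('example-url.redshift.aws.com', 'myuser', 'mydb',
                                          'transform', 'my_table', 's3://example-bucket/my_table')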
        
        
        Python Usage
        ------------
        
        Export to S3:
        
        .. code-block:: python
        
        
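            from spectrify.export import RedshiftDataExporter
        
            # sa_engine (a SQLAlchemy engine for the Redshift cluster) and s3_config
            # (the S3 location settings) are assumed to be defined already.
            RedshiftDataExporter(sa_engine, s3_config).export_to_csv('my_table')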
        
        Convert exported CSVs to Parquet:
        
        .. code-block:: python
        
            from spectrify.convert import ConcurrentManifestConverter
            from spectrify.utils.schema import SqlAlchemySchemaReader
            sa_table = SqlAlchemySchemaReader(sa_engine).get_table_schema('my_table')
            ConcurrentManifestConverter(sa_table, s3_config).convert_manifest()
        
        Create Spectrum table from S3 parquet folder:
        
        .. code-block:: python
        
            from spectrify.create import SpectrumTableCreator
            from spectrify.utils.schema import SqlAlchemySchemaReader
            sa_table = SqlAlchemySchemaReader(sa_engine).get_table_schema('my_table')
            SpectrumTableCreator(sa_engine, dest_schema, dest_table_name, sa_table, s3_config).create()
        
        Transform Redshift table by performing all 3 steps in sequence:
        
        .. code-block:: python
        
            from spectrify.transform import TableTransformer
            transformer = TableTransformer(sa_engine, 'my_table', s3_config, dest_schema, dest_table_name)
            transformer.transform()
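        
        For comparison, the three individual steps shown earlier can be chained by hand. The sketch below only combines the calls from the snippets above into one hypothetical helper, ``copy_table_to_spectrum``. It is not a description of what ``TableTransformer`` does internally, and ``sa_engine`` and ``s3_config`` are assumed to be created by the caller.
        
        .. code-block:: python
        
            def copy_table_to_spectrum(sa_engine, s3_config, source_table,
                                       dest_schema, dest_table_name):
                """Export, convert, and create the Spectrum table, one step at a time."""
                # Imports are deferred so this module loads even without spectrify installed.
                from spectrify.export import RedshiftDataExporter
                from spectrify.convert import ConcurrentManifestConverter
                from spectrify.create import SpectrumTableCreator
                from spectrify.utils.schema import SqlAlchemySchemaReader
        
                # Step 1: export the Redshift table to CSV files on S3.
                RedshiftDataExporter(sa_engine, s3_config).export_to_csv(source_table)
        
                # Step 2: read the source table schema and convert the CSVs to Parquet.
                sa_table = SqlAlchemySchemaReader(sa_engine).get_table_schema(source_table)
                ConcurrentManifestConverter(sa_table, s3_config).convert_manifest()
        
                # Step 3: create the Spectrum table that points at the Parquet files.
                SpectrumTableCreator(sa_engine, dest_schema, dest_table_name,
                                     sa_table, s3_config).create()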
        
        Contribute
        ----------
        Contributions are always welcome! Read our contributing guide: http://spectrify.readthedocs.io/en/latest/contributing.html
        
        License
        -------
        MIT License. Copyright (c) 2017, The Narrativ Company, Inc.
        
        
        =======
        History
        =======
        
        3.0.0 (2019-11-26)
        ------------------
        Backwards-incompatible changes:
        
        * Add REGION parameter to UNLOAD operations
        * Bugfix: Correctly construct path for S3 bucket in "create-table" command
        
        Other changes:
        
        * Support for obtaining credentials with AWS session token
        * Upgrade to pytest v4.6.6
        * Fix Flake8 errors
        
        2.0.0 (2019-03-09)
        ------------------
        
        * Default to 256MB files
        * Flag for unicode support on Python 2.7 (performance implications)
        * Drop support for Python 3.4
        * Support for additional CSV format parameters
        * Support for REAL data type
        
        
        1.0.1 (2018-07-12)
        ------------------
        
        * Loosen version requirement for PyArrow
        * Add example script
        * Update documentation
        
        
        1.0.0 (2018-04-20)
        ------------------
        
        * Move functionality into classes to make customizing behavior easier
        * Add support for DATE columns
        * Add support for DECIMAL/NUMERIC columns
        * Upgrade to pyarrow v0.9.0
        
        
        0.4.1 (2018-03-25)
        ------------------
        
        * Fix exception when source table is not in schema public
        
        
        0.4.0 (2018-02-25)
        ------------------
        
        * Upgrade to pyarrow v0.8.0
        * Verify Redshift column types are supported before attempting conversion
        * Bugfix: Properly clean up multiprocessing.pool resource
        
        
        0.3.0 (2017-10-30)
        ------------------
        
        * Support 16- and 32-bit integers
        * Packaging updates
        
        
        0.2.1 (2017-09-27)
        ------------------
        
        * Fix Readme
        
        
        0.2.0 (2017-09-27)
        ------------------
        
        * First release on PyPI.
        
        
        0.1.0 (2017-09-13)
        ------------------
        
        * Didn't even make it to PyPI.
        
Keywords: spectrify
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
