Metadata-Version: 2.1
Name: spark-optimizer
Version: 0.1.7
Summary: Optimize AWS EMR spark settings (spark-config-cheatsheet)
Home-page: https://github.com/delijati/spark-optimizer
Author: Josip Delic
Author-email: delijati@gmx.net
License: MIT
Description: 
        # Spark-optimizer
        
        [![Build Status](https://api.travis-ci.org/delijati/spark-optimizer.svg?branch=master)](https://travis-ci.org/delijati/spark-optimizer)
        
        Optimize Spark settings for cluster runs (i.e. on YARN).
        
        Original source: http://c2fo.io/c2fo/spark/aws/emr/2016/07/06/apache-spark-config-cheatsheet/
        
        ## Usage
        
        Install:
        
            $ virtualenv env
            $ env/bin/pip install spark-optimizer
        
        Dev install:
        
            $ virtualenv env
            $ env/bin/pip install -e .
        
        
        Generate settings for `c4.4xlarge` with `4` nodes:
        
            $ env/bin/spark-optimizer c4.4xlarge 4
            {'spark.default.parallelism': '108',
             'spark.driver.cores': '2',
             'spark.driver.maxResultSize': '3481m',
             'spark.driver.memory': '3481m',
             'spark.driver.memoryOverhead': '614m',
             'spark.executor.cores': '2',
             'spark.executor.instances': '27',
             'spark.executor.memory': '3481m',
             'spark.executor.memoryOverhead': '614m'}
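        
        The generated settings can be handed to `spark-submit` as `--conf key=value` flags. The following is a minimal sketch, assuming the settings are available as a plain Python dict like the one printed above; `to_spark_submit_args` is a hypothetical helper written for illustration and is not part of spark-optimizer:
        
        ```python
        from typing import Dict, List
        
        
        def to_spark_submit_args(settings: Dict[str, str]) -> List[str]:
            """Turn a settings dict into a flat list of `--conf key=value` arguments.
        
            Hypothetical helper for illustration; not part of spark-optimizer.
            """
            args = []  # type: List[str]
            # Sort the keys so the generated command line is reproducible.
            for key in sorted(settings):
                args.extend(["--conf", "{}={}".format(key, settings[key])])
            return args
        
        
        # A subset of the example output shown above for c4.4xlarge with 4 nodes.
        settings = {
            "spark.executor.cores": "2",
            "spark.executor.instances": "27",
            "spark.executor.memory": "3481m",
            "spark.executor.memoryOverhead": "614m",
        }
        
        # Print the flags so they can be appended to a spark-submit invocation.
        print(" ".join(to_spark_submit_args(settings)))
        ```
        
        Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting.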
        
        Update instance info:
        
            $ env/bin/python spark_optimizer/emr_update.py
        
        
        # CHANGES
        
        0.1.7 (2019-12-09)
        ------------------
        
        - set `long_description_content_type="text/markdown"`
        
        
        0.1.6 (2019-12-09)
        ------------------
        
        - fix docs
        
        
        0.1.5 (2019-12-09)
        ------------------
        
        - update `emr_instance.yaml`
        
        
        0.1.4 (2019-03-10)
        ------------------
        
        - add ec2 and emr cost to yaml
        
        
        0.1.3 (2019-03-08)
        ------------------
        
        - add emr cost to yaml
        - export load yaml file
        - make `memory_overhead_coefficient` editable
        
        
        0.1.2 (2019-02-20)
        ------------------
        
        - unpin the versions
        - rename the CLI command to use `-` instead of `_`
        
        
        0.1.1 (2018-09-12)
        ------------------
        
        - fix email
        
        
        0.1.0 (2018-09-12)
        ------------------
        
        - initial release
        
Keywords: spark emr aws cluster yarn
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Description-Content-Type: text/markdown
