Metadata-Version: 2.1
Name: rhg-compute-tools
Version: 0.0.0
Summary: Tools for using compute.rhg.com and compute.impactlab.org
Home-page: https://github.com/RhodiumGroup/rhg_compute_tools
Author: Michael Delgado
Author-email: mdelgado@rhg.com
License: MIT license
Keywords: rhg_compute_tools
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/x-rst
Requires-Dist: google-cloud-storage
Requires-Dist: click
Requires-Dist: dask-gateway
Requires-Dist: pandas
Requires-Dist: xarray
Requires-Dist: matplotlib
Requires-Dist: numpy
Requires-Dist: bottleneck

=================
RHG Compute Tools
=================


.. image:: https://img.shields.io/pypi/v/rhg_compute_tools.svg
        :target: https://pypi.python.org/pypi/rhg_compute_tools

.. image:: https://github.com/RhodiumGroup/rhg_compute_tools/workflows/Python%20package/badge.svg

.. image:: https://readthedocs.org/projects/rhg-compute-tools/badge/?version=latest
        :target: https://rhg-compute-tools.readthedocs.io/en/latest/?badge=latest
        :alt: Documentation Status

Tools for using compute.rhg.com and compute.impactlab.org


* Free software: MIT license
* Documentation: https://rhg-compute-tools.readthedocs.io.

Installation
------------

Install with pip:

.. code-block:: bash

    pip install rhg_compute_tools



Features
--------

Kubernetes tools
~~~~~~~~~~~~~~~~

* Easily spin up a preconfigured cluster with ``get_cluster()``, or use one of the preset flavors: ``get_micro_cluster()``, ``get_standard_cluster()``, ``get_big_cluster()``, or ``get_giant_cluster()``.

.. code-block:: python

    >>> import rhg_compute_tools.kubernetes as rhgk
    >>> cluster, client = rhgk.get_cluster()
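As a minimal sketch of a next step, assuming the returned ``client`` behaves like a ``dask.distributed.Client``, work can be fanned out with ``client.map`` and collected with ``client.gather``. The ``square`` and ``run_on_cluster`` names are hypothetical and are not part of ``rhg_compute_tools``:

```python
def square(x):
    """Toy task to run on the workers."""
    return x * x


def run_on_cluster(client, values):
    """Map ``square`` over ``values`` and return the results as a list.

    ``client`` is assumed to expose ``map`` and ``gather`` in the same
    way as ``dask.distributed.Client``.
    """
    futures = client.map(square, values)  # one future per input value
    return client.gather(futures)  # block until every result is back
```

With the objects created above, this would be called as ``run_on_cluster(client, range(10))``.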

Google Cloud Storage utilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Utilities for managing Google Cloud Storage directories in parallel, from the command line or via a Python API

.. code-block:: python

   >>> import rhg_compute_tools.gcs as gcs
   >>> gcs.sync_gcs('my_data_dir', 'gs://my-bucket/my_data_dir')



History
=======

.. current developments

v0.2.3
------
* Make the gsutil API consistent, so that we have ``cp``, ``sync`` and ``rm``, each of which
  accepts the same args and kwargs
* Swap ``bumpversion`` for ``setuptools_scm`` to handle versioning 
* Cast coordinates to dict before gathering in ``rhg_compute_tools.xarray.dataarrays_from_delayed`` and ``rhg_compute_tools.xarray.datasets_from_delayed``. This avoids a mysterious memory explosion on the local machine. Also add ``name`` in the metadata used by those functions so that the name of each dataarray or Variable is preserved. 
* Use ``dask-gateway`` when available when creating a cluster in ``rhg_compute_tools.kubernetes``. Add some tests using a local gateway cluster. TODO: More tests.

v0.2.2
------
* ?

v0.2.1
------
* Add remote scheduler deployment (part of dask_kubernetes 0.10)
* Remove extraneous ``GCSFUSE_TOKENS`` env var no longer used in new worker images
* Set library thread limits based on how many CPUs are available for a single dask thread
* Change formatting of the extra ``env_items`` passed to ``get_cluster`` to be a list rather than a list of dict-like name/value pairs

v0.2.0
------

* Add CLI tools. See ``rctools gcs repdirstruc --help`` to start
* Add new function ``rhg_compute_tools.gcs.replicate_directory_structure_on_gcs`` to copy directory trees into GCS. Users can authenticate with ``cred_file`` or with default Google credentials
* Fixes to docstrings and metadata  
* Add new function ``rhg_compute_tools.gcs.rm`` to remove files/directories on GCS using the ``google.cloud.storage`` API
* Store one additional environment variable when passing ``cred_path`` to ``rhg_compute_tools.kubernetes.get_cluster`` so that the ``google.cloud.storage`` API will be authenticated in addition to ``gsutil``

v0.1.8
------

* Deployment fixes

v0.1.7
------

* Design tools: use RHG & CIL colors & styles
* Plotting helpers: generate cmaps with consistent colors & norms, and apply a colorbar to geopandas plots with nonlinear norms
* Autoscaling fix for kubecluster: switch to dask_kubernetes.KubeCluster to allow use of recent bug fixes


v0.1.6
------

* Add ``rhg_compute_tools.gcs.cp_gcs`` and ``rhg_compute_tools.gcs.sync_gcs`` utilities

v0.1.5
------

* need to figure out how to use this rever thing

v0.1.4
------

* Bug fix again in ``rhg_compute_tools.kubernetes.get_worker``


v0.1.3
------

* Bug fix in ``rhg_compute_tools.kubernetes.get_worker``


v0.1.2
------

* Add xarray from delayed methods in ``rhg_compute_tools.xarray`` 
* ``rhg_compute_tools.gcs.cp_to_gcs`` now calls ``gsutil`` in a subprocess instead of ``google.storage`` operations. This dramatically improves performance when transferring large numbers of small files 
* Additional cluster creation helpers 

v0.1.1
------

* New google compute helpers (see ``rhg_compute_tools.gcs.cp_to_gcs``, ``rhg_compute_tools.gcs.get_bucket``)
* New cluster creation helper (see ``rhg_compute_tools.kubernetes.get_worker``)
* Dask ``client.map`` helpers (see the ``rhg_compute_tools.utils`` submodule)

v0.1.0
------

* First release on PyPI.


