Metadata-Version: 2.1
Name: numpipe
Version: 0.3.3
Summary: Easily run Python functions in parallel and cache the results
Home-page: UNKNOWN
Author: John Parker
Author-email: japarker@uchicago.edu
License: MIT
Keywords: cache pipeline hdf5 parallel cluster
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Operating System :: POSIX
Classifier: Operating System :: MacOS
Classifier: Development Status :: 3 - Alpha
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: h5py
Requires-Dist: termcolor
Requires-Dist: toml
Requires-Dist: matplotlib
Requires-Dist: tqdm
Requires-Dist: colorama
Requires-Dist: python-telegram-bot

# NumPipe
NumPipe is a Python software package that makes long-running tasks easier and faster by executing code in embarrassingly parallel and caching the output to HDF5 files.

## Features
* Combine computation and visualization code into single scripts. Only re-run computations on request
* Use the `yield` statement to return data over time that will be periodically cached to file
* Specify dependencies between cached functions
* Progress bars similar to `tqdm` that work in parallel to show the progress of running tasks
* An optional Telegram Messenger bot that can notify the user of completion and send Matplotlib figures and animations
* Command line arguments to re-run tasks and automatically save Matplotlib figures and animations
* Option to use the Slurm Workload Manager to automatically create and submit an sbatch file (used on many compute clusters)

## Installation
NumPipe can be installed with pip
```shell
pip install numpipe
```

## Examples

### single task
```python
from numpipe import scheduler, pbar
from time import sleep
import matplotlib.pyplot as plt

job = scheduler()

@job.cache
def sim():
    for i in pbar(range(100)):
        sleep(.1)   # long running task...
        yield dict(i=i, x=i**2)

@job.plots
def vis():
    var = job.load(sim)
    plt.plot(var.i, var.x)

if __name__ == '__main__':
    job.run()
```

### parallel tasks
```python
from numpipe import scheduler, pbar
from time import sleep
import matplotlib.pyplot as plt

job = scheduler()

@job.cache
def sim(power):
    for i in pbar(range(100)):
        sleep(.1)   # long running task...
        yield dict(i=i, x=i**power)

@job.plots
def vis():
    for name, var in job.load(sim):
        plt.plot(var.i, var.x)

for i in range(3):
    job.add(sim, power=i)

if __name__ == '__main__':
    job.run()
```
### more examples
See the [examples folder](https://github.com/johnaparker/numpipe/tree/master/examples) for more usage examples

## Command line arguments
```
positional arguments:
  {display,clean,slurm}
    display             display available functions and descriptions
    clean               remove all h5files that are no longer cache functions
    slurm               run on a system with the Slurm Workload Manager

optional arguments:
  -h, --help            show this help message and exit
  -r [RERUN [RERUN ...]], --rerun [RERUN [RERUN ...]]
                        re-run specific cached functions by name
  -f, --force           force over-write any existing cached data
  -d [DELETE [DELETE ...]], --delete [DELETE [DELETE ...]]
                        delete specified cached data
  -e EXCLUDE [EXCLUDE ...], --exclude EXCLUDE [EXCLUDE ...]
                        exclude cached function from being re-run
  --at-end              only run at_end functions
  --no-at-end           don't run at_end functions
  -p [PROCESSES], --processes [PROCESSES]
                        number of processes to use in parallel execution (default: cpu_count)
  -ct CACHE_TIME, --cache_time CACHE_TIME
                        time (in seconds) until data cached data is flushed to file
  --no-deps             do not rerun functions that depend on other reran functions
  --mininterval MININTERVAL
                        time (in seconds) for progress bar mininterval argument
  --notify              send notifications without delay
  --notify-message NOTIFY_MESSAGE
                        send a custom message with other notifications
  --notify-delay NOTIFY_DELAY
                        time (in seconds) before notifications will be sent
  --theme THEME         matplotlib plot theme
  --figures FIGURES [FIGURES ...]
                        which figure numbers to display
  --save [SAVE]         save figures and animations
  --save-format SAVE_FORMAT [SAVE_FORMAT ...]
                        file format for figures
  --save-figs [SAVE_FIGS]
                        save figures
  --save-anims [SAVE_ANIMS]
                        save animations
  --debug               run in debug mode (single process)
```

## License
NumPipe is licensed under the terms of the MIT license.


