Metadata-Version: 2.1
Name: h5rdmtoolbox
Version: 0.10.0
Summary: Supporting a FAIR Research Data lifecycle using Python and HDF5.
Home-page: https://h5rdmtoolbox.readthedocs.io/en/latest/
Author: Matthias Probst
Author-email: matthias.probst@kit.edu
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 4 - Beta
Classifier: Topic :: Scientific/Engineering
Requires-Python: <3.11,>=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: appdirs>=1.4.4
Requires-Dist: numpy<1.23.0,>=1.20
Requires-Dist: h5py>3.7.0
Requires-Dist: matplotlib>=3.5.2
Requires-Dist: IPython>=7.34.0
Requires-Dist: pyyaml
Requires-Dist: xarray>=2022.3.0
Requires-Dist: pint==0.21.1
Requires-Dist: pint_xarray>=0.2.1
Requires-Dist: regex>=2020.7.9
Requires-Dist: packaging
Requires-Dist: python-forge==18.6.0
Requires-Dist: requests
Requires-Dist: zenodo_search==0.1.0
Requires-Dist: pydantic>=2.3.0
Provides-Extra: mongodb
Requires-Dist: pymongo>=4.2.0; extra == "mongodb"
Provides-Extra: io
Requires-Dist: pco_tools>=1.0.0; extra == "io"
Requires-Dist: opencv-python>=4.5.3.56; extra == "io"
Requires-Dist: pandas>=1.4.3; extra == "io"
Provides-Extra: snt
Requires-Dist: xmltodict; extra == "snt"
Requires-Dist: tabulate>=0.8.10; extra == "snt"
Requires-Dist: python-gitlab; extra == "snt"
Requires-Dist: pypandoc>=1.11; extra == "snt"
Provides-Extra: test
Requires-Dist: pytest>=7.1.2; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: pylint; extra == "test"
Requires-Dist: pco_tools>=1.0.0; extra == "test"
Requires-Dist: opencv-python>=4.5.3.56; extra == "test"
Requires-Dist: pandas>=1.4.3; extra == "test"
Requires-Dist: xmltodict; extra == "test"
Requires-Dist: tabulate>=0.8.10; extra == "test"
Requires-Dist: python-gitlab; extra == "test"
Requires-Dist: pypandoc>=1.11; extra == "test"
Requires-Dist: pymongo>=4.2.0; extra == "test"
Provides-Extra: docs
Requires-Dist: pco_tools>=1.0.0; extra == "docs"
Requires-Dist: opencv-python>=4.5.3.56; extra == "docs"
Requires-Dist: pandas>=1.4.3; extra == "docs"
Requires-Dist: xmltodict; extra == "docs"
Requires-Dist: tabulate>=0.8.10; extra == "docs"
Requires-Dist: python-gitlab; extra == "docs"
Requires-Dist: pypandoc>=1.11; extra == "docs"
Requires-Dist: pymongo>=4.2.0; extra == "docs"
Requires-Dist: pytest>=7.1.2; extra == "docs"
Requires-Dist: pytest-cov; extra == "docs"
Requires-Dist: pylint; extra == "docs"
Requires-Dist: jupyterlab; extra == "docs"
Requires-Dist: Sphinx<5,>=3; extra == "docs"
Requires-Dist: sphinx_book_theme==0.3.3; extra == "docs"
Requires-Dist: sphinx-copybutton; extra == "docs"
Requires-Dist: scikit-image; extra == "docs"
Requires-Dist: scikit-learn; extra == "docs"
Requires-Dist: sphinx-design; extra == "docs"
Requires-Dist: simplejson; extra == "docs"
Requires-Dist: myst-nb; extra == "docs"
Requires-Dist: sphinxcontrib-bibtex; extra == "docs"
Provides-Extra: complete
Requires-Dist: pytest>=7.1.2; extra == "complete"
Requires-Dist: pytest-cov; extra == "complete"
Requires-Dist: pylint; extra == "complete"
Requires-Dist: pco_tools>=1.0.0; extra == "complete"
Requires-Dist: opencv-python>=4.5.3.56; extra == "complete"
Requires-Dist: pandas>=1.4.3; extra == "complete"
Requires-Dist: xmltodict; extra == "complete"
Requires-Dist: tabulate>=0.8.10; extra == "complete"
Requires-Dist: python-gitlab; extra == "complete"
Requires-Dist: pypandoc>=1.11; extra == "complete"
Requires-Dist: pymongo>=4.2.0; extra == "complete"
Requires-Dist: jupyterlab; extra == "complete"
Requires-Dist: Sphinx<5,>=3; extra == "complete"
Requires-Dist: sphinx_book_theme==0.3.3; extra == "complete"
Requires-Dist: sphinx-copybutton; extra == "complete"
Requires-Dist: scikit-image; extra == "complete"
Requires-Dist: scikit-learn; extra == "complete"
Requires-Dist: sphinx-design; extra == "complete"
Requires-Dist: simplejson; extra == "complete"
Requires-Dist: myst-nb; extra == "complete"
Requires-Dist: sphinxcontrib-bibtex; extra == "complete"

# HDF5 Research Data Management Toolbox

![Tests](https://github.com/matthiasprobst/h5RDMtoolbox/actions/workflows/tests.yml/badge.svg)
![Coverage](https://codecov.io/gh/matthiasprobst/h5RDMtoolbox/branch/dev/graph/badge.svg)
[![Documentation Status](https://readthedocs.org/projects/h5rdmtoolbox/badge/?version=latest)](https://h5rdmtoolbox.readthedocs.io/en/latest/?badge=latest)
![pyvers](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10-blue)

*Note that the project is still under development!*

The "HDF5 Research Data Management Toolbox" (h5RDMtoolbox) is a Python package that helps anyone working with
HDF5 achieve a sustainable data lifecycle that follows
the [FAIR (Findable, Accessible, Interoperable, Reusable)](https://www.nature.com/articles/sdata201618)
principles. It specifically supports the five main steps of the research data lifecycle:

1. Planning (defining an internal layout for HDF5 and a metadata convention for attribute usage)
2. Collecting data (creating HDF5 files or converting to HDF5 files from other sources)
3. Analyzing and processing data (Plotting, deriving data, ...)
4. Sharing data (publishing, archiving, ... e.g. to databases like [mongoDB](https://www.mongodb.com/) or repositories
   like [Zenodo](https://zenodo.org/))
5. Reusing data (Searching data in databases, local file structures or online repositories
   like [Zenodo](https://zenodo.org)).

## Quickstart

A quickstart notebook can be tested by clicking on the following badge:

[![Open Quickstart Notebook](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/matthiasprobst/h5RDMtoolbox/blob/main/docs/colab/quickstart.ipynb)

## Documentation

Comprehensive documentation with many examples is available [here](https://h5rdmtoolbox.readthedocs.io/en/latest/) or by clicking
on the image, which shows the research data lifecycle in the center and the respective toolbox features on the outside:

<a href="https://h5rdmtoolbox.readthedocs.io/en/latest/"><img src="docs/_static/new_icon_with_text.svg" alt="RDM lifecycle" style="width:600px;"></a>

## Installation

Use Python 3.8, 3.9 or 3.10 (the package metadata requires `>=3.8,<3.11`). If you are a regular user, you can install the package via pip:

    pip install h5RDMtoolbox
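
The supported interpreter range can also be checked from Python before installing. The following is a minimal sketch, not part of h5RDMtoolbox: the helper name `is_supported_python` is hypothetical, and the bounds are taken from the `Requires-Python: <3.11,>=3.8` field of the package metadata.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) version lies in the range [3.8, 3.11)."""
    major_minor = tuple(version_info[:2])
    return (3, 8) <= major_minor < (3, 11)


if __name__ == "__main__":
    # Report whether the running interpreter falls inside the declared range.
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} supported: {is_supported_python()}")
```

Passing a tuple such as `(3, 9)` instead of relying on the default makes it possible to check a version other than the one currently running.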

### Install from source

Developers may clone the repository and install the package from source.
Clone the repository first:

    git clone https://github.com/matthiasprobst/h5RDMtoolbox.git

Then, run

    pip install h5RDMtoolbox/

Add `--user` if you do not have root access.

For a development (editable) installation, run

    pip install -e h5RDMtoolbox/

### Dependencies

The core functionality depends on the following packages.
Some of them serve general purposes, others are
specific to the features of the package:

**General dependencies are ...**

- `numpy>=1.20,<1.23.0`: Scientific computing, handling of arrays
- `matplotlib>=3.5.2`: Plotting
- `appdirs>=1.4.4`: Managing user and application directories
- `packaging`: Version handling
- `IPython>=7.34.0`: Pretty display of data in notebooks
- `regex>=2020.7.9`: Working with regular expressions

**Specific to the package are ...**

- `h5py>3.7.0`: HDF5 file interface
- `xarray>=2022.3.0`: Working with scientific arrays in combination with attributes. Allows carrying metadata from HDF5
  to the user
- `pint==0.21.1`: Allows working with units
- `pint_xarray>=0.2.1`: Unit support for xarray objects
- `python-forge==18.6.0`: Used to update function signatures when using the [standard attributes](https://h5rdmtoolbox.readthedocs.io/en/latest/conventions/standard_attributes_and_conventions.html)
- `pyyaml`: Reading and writing of yaml files, e.g. metadata definitions (conventions)
- `requests`: Used to download files from the internet or validate URLs, e.g. metadata definitions (conventions)
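
To compare this list with what an installed release actually declares, the requirement strings can be read from the package metadata using the standard library. This is only a sketch: `core_requirements` and `declared_core_requirements` are hypothetical helper names, and it assumes the distribution is installed under the name `h5rdmtoolbox`.

```python
from importlib import metadata


def core_requirements(requirement_strings):
    """Keep only requirements that are not tied to an optional extra."""
    return [req for req in requirement_strings if "extra ==" not in req]


def declared_core_requirements(dist_name="h5rdmtoolbox"):
    """Return the core requirements of an installed distribution, or None if it is absent."""
    try:
        # metadata.requires() may return None when no requirements are declared.
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None
    return core_requirements(requirements)


if __name__ == "__main__":
    declared = declared_core_requirements()
    if declared is None:
        print("h5rdmtoolbox is not installed in this environment")
    else:
        for requirement in declared:
            print(requirement)
```

Filtering on the `extra ==` marker is a deliberately simple heuristic; a full requirement parser would be more robust but is not needed for a quick look.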

#### Optional dependencies

To run unit tests or to enable certain features, additional dependencies must be installed.

Install optional dependencies by specifying them in square brackets after the package name, e.g.:

    pip install h5RDMtoolbox[mongodb]

[mongodb]

- `pymongo>=4.2.0`: Database solution for HDF5 files

[io]

- `pco_tools>=1.0.0`: Reading of pco image files
- `opencv-python>=4.5.3.56`: Reading of image files (other than pco)
- `pandas>=1.4.3`: Mainly used for reading csv and pretty printing

[snt]

- `xmltodict`: Reading of xml files
- `tabulate>=0.8.10`: Pretty printing of tables
- `python-gitlab`: Access to gitlab repositories
- `pypandoc>=1.11`: Conversion of markdown files to html
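
Whether the modules behind an extra are importable in the current environment can be checked without importing them. A minimal sketch follows; the helper `missing_modules` and the mapping `EXTRA_MODULES` are illustrative and not part of h5RDMtoolbox, and the import names (for example `cv2` for `opencv-python` and `gitlab` for `python-gitlab`) are assumptions based on the usual import names of these distributions.

```python
from importlib.util import find_spec

# Assumed top-level import names for the distributions listed in each extra.
EXTRA_MODULES = {
    "mongodb": ["pymongo"],
    "io": ["pco_tools", "cv2", "pandas"],
    "snt": ["xmltodict", "tabulate", "gitlab", "pypandoc"],
}


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    for extra, modules in EXTRA_MODULES.items():
        missing = missing_modules(modules)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"[{extra}] {status}")
```

If an extra is reported as missing, installing it with the square-bracket syntax shown above should pull in the corresponding packages.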

## Contribution

Feel free to contribute. Make sure to add docstrings to your methods and classes, and follow
[PEP 8](https://peps.python.org/pep-0008/).

Please write tests for your code and put them into the `tests/` folder. Visit the [README file](./tests/README.md) in the
tests folder for more information.

Please also add a Jupyter notebook in the `docs/` folder to document your code. Please visit
the [README file](./docs/README.md) in the docs folder for more information on how to compile the documentation.

Please use the **numpy style for the docstrings**:
https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html#example-numpy


