Metadata-Version: 2.1
Name: hstrat
Version: 1.9.2
Summary: hstrat enables phylogenetic inference on distributed digital evolution populations
Author-email: Matthew Andres Moreno <m.more500@gmail.com>
License: MIT license
Project-URL: homepage, https://github.com/mmore500/hstrat
Project-URL: documentation, https://hstrat.readthedocs.io
Project-URL: repository, https://github.com/mmore500/hstrat
Keywords: hstrat
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.rst
Requires-Dist: alifedata-phyloinformatics-convert (>=0.11.0)
Requires-Dist: anytree (>=2.8.0)
Requires-Dist: biopython (>=1.79)
Requires-Dist: bitarray (>=2.6.2)
Requires-Dist: bitstring (>=3.1.9)
Requires-Dist: dendropy (>=4.5.2)
Requires-Dist: Deprecated (>=1.2.13)
Requires-Dist: iterpop (>=0.3.4)
Requires-Dist: interval-search (>=0.3.1)
Requires-Dist: keyname (>=0.4.1)
Requires-Dist: lru-dict (>=1.1.7)
Requires-Dist: matplotlib (>=3.5.2)
Requires-Dist: mmh3 (>=3.0.0)
Requires-Dist: networkx (>=2.6.3)
Requires-Dist: numpy (>=1.17.2)
Requires-Dist: more-itertools (>=8.13.0)
Requires-Dist: mpmath (>=1.1.0)
Requires-Dist: opytional (>=0.1.0)
Requires-Dist: ordered-set (>=4.1.0)
Requires-Dist: packaging (>=23.0)
Requires-Dist: pandas (<2,>=1.0.0)
Requires-Dist: pandera (>=0.13.4)
Requires-Dist: prettytable (>=3.5.0)
Requires-Dist: python-slugify (>=6.1.2)
Requires-Dist: safe-assert (>=0.2.0)
Requires-Dist: seaborn (>=0.11.0)
Requires-Dist: sortedcontainers (>=2.4.0)
Requires-Dist: statsmodels (>=0.14.0)
Requires-Dist: typing-extensions (>=4.7.1)
Requires-Dist: tqdm (>=4.62.3)
Provides-Extra: docs
Requires-Dist: ipython (==8.12.3) ; extra == 'docs'
Requires-Dist: ipykernel (==6.26.0) ; extra == 'docs'
Requires-Dist: nbsphinx (==0.9.3) ; extra == 'docs'
Requires-Dist: pyyaml (==6.0.1) ; extra == 'docs'
Requires-Dist: Sphinx (==4.4.0) ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme (==1.0.0) ; extra == 'docs'
Provides-Extra: jit
Requires-Dist: numba (>=0.56.4) ; extra == 'jit'
Provides-Extra: pinned_dependencies_py310
Requires-Dist: alifedata-phyloinformatics-convert (==0.11.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: anytree (==2.8.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: biopython (==1.79) ; extra == 'pinned_dependencies_py310'
Requires-Dist: bitarray (==2.6.2) ; extra == 'pinned_dependencies_py310'
Requires-Dist: bitstring (==3.1.9) ; extra == 'pinned_dependencies_py310'
Requires-Dist: dendropy (==4.5.2) ; extra == 'pinned_dependencies_py310'
Requires-Dist: Deprecated (==1.2.13) ; extra == 'pinned_dependencies_py310'
Requires-Dist: iterpop (==0.4.1) ; extra == 'pinned_dependencies_py310'
Requires-Dist: interval-search (==0.3.1) ; extra == 'pinned_dependencies_py310'
Requires-Dist: keyname (==0.4.1) ; extra == 'pinned_dependencies_py310'
Requires-Dist: lru-dict (==1.1.8) ; extra == 'pinned_dependencies_py310'
Requires-Dist: matplotlib (==3.6.3) ; extra == 'pinned_dependencies_py310'
Requires-Dist: mmh3 (==3.0.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: more-itertools (==8.13.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: mpmath (==1.2.1) ; extra == 'pinned_dependencies_py310'
Requires-Dist: networkx (==2.6.3) ; extra == 'pinned_dependencies_py310'
Requires-Dist: numpy (==1.23.5) ; extra == 'pinned_dependencies_py310'
Requires-Dist: opytional (==0.1.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: ordered-set (==4.1.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: packaging (==23.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: pandas (==1.5.3) ; extra == 'pinned_dependencies_py310'
Requires-Dist: pandera (==0.13.4) ; extra == 'pinned_dependencies_py310'
Requires-Dist: prettytable (==3.5.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: python-slugify (==6.1.2) ; extra == 'pinned_dependencies_py310'
Requires-Dist: safe-assert (==0.2.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: seaborn (==0.11.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: sortedcontainers (==2.4.0) ; extra == 'pinned_dependencies_py310'
Requires-Dist: statsmodels (==0.14.1) ; extra == 'pinned_dependencies_py310'
Requires-Dist: tqdm (==4.62.3) ; extra == 'pinned_dependencies_py310'
Provides-Extra: pinned_dependencies_py38
Requires-Dist: alifedata-phyloinformatics-convert (==0.11.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: anytree (==2.8.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: biopython (==1.79) ; extra == 'pinned_dependencies_py38'
Requires-Dist: bitarray (==2.6.2) ; extra == 'pinned_dependencies_py38'
Requires-Dist: bitstring (==3.1.9) ; extra == 'pinned_dependencies_py38'
Requires-Dist: dendropy (==4.5.2) ; extra == 'pinned_dependencies_py38'
Requires-Dist: Deprecated (==1.2.13) ; extra == 'pinned_dependencies_py38'
Requires-Dist: iterpop (==0.4.1) ; extra == 'pinned_dependencies_py38'
Requires-Dist: interval-search (==0.3.1) ; extra == 'pinned_dependencies_py38'
Requires-Dist: keyname (==0.4.1) ; extra == 'pinned_dependencies_py38'
Requires-Dist: lru-dict (==1.1.8) ; extra == 'pinned_dependencies_py38'
Requires-Dist: matplotlib (==3.5.2) ; extra == 'pinned_dependencies_py38'
Requires-Dist: mmh3 (==3.0.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: more-itertools (==8.13.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: mpmath (==1.2.1) ; extra == 'pinned_dependencies_py38'
Requires-Dist: networkx (==2.6.3) ; extra == 'pinned_dependencies_py38'
Requires-Dist: numpy (==1.21.6) ; extra == 'pinned_dependencies_py38'
Requires-Dist: opytional (==0.1.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: ordered-set (==4.1.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: pandas (==1.3.5) ; extra == 'pinned_dependencies_py38'
Requires-Dist: pandera (==0.13.4) ; extra == 'pinned_dependencies_py38'
Requires-Dist: packaging (==23.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: prettytable (==3.5.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: python-slugify (==6.1.2) ; extra == 'pinned_dependencies_py38'
Requires-Dist: safe-assert (==0.2.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: seaborn (==0.11.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: sortedcontainers (==2.4.0) ; extra == 'pinned_dependencies_py38'
Requires-Dist: statsmodels (==0.14.1) ; extra == 'pinned_dependencies_py38'
Requires-Dist: tqdm (==4.62.3) ; extra == 'pinned_dependencies_py38'
Provides-Extra: pinned_jit
Requires-Dist: numba (==0.56.4) ; extra == 'pinned_jit'
Provides-Extra: release
Requires-Dist: bumpver (==2022.1120) ; extra == 'release'
Requires-Dist: twine (==1.14.0) ; extra == 'release'
Requires-Dist: wheel (==0.33.6) ; extra == 'release'
Requires-Dist: pip (==22.0.4) ; extra == 'release'
Requires-Dist: pip-tools (==6.10.0) ; extra == 'release'
Requires-Dist: setuptools (==65.6.3) ; extra == 'release'
Provides-Extra: testing_py310
Requires-Dist: contexttimer (==0.3.3) ; extra == 'testing_py310'
Requires-Dist: coverage (==6.4.4) ; extra == 'testing_py310'
Requires-Dist: flake8 (==3.7.8) ; extra == 'testing_py310'
Requires-Dist: iterify (==0.1.0) ; extra == 'testing_py310'
Requires-Dist: more-itertools (==8.13.0) ; extra == 'testing_py310'
Requires-Dist: mypy (==0.991) ; extra == 'testing_py310'
Requires-Dist: pytest-xdist (==2.5.0) ; extra == 'testing_py310'
Requires-Dist: pytest (==7.1.2) ; extra == 'testing_py310'
Requires-Dist: pyyaml (==5.3.1) ; extra == 'testing_py310'
Requires-Dist: safe-assert (==0.2.0) ; extra == 'testing_py310'
Requires-Dist: setuptools ; extra == 'testing_py310'
Requires-Dist: scipy (==1.9.3) ; extra == 'testing_py310'
Requires-Dist: teeplot (==0.4.0) ; extra == 'testing_py310'
Requires-Dist: tox (==3.24.0) ; extra == 'testing_py310'
Requires-Dist: tqdist (==1.0) ; extra == 'testing_py310'
Provides-Extra: testing_py38
Requires-Dist: contexttimer (==0.3.3) ; extra == 'testing_py38'
Requires-Dist: coverage (==6.4.4) ; extra == 'testing_py38'
Requires-Dist: flake8 (==3.7.8) ; extra == 'testing_py38'
Requires-Dist: iterify (==0.1.0) ; extra == 'testing_py38'
Requires-Dist: more-itertools (==8.13.0) ; extra == 'testing_py38'
Requires-Dist: mypy (==0.991) ; extra == 'testing_py38'
Requires-Dist: pytest-xdist (==2.5.0) ; extra == 'testing_py38'
Requires-Dist: pytest (==7.1.2) ; extra == 'testing_py38'
Requires-Dist: pyyaml (==5.3.1) ; extra == 'testing_py38'
Requires-Dist: safe-assert (==0.2.0) ; extra == 'testing_py38'
Requires-Dist: setuptools ; extra == 'testing_py38'
Requires-Dist: scipy (==1.7.3) ; extra == 'testing_py38'
Requires-Dist: teeplot (==0.4.0) ; extra == 'testing_py38'
Requires-Dist: tox (==3.24.0) ; extra == 'testing_py38'
Requires-Dist: tqdist (==1.0) ; extra == 'testing_py38'

![hstrat wordmark](docs/assets/hstrat-wordmark.png)

[
![PyPi](https://img.shields.io/pypi/v/hstrat.svg)
](https://pypi.python.org/pypi/hstrat)
[
![codecov](https://codecov.io/gh/mmore500/hstrat/branch/master/graph/badge.svg?token=JwMfFOpBBD)
](https://codecov.io/gh/mmore500/hstrat)
[
![Codacy Badge](https://app.codacy.com/project/badge/Grade/9ab14d415aa9458d97b4cf760b95f874)
](https://www.codacy.com/gh/mmore500/hstrat/dashboard)
[
![CI](https://github.com/mmore500/hstrat/actions/workflows/ci.yaml/badge.svg)
](https://github.com/mmore500/hstrat/actions)
[
![Read The Docs](https://readthedocs.org/projects/hstrat/badge/?version=latest)
](https://hstrat.readthedocs.io/en/latest/?badge=latest)
[
![GitHub stars](https://img.shields.io/github/stars/mmore500/hstrat.svg?style=round-square&logo=github&label=Stars&logoColor=white)](https://github.com/mmore500/hstrat)
[
![Zenodo](https://zenodo.org/badge/464531144.svg)
](https://zenodo.org/badge/latestdoi/464531144)
[![JOSS](https://joss.theoj.org/papers/10.21105/joss.04866/status.svg)](https://doi.org/10.21105/joss.04866)

_hstrat_ enables phylogenetic inference on distributed digital evolution populations

- Free software: MIT license
- Documentation: <https://hstrat.readthedocs.io>
- Repository: <https://github.com/mmore500/hstrat>

## Install

`python3 -m pip install hstrat`

## Features

_hstrat_ serves to enable **robust, efficient extraction of evolutionary history** from evolutionary simulations where centralized, direct phylogenetic tracking is not feasible.
Namely, in large-scale, **decentralized parallel/distributed evolutionary simulations**, where agents' evolutionary lineages migrate among many cooperating processors over the course of simulation.

_hstrat_ can

- accurately estimate **time since MRCA** among two or several digital agents, even for uneven branch lengths
- **reconstruct phylogenetic trees** for entire populations of evolving digital agents
- **serialize genome annotations** to/from text and binary formats
- provide **low-footprint** genome annotations (e.g., reasonably as low as **64 bits** each)
- be directly configured to satisfy **memory use limits** and/or **inference accuracy requirements**

_hstrat operates just as well in single-processor simulation, but direct phylogenetic tracking using a tool like [phylotrackpy](https://github.com/emilydolson/phylotrackpy/) should usually be preferred in such cases due to its capability for perfect record-keeping given centralized global simulation observability._

## Example Usage

This code briefly demonstrates,

1.  initialization of a population of `HereditaryStratigraphicColumn` of objects,
2.  generation-to-generation transmission of `HereditaryStratigraphicColumn` objects with simple synchronous turnover, and then
3.  reconstruction of phylogenetic history from the final population of `HereditaryStratigraphicColumn` objects.

```python3
from random import choice as rchoice
import alifedata_phyloinformatics_convert as apc
from hstrat import hstrat; print(f"{hstrat.__version__=}")  # when last ran?
from hstrat._auxiliary_lib import seed_random; seed_random(1)  # reproducibility

# initialize a small population of hstrat instrumentation
# (in full simulations, each column would be attached to an individual genome)
population = [hstrat.HereditaryStratigraphicColumn() for __ in range(5)]

# evolve population for 40 generations under drift
for _generation in range(40):
    population = [rchoice(population).CloneDescendant() for __ in population]

# reconstruct estimate of phylogenetic history
alifestd_df = hstrat.build_tree(population, version_pin=hstrat.__version__)
tree_ascii = apc.RosettaTree(alifestd_df).as_dendropy.as_ascii_plot(width=20)
print(tree_ascii)
```

```
hstrat.__version__='1.8.8'
              /--- 1
          /---+
       /--+   \--- 3
       |  |
   /---+  \------- 2
   |   |
+--+   \---------- 0
   |
   \-------------- 4
```

In [actual usage](https://hstrat.readthedocs.io/en/latest/demo-ping.html), each _hstrat_ column would be bundled with underlying genetic material of interest in the simulation --- entire genomes or, in systems with sexual recombination, individual genes.
The _hstrat_ columns are designed to operate as a neutral genetic annotation, enhancing observability of the simulation but not affecting its outcome.

## How it Works

In order to enable phylogenetic inference over fully-distributed evolutionary simulation, hereditary stratigraphy adopts a paradigm akin to phylogenetic work in natural history/biology.
In these fields, phylogenetic history is inferred through comparisons among genetic material of extant organisms, with --- in broad terms --- phylogenetic relatedness established through the extent of genetic similarity between organisms.
Phylogenetic tracking through _hstrat_, similarly, is achieved through analysis of similarity/dissimilarity among genetic material sampled over populations of interest.

Rather than random mutation as with natural genetic material, however, genetic material used by _hstrat_ is structured through _hereditary stratigraphy_.
This methodology, described fully in our documentation, provides strong guarantees on phylogenetic inferential power, minimizes memory footprint, and allows efficient reconstruction procedures.

See [here](https://hstrat.readthedocs.io/en/latest/mechanism.html) for more detail on underlying hereditary stratigraphy methodology.

## Getting Started

Refer to our documentation for a [quickstart guide](https://hstrat.readthedocs.io/en/latest/quickstart.html) and an [annotated end-to-end usage example](https://hstrat.readthedocs.io/en/latest/demo-ping.html).

The `examples/` folder provides extensive usage examples, including

- incorporation of hstrat annotations into a custom genome class,
- automatic stratum retention policy parameterization,
- pairwise and population-level phylogenetic inference, and
- phylogenetic tree reconstruction.

Interested users can find an explanation of how hereditary stratigraphy methodology implemented by _hstrat_ works "under the hood," information on project-specific _hstrat_ configuration, and full API listing for the _hstrat_ package in [the documentation](https://hstrat.readthedocs.io/).

## Citing

If _hstrat_ software or hereditary stratigraphy methodology contributes to a scholarly work, please cite it according to references provided [here](https://hstrat.readthedocs.io/en/latest/citing.html).
We would love to list your project using _hstrat_ in our documentation, see more [here](https://hstrat.readthedocs.io/en/latest/projects.html).

## Credits

This package was created with Cookiecutter and the `audreyr/cookiecutter-pypackage` project template.

## hcat

![hcat](docs/assets/hcat-banner.png)
