Metadata-Version: 2.1
Name: voyagerpy
Version: 0.1.1
Summary: Python library for Voyager, the geo-spatialist R library.
Home-page: https://pmelsted.github.io/voyagerpy
License: BSD-3-Clause
Keywords: Single-Cell,Spatial,voyager
Author: Pall Melsted
Author-email: pmelsted@hi.is
Requires-Python: >=3.8,<4
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Provides-Extra: notebooks
Requires-Dist: anndata (>=0.8)
Requires-Dist: esda (>=2.4.3,<3.0.0)
Requires-Dist: fiona (>=1.9,<2.0)
Requires-Dist: geopandas (>=0.13,<0.14)
Requires-Dist: h5py (>=3.0,<4.0)
Requires-Dist: igraph (>=0.10.4) ; extra == "notebooks"
Requires-Dist: leidenalg (>=0.9.1,<0.10.0) ; extra == "notebooks"
Requires-Dist: libpysal (>=4.7.0,<5.0.0)
Requires-Dist: matplotlib (>=3.6,<3.7)
Requires-Dist: networkx (>=3.0)
Requires-Dist: numpy (>=1.22,<1.24)
Requires-Dist: opencv-python (>=4.7.0.72,<5.0.0.0)
Requires-Dist: pandas (>=1.3,<2.0)
Requires-Dist: scanpy (>=1.9.3,<2.0.0) ; extra == "notebooks"
Requires-Dist: scikit-learn (>=1.2)
Requires-Dist: scipy (>=1.10) ; python_version >= "3.8" and python_version < "3.12"
Requires-Dist: shapely (>=1.7)
Requires-Dist: statsmodels (>=0.13)
Project-URL: Bug Tracker, https://github.com/pmelsted/voyagerpy/issues
Project-URL: Repository, https://github.com/pmelsted/voyagerpy
Description-Content-Type: text/markdown

# VoyagerPy

This repo contains the VoyagerPy Python package, a Python implementation of the R package [Voyager](https://github.com/pachterlab/voyager).

## Installation

To install the latest release of VoyagerPy, use `pip`:

```pip install voyagerpy```

### Clone the repo
To install from source instead, clone this repo using either SSH:

```git clone git@github.com:pmelsted/voyagerpy.git```

or HTTPS:

```git clone https://github.com/pmelsted/voyagerpy.git```

To get the bleeding-edge version, switch to the `dev` branch by running

```git checkout dev```

once inside the `voyagerpy` directory.

### Install using `pip`

To install VoyagerPy from the cloned repo, run

```pip install .```

Some users may experience problems installing GeoPandas, which VoyagerPy depends on. If this happens, refer to the [GeoPandas installation page](https://geopandas.org/en/stable/getting_started.html).
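
After installing, it can help to confirm which versions ended up in your environment. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is an illustration and is not part of VoyagerPy.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages this section is concerned with.
for name in ("voyagerpy", "geopandas"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If `geopandas` is reported as not installed, the GeoPandas installation page linked above is the place to start.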

## Structure of VoyagerPy

VoyagerPy uses [AnnData](https://anndata.readthedocs.io/) as its internal data structure. An AnnData object, `adata`, holds the following attributes:

- `adata.X`: The main data matrix of size $N_{obs} \times N_{vars}$. It holds the count data for each observation and feature (e.g. barcodes x genes), which may have undergone some transformation. The data type may be a `scipy.sparse.csr_matrix`, `numpy.ndarray`, or `numpy.matrix`; the set of supported types is not yet finalized.
- `adata.layers`: A dictionary-like data structure with the values being matrices of the same shape as `adata.X`. These can hold transformations of `adata.X`, such as log-normalized counts.
- `adata.obs`: A `pandas.DataFrame` object where the rows represent the barcodes, and the columns are features of the barcodes.
- `adata.obsp`: A dictionary-like object where each value is a `pandas.DataFrame` of size $N_{obs}\times N_{obs}$, representing a pairwise metric on the observations. For instance, `adata.obsp["distances"]` can hold the pairwise distances between the positions of origin of the barcodes. This is handy for storing graphs over the barcodes.
- `adata.obsm`: A dictionary-like object where each value is a `pandas.DataFrame` or `geopandas.GeoDataFrame`. The number of rows in each of these data frames must be $N_{obs}$. Examples of data frames stored here:
	- `geometry`: a `geopandas.GeoDataFrame` where each column is a `geopandas.GeoSeries` or `pandas.Series`, used for plotting spatial objects such as points or polygons. To plot a column with GeoPandas, it must be a `geopandas.GeoSeries`. These columns represent the geometries of the barcodes.
	- `local_*`: a `pandas.DataFrame` containing local spatial results computed over features in `obs`, e.g. local Moran's I or local spatial heteroscedasticity (LOSH) over some features `x, y, z`. The columns of `local_moran` and `local_losh` would then be `x, y, z`.
- `adata.var`: A `pandas.DataFrame` object where the rows represent the features from the columns of `X` (e.g. genes), and the columns are features of the genes (or whatever the columns of `X` represent).
- `adata.varp`, `adata.varm`: Not used for the time being, but they can be used similarly to `adata.obsp` and `adata.obsm` for feature (gene) data.
- `adata.uns`: A dictionary containing data that cannot be stored in the above objects:
	- `config`: Configuration or metadata about this object. When VoyagerPy is used to read the scRNA-seq data, this dictionary has the following items by default:
		- `"var_names": "gene_ids"`, meaning that the index of the variables (genes) is the standardized ENSG gene IDs.
		- `"secondary_var_names": "symbol"`, meaning that the column `"symbol"` in `adata.var` contains the symbol names for the genes.
	- `spatial`: This dictionary contains various spatial data, including:
		- `img`: a dictionary mapping each resolution to its image.
		- `scale`: a dictionary describing the scales of the images.
		- `transform`: metadata that describes the transforms applied to the images (rotation, mirror) such that the originals can be recovered.
		- `local_results`: a dictionary that can contain Monte Carlo simulations of spatial autocorrelation statistics, such as local Moran statistics.
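
The layout of `adata.uns` described above can be sketched with a plain dictionary. This is only an illustration under stated assumptions: a plain `dict` stands in for the real `adata.uns`, the resolution keys `lowres` and `hires` and the scale values are made up, and the helper `secondary_names_column` is hypothetical and not part of VoyagerPy.

```python
from typing import Any, Dict, Optional

# Plain-dict stand-in for `adata.uns`, mirroring the keys listed above.
uns: Dict[str, Any] = {
    "config": {
        "var_names": "gene_ids",          # index of adata.var holds ENSG gene IDs
        "secondary_var_names": "symbol",  # column of adata.var holding gene symbols
    },
    "spatial": {
        # Resolution names are assumptions; None stands in for image arrays.
        "img": {"lowres": None, "hires": None},
        # Illustrative scale values only, not taken from real data.
        "scale": {"lowres": 0.05, "hires": 0.2},
        # Metadata on rotations/mirroring applied to the images.
        "transform": {},
        # Simulation results for spatial autocorrelation statistics.
        "local_results": {},
    },
}


def secondary_names_column(uns_like: Dict[str, Any]) -> Optional[str]:
    """Return the name of the `adata.var` column holding secondary names, if configured."""
    return uns_like.get("config", {}).get("secondary_var_names")


print(secondary_names_column(uns))
```

Looking the column name up through `config`, rather than hard-coding `"symbol"`, keeps code working if the object was read with different defaults.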
