Metadata-Version: 2.1
Name: featuremap
Version: 1.0.5
Summary: FeatureMAP
Home-page: https://github.com/YYT1002/FeatureMAP
Author-email: Test <yangyangnwpu@gmail.com>
Maintainer: Yang Yang
Maintainer-email: yangyangnwpu@gmail.com
License: GPL
Keywords: dimensionality reduction,manifold learning,tangent space embedding
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: C
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.13
Requires-Dist: scikit-learn>=0.16
Requires-Dist: scipy>=0.19
Requires-Dist: numba>=0.55.0
Requires-Dist: umap-learn>=0.5.1
Provides-Extra: features
Requires-Dist: scanpy; extra == "features"
Requires-Dist: pandas; extra == "features"
Requires-Dist: anndata; extra == "features"
Requires-Dist: matplotlib>=3.5.1; extra == "features"
Provides-Extra: core-transition-state
Requires-Dist: scanpy; extra == "core-transition-state"
Requires-Dist: pandas; extra == "core-transition-state"
Requires-Dist: anndata; extra == "core-transition-state"
Requires-Dist: matplotlib>=3.5.1; extra == "core-transition-state"

![FeatureMAP Illustration](./figures/featureMAP.png)

# FeatureMAP: Feature-preserving Manifold Approximation and Projection

Visualizing single-cell data is essential for understanding cellular heterogeneity and dynamics. **FeatureMAP** enhances this process by introducing **gene projection** and **transition/core states**, providing deeper insights into cellular states. While traditional methods like UMAP and t-SNE effectively capture clustering, they often overlook critical gene-level information. FeatureMAP addresses this limitation by integrating concepts from UMAP and PCA, preserving both clustering structures and gene feature variations within a low-dimensional space.

## Description

FeatureMAP presents a novel approach by enhancing manifold learning with pairwise tangent space embedding, ensuring the retention of crucial cellular data features. It introduces two visualization plots: expression embedding (GEX) and variation embedding (GVA).

Here, we demonstrate its effectiveness using a synthetic dataset from [BEELINE](https://github.com/Murali-group/Beeline) based on a bifurcation model. Compared to UMAP, FeatureMAP-GEX better preserves cell density, while FeatureMAP-GVA clearly delineates developmental paths.

<!-- ![Bifurcation Embedding](./figures/bifurcation_embedding.png) -->

   <img src="./figures/bifurcation_embedding.png" alt="Bifurcation Embedding"/>



Besides the two-dimensional visualization, FeatureMAP presents three key concepts:

1. **Gene Projection**: Estimating and projecting gene feature loadings, where arrows indicate the direction and magnitude of gene expression changes.
    ![Gene Projection](./figures/gene_contribution.png)

   
2. **Transition and Core States**: Transition and core states are computationally defined based on cell density, curvature, and betweenness centrality. Transition states are characterized by the lowest cell densities, maximal curvature, and highest betweenness centrality, whereas core states exhibit the highest cell densities, minimal curvature, and lowest betweenness centrality.
    <!-- ![Core and Transition States](./figures/core_trans_states.png) -->

    <img src="./figures/core_trans_states.png" alt="Transition and Core States" width="220" height="200"/>


3. **Differential Gene Variation (DGV) Analysis**: DGV analysis compares transition and core states to identify genes with significant variability. By quantifying gene variation between dynamic transition states and stable core states, DGV highlights regulatory genes likely driving cell-state transitions and differentiation.
   
    <img src="./figures/DGV.png" alt="DGV"/>
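
As a rough illustration of the density criterion in item 2, the sketch below labels cells from per-cell density values alone. It is not FeatureMAP's implementation: the function name `label_states_by_density`, the `fraction` cutoff, and the toy density values are assumptions made for illustration, and the curvature and betweenness-centrality criteria are left out.

```python
def label_states_by_density(density, fraction=0.2):
    """Label each cell 'transition', 'core', or 'other' by density rank.

    Hypothetical helper, not part of the featuremap package.
    density  : list of per-cell density estimates (higher = denser).
    fraction : assumed share of cells assigned to each state (at most 0.5).
    """
    n = len(density)
    if n < 2:
        raise ValueError("need at least two cells")
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")

    k = max(1, int(n * fraction))  # number of cells per state
    order = sorted(range(n), key=lambda i: density[i])  # low to high density

    labels = ["other"] * n
    for i in order[:k]:   # lowest densities: candidate transition states
        labels[i] = "transition"
    for i in order[-k:]:  # highest densities: candidate core states
        labels[i] = "core"
    return labels


# Toy density values, one per cell (illustrative only).
toy_density = [0.9, 0.1, 0.5, 0.8, 0.2, 0.4, 0.95, 0.05, 0.6, 0.3]
toy_labels = label_states_by_density(toy_density, fraction=0.2)
print(toy_labels)
```

With `fraction=0.2` and ten cells, the two sparsest cells are labeled `transition` and the two densest are labeled `core`. A rank-based cutoff is used here only because it is simple to follow; it makes no claim about how the package combines its three criteria.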


FeatureMAP, a feature-preserving method, enhances the visualization and interpretation of single-cell data. Through analyses of both synthetic and real scRNA-seq data ([TUTORIAL](https://featuremap.readthedocs.io/en/latest/index.html)), FeatureMAP effectively captures intricate clustering structures and identifies key regulatory genes, offering significant advantages for single-cell data analysis.

## Getting Started

### Dependencies

- Python 3.8 or higher
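- Required Python libraries: numpy, scipy, scikit-learn, numba, umap-learn
- Optional Python libraries (installed with the `features` or `core-transition-state` extras): scanpy, pandas, anndata, matplotlib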
- Operating System: Any (Windows, macOS, Linux)

### Installation

Install directly using pip:

```bash
pip install featuremap-learn
```

## How to use FeatureMAP

### Data Visualization
To apply FeatureMAP in Python to a data matrix (`data`), where rows represent cells and columns represent genes, use the following code:
```python
from sklearn.decomposition import PCA
import featuremap

# data: array of shape (n_cells, n_genes)
# Reduce to 50 principal components before embedding.
data_pca = PCA(n_components=50).fit_transform(data)
# output_variation=True requests the variation embedding.
data_emb = featuremap.FeatureMAP(output_variation=True).fit_transform(data_pca)
```
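
scikit-learn's `PCA` rejects an `n_components` value larger than `min(n_samples, n_features)`, so the fixed value of 50 used above fails on small matrices. A minimal sketch of a guard follows, assuming a hypothetical helper named `clamp_n_components` that is part of neither FeatureMAP nor scikit-learn:

```python
def clamp_n_components(n_cells, n_genes, requested=50):
    """Return a PCA component count that fits the data matrix.

    Hypothetical helper for illustration only.
    n_cells   : number of rows (cells) in the data matrix.
    n_genes   : number of columns (genes) in the data matrix.
    requested : desired number of principal components.
    """
    if n_cells < 1 or n_genes < 1 or requested < 1:
        raise ValueError("all arguments must be positive")
    # PCA cannot return more components than the smaller matrix dimension.
    return min(requested, n_cells, n_genes)


print(clamp_n_components(30, 2000))    # 30: only 30 cells available
print(clamp_n_components(5000, 2000))  # 50: the requested value fits
```

The result can then be passed as `PCA(n_components=...)` in place of the literal 50.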

#### Parameters:
`output_variation`: bool (default: `False`). Selects which embedding is generated: the expression embedding (`False`) or the variation embedding (`True`).

#### Outputs
x_emb: expression embedding to show the clustering

v_emb: variation embedding to show the trajectory


## Documentation
More tutorials are at https://featuremap.readthedocs.io/en/latest/index.html.

## Citation
Our FeatureMAP algorithm is based on the paper:

Yang, Yang, et al. "Interpretable Dimensionality Reduction by Feature Preserving Manifold Approximation and Projection." arXiv preprint arXiv:2211.09321 (2022).

## License
The FeatureMAP package is released under the BSD-3-Clause license.

