Metadata-Version: 2.1
Name: scouter-learn
Version: 0.1.10
Summary: A transcriptional response predictor for unseen genetic perturbtions with LLM embeddings
Home-page: https://github.com/PancakeZoy/scouter
Author: Ouyang Zhu, Jun Li
Author-email: ozhu@nd.edu
Requires-Python: >=3.10.0
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch >=2.0.0
Requires-Dist: tqdm >=4.0.0
Requires-Dist: anndata >=0.10.0
Requires-Dist: pandas >=2.2.0
Requires-Dist: numpy >=1.26.0
Requires-Dist: scanpy >=1.10.0
Requires-Dist: seaborn >=0.13.0
Requires-Dist: matplotlib >=3.9.0
Requires-Dist: scikit-learn >=1.5.0
Requires-Dist: scipy >=1.14.0
Requires-Dist: requests >=2.0.0

<p align="left">
  <img src="https://github.com/PancakeZoy/scouter/blob/master/img/ScouterLogo.png?raw=true" width="100" title="logo">
</p>

# Scouter: a transcriptional response predictor for unseen genetic perturbtions with LLM embeddings

Scouter is a deep neural network with simple architecture, for the task of predicting transcriptional response to unseen genetic perturbtions.

Scouter employs the LLM embeddings generated from text description of genes, enabling the perdiction on unseen genes.

For more details read our [manuscript]()
<p align="center">
  <img src="https://github.com/PancakeZoy/scouter/blob/master/img/workflow_horizontal.png?raw=true" width="750" title="logo">
</p>
<br>
<p align="center">
  <img src="https://github.com/PancakeZoy/scouter/blob/master/img/scouter_horizontal.png?raw=true" width="750" title="logo">
</p>

## Installation
`pip install scouter-learn`

## Main API
Below is an example that includes main APIs to train `Scouter` on a perturbation dataset. 

```python
from scouter import Scouter, ScouterData, adamson_small, embedding_small

adata = adamson_small()
embd = embedding_small()
scouterdata = ScouterData(adata=adata, embd=embd, key_label='condition', key_var_genename='gene_name')
scouterdata.setup_ad('embd_index')
scouterdata.gene_ranks()
scouterdata.get_dropout_non_zero_genes()
scouterdata.split_Train_Val_Test(seed=1)

# Model Training
scouter_model = Scouter(scouterdata)
scouter_model.model_init()
scouter_model.train()

# Prediction
scouter_model.pred(['ATP5B+ctrl', 'MANF+ctrl'])

# Evaluation
scouter_model.barplot('MANF+ctrl')
```

## Demos

| Name | Description |
|-----------------|-------------|
| [Demo.ipynb](demo/Demo.ipynb) | A detailed tutorial on how to apply `Scouter` to a smaller version of Adamson dataset, including preprocessing, paramter setting, model training, and evaluation|
| [OwnDataTutorial.ipynb](demo/OwnDataTutorial.ipynb)| A tutorial to guide users to load their own dataset and embedding matrix.
| [UnmatchRemedy.ipynb](demo/UnmatchRemedy.ipynb) | A tutorial that illustrates the problem of unmatched genes between perturbation dataset and embedding matrix, and provides a remedy.|
