Metadata-Version: 2.1
Name: scip-workflows
Version: 0.1.2
Home-page: https://github.com/mlippeve/ehv/tree/master/
Author: Maxim Lippeveld
Author-email: maxim.lippeveld@ugent.be
License: Apache Software License 2.0
Keywords: imaging flow cytometry bioinformatics computational biology SCIP
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: matplotlib
Requires-Dist: zarr
Requires-Dist: aicsimageio
Requires-Dist: aicspylibczi
Requires-Dist: scip
Requires-Dist: pyarrow
Requires-Dist: imbalanced-learn
Requires-Dist: seaborn
Requires-Dist: fcsparser
Requires-Dist: FlowUtils (==1.0.0)
Requires-Dist: scipy
Requires-Dist: scikit-learn
Requires-Dist: scikit-image
Requires-Dist: xgboost
Requires-Dist: scanpy
Requires-Dist: anndata
Requires-Dist: shap
Requires-Dist: kneed
Requires-Dist: cellpose
Requires-Dist: tqdm
Requires-Dist: igraph
Requires-Dist: leidenalg
Requires-Dist: umap-learn
Requires-Dist: pandas
Requires-Dist: geopandas
Provides-Extra: dev
Requires-Dist: build ; extra == 'dev'
Requires-Dist: twine ; extra == 'dev'
Requires-Dist: jupyterlab ; extra == 'dev'
Requires-Dist: nbdev (>=2) ; extra == 'dev'
Requires-Dist: clean-py ; extra == 'dev'

# SCIP use case workflows

This repository contains all code to reproduce the use cases presented in
[A scalable, reproducible and open-source pipeline for morphologically profiling image cytometry data](https://www.biorxiv.org/content/10.1101/2022.10.24.512549v1).

It contains:

- a package `scip_workflows` created using [nbdev](https://nbdev.fast.ai/), and
- three [Snakemake](https://snakemake.readthedocs.io/en/stable/) workflows.

## Installation

To execute the workflows in this repository, you need to [install Snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html).

The workflows have been tested with Python 3.9.13.

## Usage

Reproducing a use case takes two steps: first run SCIP to profile the images, then run the corresponding Snakemake workflow to generate the downstream analysis results.

The required configurations to run SCIP are in the [scip_configs](scip_configs) directory.

The following commands expect Snakemake to be available. Snakemake can be executed using conda environments or a pre-existing environment containing all required packages.

To reproduce a use case, open a terminal in the directory where you cloned this repository and execute:
```bash
snakemake --configfile config/use_case.yaml --directory root_dir use_case
```
where
- `use_case` is one of `WBC`, `CD7` or `BBBC021`, and
- `root_dir` is the directory where you downloaded the use case files.

Make sure to adapt the config file to your setup, in particular by setting `parts` to the number of output partitions SCIP generated.
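
The value for `parts` can be found by counting the partition files SCIP wrote. The following is a minimal sketch, assuming SCIP wrote one Parquet file per partition into a single output directory. The function name `count_partitions`, the example path, and the `*.parquet` pattern are illustrative assumptions, not part of this repository; adjust the pattern to the file names SCIP actually produced.

```python
from pathlib import Path


def count_partitions(output_dir: str, pattern: str = "*.parquet") -> int:
    """Count the files directly inside output_dir whose names match pattern.

    Assumption: SCIP wrote one file per output partition, so the number of
    matching files equals the value to use for `parts`.
    """
    return sum(1 for path in Path(output_dir).glob(pattern) if path.is_file())


if __name__ == "__main__":
    # Hypothetical path; replace with the directory SCIP wrote its output to.
    print(count_partitions("path/to/scip_output"))
```

The search is not recursive, so files in subdirectories are ignored. If the directory does not exist, the function returns 0.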

The `snakemake` command expects the active environment to contain all required dependencies. Add `--use-conda` to let
Snakemake create a conda environment containing all requirements.
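
To make the placeholders concrete, the command can be assembled programmatically. This is a sketch only: `build_command` and `USE_CASES` are hypothetical names, and it assumes the config files are named after the use case (`config/WBC.yaml` and so on), following the `config/use_case.yaml` pattern shown earlier. It uses only the Snakemake options already mentioned in this README.

```python
import shlex
from typing import List

# Use case names listed in this README.
USE_CASES = ("WBC", "CD7", "BBBC021")


def build_command(use_case: str, root_dir: str, use_conda: bool = False) -> List[str]:
    """Assemble the snakemake invocation for one use case as an argument list."""
    if use_case not in USE_CASES:
        raise ValueError(f"use_case must be one of {USE_CASES}, got {use_case!r}")
    cmd = [
        "snakemake",
        "--configfile", f"config/{use_case}.yaml",  # assumed naming scheme
        "--directory", root_dir,
        use_case,
    ]
    if use_conda:
        # Let Snakemake create the conda environment itself.
        cmd.append("--use-conda")
    return cmd


if __name__ == "__main__":
    # Hypothetical download location; replace with your own root_dir.
    cmd = build_command("WBC", "/data/wbc", use_conda=True)
    # Print a copy-pasteable, shell-quoted command line.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can also be passed directly to `subprocess.run` from the repository root if you prefer to launch the workflow from Python.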

### Use case: WBC

Data and features (for SCIP and IDEAS) can be downloaded from the [BioImage Archive](https://www.ebi.ac.uk/biostudies/bioimages/studies/S-BIAD452).

### Use case: CD7

Data and SCIP features can be downloaded from the [BioImage Archive](https://www.ebi.ac.uk/biostudies/studies/S-BIAD505).

### Use case: BBBC021

Data can be downloaded from the [Broad Bioimage Benchmark Collection](https://bbbc.broadinstitute.org/BBBC021). Features can be downloaded from [Zenodo](https://doi.org/10.5281/zenodo.7276510). The metadata files `BBBC021_v1_image.csv` and `BBBC021_v1_moa.csv` are available in the supplementary materials ("Data S2") of [[1]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3884769/).

[1] Ljosa, V., Caie, P. D., Ter Horst, R., Sokolnicki, K. L., Jenkins, E. L., Daya, S., ... & Carpenter, A. E. (2013). Comparison of methods for image-based profiling of cellular morphological responses to small-molecule treatment. Journal of Biomolecular Screening, 18(10), 1321-1329.

