Metadata-Version: 2.1
Name: postprocessing-variant-calls
Version: 0.2.2
Summary: This hosts multiple scripts necessary for filtering and processing of variant calls in the vcfs/txt file generated by callers.
Home-page: https://cmo-ci.gitbook.io/postprocessing_variant_calls/
Author: Ronak Shah
Author-email: shahr2@mskcc.org
Requires-Python: >=3.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: PyVCF3
Requires-Dist: pandas
Requires-Dist: typer[all]
Project-URL: Repository, https://github.com/msk-access/postprocessing_variant_calls
Description-Content-Type: text/markdown

# Post-processing of variant calls

This package hosts scripts for filtering and processing variant calls in the VCF and text files generated by variant callers.

## Callers Supported
`pv` is the main command for the `postprocessing_variant_calls` package. Run `pv --help` to see the sub-commands for the supported variant callers.

### VarDictJava

The sub-command `pv vardict` post-processes VarDictJava output. It supports two kinds of VarDictJava input: `single` and `case-control` VCFs.

To tell `pv vardict` which input type to expect, use one of the following sub-commands:
- `pv vardict single` for single-sample VCFs
- `pv vardict case-control` for case-control VCFs

Next, specify which post-processing step to run. Currently, `postprocessing_variant_calls` supports filtering:
-  `pv vardict single filter` 
-  `pv vardict case-control filter` 

Finally, we can specify the paths and options for our filtering and run our command. Here is an example using the test data provided in this repository: 

`pv vardict single filter --inputVcf data/Myeloid200-1.vcf  --tsampleName Myeloid200-1  -ad 1 -o data/single`

Filtering has several options and input specifications; see `pv vardict single filter --help` or `pv vardict case-control filter --help` for details.
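To make the idea of depth-based filtering concrete, here is a minimal Python sketch. It is not the package's implementation: it assumes that `-ad` sets a minimum alternate-allele depth and that the sample's `AD` field is formatted as `ref_depth,alt_depth`, so check both against `pv vardict single filter --help` and your VCF header. The function names `alt_depth` and `filter_by_alt_depth` are hypothetical.

```python
def alt_depth(vcf_line, sample_index=0):
    """Return the alt allele depth from the AD field of one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")            # e.g. GT:DP:AD
    sample_values = fields[9 + sample_index].split(":")
    ad = dict(zip(format_keys, sample_values))["AD"]
    # Assumption: AD is "ref_depth,alt_depth"
    return int(ad.split(",")[1])


def filter_by_alt_depth(lines, min_ad=1):
    """Keep header lines and records whose alt depth is at least min_ad."""
    kept = []
    for line in lines:
        if line.startswith("#") or alt_depth(line) >= min_ad:
            kept.append(line)
    return kept


# Tiny made-up VCF used only for illustration
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "1\t100\t.\tA\tT\t50\tPASS\t.\tGT:DP:AD\t0/1:30:25,5",
    "1\t200\t.\tG\tC\t50\tPASS\t.\tGT:DP:AD\t0/0:30:30,0",
]
filtered = filter_by_alt_depth(example, min_ad=1)
```

In this sketch the record at position 200 is dropped because its alternate depth is 0, while header lines always pass through unchanged.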

See `example_calls.sh` for more example calls. 

### Maf 

maf concat examples: 
- `pv maf concat -f path/to/maf1.maf -f path/to/maf2.maf -o output_maf`
- `pv maf concat -f path/to/maf1.maf -f path/to/maf2.maf -o output_maf -h header.txt`
  where `header.txt` is a header file listing the column names by which the MAFs are concatenated row-wise. See `resources/header.txt` for an example.
- `pv maf concat -p path/to/paths.txt -o output/path/file`
  where `path/to/paths.txt` is a text file listing MAF file paths. See `resources/paths.txt` for an example.
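The row-wise concatenation described above can be sketched in plain Python. This is an illustration under assumptions, not the code behind `pv maf concat`: the function name `concat_mafs` is hypothetical, MAF contents are passed as strings rather than paths, and the `header` argument stands in for the list of column names a header file would supply.

```python
import csv
import io


def concat_mafs(maf_texts, header=None):
    """Row-wise concatenate tab-delimited MAF contents.

    header: column names to keep; defaults to the columns of the first MAF.
    """
    rows = []
    columns = list(header) if header else None
    for text in maf_texts:
        # Skip blank lines and comment lines such as "#version 2.4"
        data = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
        reader = csv.DictReader(data, delimiter="\t")
        if columns is None:
            columns = list(reader.fieldnames or [])
        for record in reader:
            # Missing columns become empty strings; extra columns are dropped
            rows.append({col: record.get(col, "") for col in columns})
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns or [], delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up MAF snippets used only for illustration
maf1 = "Hugo_Symbol\tChromosome\tStart_Position\nTP53\t17\t7577120\n"
maf2 = "Hugo_Symbol\tChromosome\tStart_Position\tExtra\nKRAS\t12\t25398284\tx\n"
combined = concat_mafs([maf1, maf2], header=["Hugo_Symbol", "Chromosome"])
```

Restricting the output to a fixed column list keeps the rows aligned even when the input MAFs carry different extra columns.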

maf annotate examples:
- `pv maf mafbybed -m path/to/maf.maf -b path/to/maf.bed -o output/path/file -c annotation`
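As a rough illustration of annotating MAF records against a BED file, the sketch below marks each record by whether its start position falls inside a BED interval. It is not the implementation of `pv maf mafbybed`: the function names, the `yes`/`no` values, and the use of `Start_Position` alone are assumptions made for the example. It relies on the standard conventions that BED intervals are 0-based and half-open while MAF positions are 1-based.

```python
def load_bed(bed_lines):
    """Parse BED lines into {chromosome: [(start, end), ...]}."""
    intervals = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.setdefault(chrom, []).append((int(start), int(end)))
    return intervals


def annotate_by_bed(maf_rows, intervals, column="annotation"):
    """Add a yes/no column saying whether each MAF row starts inside a BED interval."""
    annotated = []
    for row in maf_rows:
        pos = int(row["Start_Position"])  # 1-based in MAF
        hits = intervals.get(row["Chromosome"], [])
        # BED is 0-based, half-open: a 1-based position p is inside when start < p <= end
        inside = any(start < pos <= end for start, end in hits)
        annotated.append({**row, column: "yes" if inside else "no"})
    return annotated


# Made-up inputs used only for illustration
bed = ["17\t7577000\t7577200\n"]
maf_rows = [
    {"Hugo_Symbol": "TP53", "Chromosome": "17", "Start_Position": "7577120"},
    {"Hugo_Symbol": "KRAS", "Chromosome": "12", "Start_Position": "25398284"},
]
result = annotate_by_bed(maf_rows, load_bed(bed), column="annotation")
```

A linear scan over intervals is fine for small BED files; for large panels an interval tree or sorted search would be the usual replacement.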

## How the repo was made

Template used: https://github.com/yxtay/python-project-template

### Usage


#### External dependencies

- [Conda][conda]
- [Docker][docker]
- [Make][make]

#### Create environment

Use Conda to create a virtual environment and activate it for the project.

```bash
conda env create -f environment.yml
conda activate pv_calls
```

#### Install dependencies

Then install project dependencies with Poetry.

```bash
make deps-install
```

#### Updating Environment

To update the environment after the initial setup, run:

```bash
conda env update -f environment.yml
```

Use this instead of `conda env create`, then re-run `make deps-install`.

