Metadata-Version: 2.1
Name: rfmix-reader
Version: 0.1.10a0
Summary: RFMix-reader is a Python package designed to efficiently read and process output files generated by RFMix, a popular tool for estimating local ancestry in admixed populations. The package employs a lazy loading approach, which minimizes memory consumption by reading only the loci that are accessed by the user, rather than loading the entire dataset into memory at once.
License: GPL-3.0
Author: Kynon JM Benjamin
Author-email: heartgen.lab@gmail.com
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: dask (>=2024.5,<2025.0)
Requires-Dist: numpy (>=1.26,<2.0)
Requires-Dist: pandas (>=2.0,<3.0)
Requires-Dist: tqdm (>=4.66,<5.0)
Description-Content-Type: text/markdown

# rfmix-reader
`rfmix-reader` is a Python package designed to efficiently read and process output 
files generated by RFMix, a popular tool for estimating local ancestry in admixed 
populations. The package employs a lazy loading approach, which minimizes memory 
consumption by reading only the loci that are accessed by the user, rather than 
loading the entire dataset into memory at once. Additionally, we leverage GPU
acceleration to improve computational speed.

## Install
`rfmix-reader` can be installed using [pip](https://pypi.python.org/pypi/pip):

```bash
pip install rfmix-reader
```

**GPU Acceleration:**
`rfmix-reader` leverages GPU acceleration for improved performance. To use this
functionality, you will need to install the following libraries for your specific
CUDA version:
- `RAPIDS`: Refer to the official installation guide [here](https://docs.rapids.ai/install)
- `PyTorch`: Installation instructions can be found [here](https://pytorch.org/)

**Additional Notes:**
- We have not tested installation in `Docker` or `Conda` environments. Compatibility may vary.
- If you do not have a GPU, you can still use the basic functionality of `rfmix-reader`.
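
To check whether a CUDA device is visible before relying on the GPU code path, a minimal sketch like the one below may help. It uses only PyTorch's `torch.cuda.is_available()`. The helper name `gpu_available` is hypothetical and is not part of `rfmix-reader`.

```python
def gpu_available() -> bool:
    """Return True only if PyTorch imports cleanly and reports a CUDA device.

    Hypothetical helper for illustration; not part of rfmix-reader.
    """
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its shared libraries failed to load.
        return False
    return bool(torch.cuda.is_available())


print("GPU available:", gpu_available())
```

If this prints `False`, the note above still applies: the basic functionality remains usable without a GPU.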


## Key Features

**Lazy Loading**
- Reads data on-the-fly as requested, reducing memory footprint.
- Ideal for working with large RFMix output files that may not fit entirely in memory.
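
The idea behind lazy loading can be sketched with the standard library alone. This is not how `rfmix-reader` is implemented. It only illustrates the general technique of indexing a file once and then reading individual rows on request. The class name `LazyRows` and the toy file contents are made up for the example.

```python
import os
import tempfile


class LazyRows:
    """Index line offsets once, then read individual rows only on request."""

    def __init__(self, path):
        self.path = path
        self.offsets = []
        # One pass to record where each line starts; no row content is kept.
        with open(path, "rb") as handle:
            position = 0
            for line in handle:
                self.offsets.append(position)
                position += len(line)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, index):
        # Seek straight to the requested row and parse just that line.
        with open(self.path, "rb") as handle:
            handle.seek(self.offsets[index])
            return handle.readline().decode().rstrip("\r\n").split("\t")


# Toy tab-separated file standing in for a large table (made-up values).
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("chr1\t100\t0\t2\n")
    tmp.write("chr1\t200\t1\t1\n")
    tmp.write("chr1\t300\t2\t0\n")
    toy_path = tmp.name

rows = LazyRows(toy_path)
second_row = rows[1]  # only this row is parsed
print(len(rows), second_row)
os.remove(toy_path)
```

Only the offset index stays in memory, so the cost of accessing a row does not depend on how many other rows the file contains.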

**Efficient Data Access**
- Provides convenient access to specific loci or regions of interest.
- Allows for selective loading of data, enabling faster processing times.

**Seamless Integration**
- Designed to work seamlessly with existing Python data analysis workflows.
- Facilitates downstream analysis and manipulation of RFMix output data.

Whether you are working with large-scale genomic datasets or have limited
computational resources, `rfmix-reader` offers a memory-conscious way to read
and process RFMix output files. Because only the loci you access are read,
it is well suited to researchers and bioinformaticians working with admixed
population data.

## Usage
This works similarly to `pandas-plink`:

### Two population admixture example
```python
from rfmix_reader import read_rfmix

file_path = "examples/two_popuations/out/"
loci, rf_q, admix = read_rfmix(file_path)
```
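
Selecting a region of interest might then look like the sketch below. It does not call `read_rfmix`. Instead it builds a toy table and assumes that the loci table is a pandas `DataFrame` with `chromosome` and `physical_position` columns. Those column names and all values here are assumptions for illustration, so check the columns of your own `loci` object first.

```python
import pandas as pd

# Toy stand-in for the loci table; column names and values are assumptions.
loci = pd.DataFrame(
    {
        "chromosome": ["chr1", "chr1", "chr1", "chr2"],
        "physical_position": [100, 200, 300, 150],
    }
)

# Boolean mask for loci inside chr1:150-300 (bounds are inclusive).
in_region = (loci["chromosome"] == "chr1") & loci["physical_position"].between(150, 300)

# Row positions of the selected loci.
region_index = loci.index[in_region].to_list()
print(region_index)
```

If the rows of the ancestry array line up with the rows of the loci table, these positions could be used to subset it so that only the selected loci are ever read.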

### Three population admixture example

## Authors
* [Kynon JM Benjamin](https://github.com/Krotosbenjamin)

## Citation

Please cite: XXXX.

