Metadata-Version: 2.1
Name: cocomltools
Version: 0.1.6.dev2
Summary: 
Author: Ghallabi
Author-email: ghallabi.farouk@gmail.com
Requires-Python: >=3.9.8,<4.0.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: numpy (>=1.26.0,<2.0.0)
Requires-Dist: pillow (>=10.4.0,<11.0.0)
Requires-Dist: pydantic (==2.8.2)
Requires-Dist: pydantic_core (==2.20.1)
Requires-Dist: rich (>=13.7.1,<14.0.0)
Requires-Dist: scikit-learn (>=1.5.1,<2.0.0)
Requires-Dist: scikit-multilearn (>=0.2.0,<0.3.0)
Description-Content-Type: text/markdown

# COCO ML Toolbox

COCO ML Toolbox is a command-line interface (CLI) tool for managing COCO (Common Objects in Context) dataset files. It can split, merge, and crop COCO datasets, which makes it easier to prepare data for machine learning tasks.

## Features

- **Split** a COCO dataset into training and testing datasets with a specified ratio.
- **Merge** multiple COCO dataset files into a single file.
- **Crop** images based on annotations in a COCO dataset.

## Streamlit App

You can use our Streamlit app [here](https://coco-ml-toolbox.streamlit.app/).

## Installation

cocomltools requires Python >= 3.9.8.

### Install Python package

The easiest way to install the package is with pip:

```bash
pip install cocomltools
```


You can also clone the source code and install the dependencies with Poetry:

```bash
git clone https://github.com/Ghallabi/coco-ml-toolbox.git
cd coco-ml-toolbox
pip install poetry
poetry install
```

## Usage

The main script, `main_cli.py`, provides three commands: `split`, `merge`, and `crop`. Each command is described below.

### Split

Splits a COCO dataset into training and testing datasets.

```bash
python main_cli.py split --coco-path /path/to/coco.json --output-dir /path/to/output --ratio 0.2 --mode random
```

* `--coco-path`: Path to the COCO file (JSON).
* `--output-dir`: (Optional) Path to save the split COCO files. Defaults to the directory of the input COCO file.
* `--ratio`: (Optional) Split ratio. Defaults to `0.2`.
* `--mode`: (Optional) Split mode. Options are `random` and `strat`. Defaults to `random`.
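
To illustrate what a random split involves, here is a minimal sketch that operates on an already-loaded COCO dictionary. It is not the package's implementation: the function name `split_coco` and the `seed` parameter are hypothetical, and treating `ratio` as the fraction of images assigned to the test set is an assumption.

```python
import random


def split_coco(coco: dict, ratio: float = 0.2, seed: int = 0) -> tuple:
    """Randomly split a COCO dict by image into (train, test).

    Assumption: `ratio` is the fraction of images placed in the test set.
    """
    images = list(coco["images"])
    random.Random(seed).shuffle(images)  # seeded for reproducibility
    n_test = int(round(len(images) * ratio))
    test_images, train_images = images[:n_test], images[n_test:]

    def subset(imgs: list) -> dict:
        # Keep only the annotations that belong to the selected images.
        ids = {img["id"] for img in imgs}
        return {
            "images": imgs,
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
            "categories": coco["categories"],
        }

    return subset(train_images), subset(test_images)
```

Splitting by image rather than by annotation keeps all annotations of an image in the same subset, so no image appears in both the training and testing files.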


### Merge

Merges multiple COCO dataset files into a single file.

```bash
python main_cli.py merge --coco-paths /path/to/coco1.json,/path/to/coco2.json --output-dir /path/to/output
```

* `--coco-paths`: Comma-separated paths to the COCO files (JSON).
* `--output-dir`: (Optional) Path to save the merged COCO file. Defaults to the directory of the first input COCO file.
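
Merging COCO files is mostly a matter of avoiding ID collisions between the inputs. The sketch below shows one way to do that on loaded COCO dictionaries. It is not the package's implementation: `merge_coco` is a hypothetical name, and both the renumbering scheme and the choice to unify categories by name are assumptions.

```python
def merge_coco(cocos: list) -> dict:
    """Merge COCO dicts, renumbering image, annotation, and category IDs."""
    merged = {"images": [], "annotations": [], "categories": []}
    cat_id_by_name = {}  # category name -> ID in the merged file

    for coco in cocos:
        # Map this file's category IDs to merged IDs (assumption: same name = same category).
        cat_map = {}
        for cat in coco["categories"]:
            if cat["name"] not in cat_id_by_name:
                new_id = len(cat_id_by_name) + 1
                cat_id_by_name[cat["name"]] = new_id
                merged["categories"].append({**cat, "id": new_id})
            cat_map[cat["id"]] = cat_id_by_name[cat["name"]]

        # Give every image a fresh ID and remember the mapping.
        img_map = {}
        for img in coco["images"]:
            new_id = len(merged["images"]) + 1
            img_map[img["id"]] = new_id
            merged["images"].append({**img, "id": new_id})

        # Rewrite annotation references to the new image and category IDs.
        for ann in coco["annotations"]:
            merged["annotations"].append({
                **ann,
                "id": len(merged["annotations"]) + 1,
                "image_id": img_map[ann["image_id"]],
                "category_id": cat_map[ann["category_id"]],
            })

    return merged
```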

### Crop

Crops images based on annotations in a COCO dataset.

```bash
python main_cli.py crop --coco-path /path/to/coco.json --images-dir /path/to/images --output-dir /path/to/cropped_images --num-workers MAX_WORKERS
```

* `--coco-path`: Path to the COCO file (JSON).
* `--images-dir`: Path to the directory containing the COCO image files.
* `--output-dir`: (Optional) Path to save the cropped images. Defaults to a "cropped" directory within the parent directory of the images.
* `--num-workers`: Number of workers used to speed up the cropping process.
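
As a rough picture of what annotation-based cropping involves, the sketch below converts each COCO bounding box (`[x, y, width, height]`) into the `(left, upper, right, lower)` box that Pillow's `Image.crop` expects, and spreads the work over a pool of workers. It is not the package's implementation: the function names, the use of threads, and the output file naming are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def bbox_to_box(bbox) -> tuple:
    """Convert a COCO bbox [x, y, width, height] to (left, upper, right, lower)."""
    x, y, w, h = bbox
    return (int(x), int(y), int(x + w), int(y + h))


def crop_annotation(image_path, bbox, out_path) -> None:
    from PIL import Image  # imported lazily; requires Pillow

    with Image.open(image_path) as img:
        img.crop(bbox_to_box(bbox)).save(out_path)


def crop_dataset(coco: dict, images_dir, output_dir, num_workers: int = 4) -> None:
    """Crop every annotation; output naming (<annotation id>.png) is hypothetical."""
    images_dir, output_dir = Path(images_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [
            pool.submit(
                crop_annotation,
                images_dir / file_names[ann["image_id"]],
                ann["bbox"],
                output_dir / f"{ann['id']}.png",
            )
            for ann in coco["annotations"]
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Each annotation is cropped independently, which is why the work parallelizes cleanly across workers.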

