Metadata-Version: 2.1
Name: rtasr
Version: 0.0.5
Summary: 🏆 Run benchmarks against the most common ASR tools on the market.
Project-URL: Documentation, https://Wordcab.github.io/rtasr
Project-URL: Issues, https://github.com/Wordcab/rtasr/issues
Project-URL: Source, https://github.com/Wordcab/rtasr
Author: Wordcab
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Internet
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: <3.12,>=3.8
Requires-Dist: aiofiles>=23.2.1
Requires-Dist: aiohttp>=3.8.5
Requires-Dist: aiopath>=0.5.12
Requires-Dist: datasets[audio]>=2.14.4
Requires-Dist: jiwer>=3.0.2
Requires-Dist: pydantic>=2.2.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: rich-argparse>=1.2.0
Requires-Dist: rich>=13.5.2
Requires-Dist: seaborn>=0.12.2
Requires-Dist: spy-der>=0.4.0
Provides-Extra: dev
Requires-Dist: rtasr[der,docs,quality,tests]; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-git-revision-date-localized-plugin~=1.1.0; extra == 'docs'
Requires-Dist: mkdocs-material~=8.5.4; extra == 'docs'
Requires-Dist: mkdocstrings[python]~=0.19.0; extra == 'docs'
Requires-Dist: mkdocs~=1.4.0; extra == 'docs'
Provides-Extra: quality
Requires-Dist: black~=22.10.0; extra == 'quality'
Requires-Dist: pre-commit~=2.20.0; extra == 'quality'
Requires-Dist: ruff~=0.0.263; extra == 'quality'
Provides-Extra: tests
Requires-Dist: pytest-asyncio>=0.21.1; extra == 'tests'
Requires-Dist: pytest-cov>=4.1; extra == 'tests'
Requires-Dist: pytest>=7.1.2; extra == 'tests'
Description-Content-Type: text/markdown

<h1 align="center">Rate That ASR (RTASR)</h1>

<div align="center">
	<a  href="https://pypi.org/project/rtasr" target="_blank">
		<img src="https://img.shields.io/pypi/v/rtasr.svg" />
	</a>
	<a  href="https://pypi.org/project/rtasr" target="_blank">
		<img src="https://img.shields.io/pypi/pyversions/rtasr" />
	</a>
	<a  href="https://github.com/Wordcab/rtasr/blob/main/LICENSE" target="_blank">
		<img src="https://img.shields.io/pypi/l/rtasr" />
	</a>
	<a  href="https://github.com/Wordcab/rtasr/actions?workflow=ci-cd" target="_blank">
		<img src="https://github.com/Wordcab/rtasr/workflows/ci-cd/badge.svg" />
	</a>
	<a  href="https://github.com/pypa/hatch" target="_blank">
		<img src="https://img.shields.io/badge/%F0%9F%A5%9A-Hatch-4051b5.svg" />
	</a>
</div>

<p align="center"><em>🏆 Run benchmarks against the most common ASR tools on the market.</em></p>

---

## Early Results

### DER

* Dataset: [AMI Corpus](http://groups.inf.ed.ac.uk/ami/corpus/)

![DER evaluation](./assets/der_evaluation_ami.png)

* Dataset: [VoxConverse](https://www.robots.ox.ac.uk/~vgg/data/voxconverse/)

![DER evaluation](./assets/der_evaluation_voxconverse.png)

### WER

Work in progress...

## Installation

### Latest stable version

```bash
pip install rtasr
```

### From source

```bash
git clone https://github.com/Wordcab/rtasr
cd rtasr

pip install .
```

## Commands

The CLI is available through the `rtasr` command.

```bash
rtasr --help
```

### List datasets, metrics and providers

```bash
# List everything
rtasr list
# List only datasets
rtasr list -t datasets
# List only metrics
rtasr list -t metrics
# List only providers
rtasr list -t providers
```

### Download datasets

Available datasets are:

* `ami`: [AMI Corpus](http://groups.inf.ed.ac.uk/ami/corpus/)
* `voxconverse`: [VoxConverse](https://www.robots.ox.ac.uk/~vgg/data/voxconverse/)

```bash
rtasr download -d <dataset>
```

### ASR Transcription

#### Providers

ASR providers (checked entries are implemented, unchecked ones are not yet):

* [x] `assemblyai`: [AssemblyAI](https://www.assemblyai.com/)
* [ ] `aws`: [AWS Transcribe](https://aws.amazon.com/transcribe/)
* [ ] `azure`: [Azure Speech](https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/)
* [x] `deepgram`: [Deepgram](https://www.deepgram.com/)
* [ ] `google`: [Google Cloud Speech-to-Text](https://cloud.google.com/speech-to-text)
* [x] `revai`: [RevAI](https://www.rev.ai/)
* [x] `speechmatics`: [Speechmatics](https://www.speechmatics.com/)
* [x] `wordcab`: [Wordcab](https://wordcab.com/)

#### Run transcription

Run ASR transcription on a given dataset with a given provider.

```bash
rtasr transcription -d <dataset> -p <provider>
```

#### Multiple providers

You can specify as many providers as you want:

```bash
rtasr transcription -d <dataset> -p <provider1> <provider2> <provider3> ...
```

#### Choose dataset split

You can specify the dataset split to use:

```bash
rtasr transcription -d <dataset> -p <provider> -s <split>
```

If not specified, all the available splits will be used.

#### Caching

By default, the transcription results are cached in the `~/.cache/rtasr/transcription` directory for each provider.

If you don't want to use the cache, use the `--no-cache` flag.

```bash
rtasr transcription -d <dataset> -p <provider> --no-cache
```

_Note: the cache prevents the same file from being transcribed twice. With `--no-cache`, the transcription runs on the whole dataset again. We aren't responsible for any extra costs._
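
Before passing `--no-cache`, it can help to check how much is already cached. The snippet below is a minimal sketch: `cache_file_count` is a hypothetical helper, not part of rtasr, and because the layout inside the cache directory is not described here, it simply counts every regular file under the default path.

```bash
# Hypothetical helper (not part of rtasr): count cached files.
# Assumes the default cache path from this README; pass another path as $1.
cache_file_count() {
  local dir="${1:-$HOME/.cache/rtasr/transcription}"
  if [ -d "$dir" ]; then
    # Count every regular file, whatever the per-provider layout is.
    find "$dir" -type f | wc -l | tr -d ' '
  else
    # No cache directory yet: nothing has been cached.
    echo 0
  fi
}

echo "cached files: $(cache_file_count)"
```

A count of `0` means every file would be sent to the provider anyway, so `--no-cache` changes nothing in that case.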

#### Debug mode

Use the `--debug` flag to run only one file per split for each provider.

```bash
rtasr transcription -d <dataset> -p <provider> --debug
```

### Evaluation

The `evaluation` command allows you to run an evaluation on the transcription results.

If you don't specify the split, the evaluation will be run on the whole dataset.

#### Run DER evaluation

```bash
rtasr evaluation -m der -d <dataset> -s <split>
```

#### Run WER evaluation

```bash
rtasr evaluation -m wer -d <dataset> -s <split>
```

### Plot results

To get the plots of the evaluation results, use the `plot` command.

If you don't specify the split, the plots will be generated for all the available splits.

#### Plot DER results

```bash
rtasr plot -m der -d <dataset> -s <split>
```

#### Plot WER results

```bash
rtasr plot -m wer -d <dataset> -s <split>
```

### Dataset length

To get the total length of a dataset, use the `audio-length` command.
It reports the number of minutes of audio for each split of a dataset.

If you don't specify the split, the length is reported for all the available splits.

```bash
rtasr audio-length -d <dataset> -s <split>
```
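
### End-to-end example

The commands above can be chained into a single run. The script below is a minimal sketch, not an official workflow: `DATASET`, `PROVIDERS` and `DRY_RUN` are variables of this script (not rtasr options), the default provider names are only examples, and it prints the commands without executing them unless `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
# Sketch only: chains the rtasr subcommands documented in this README.
# DATASET, PROVIDERS and DRY_RUN are variables of this script, not rtasr options.
set -eu

DATASET="${DATASET:-voxconverse}"
PROVIDERS="${PROVIDERS:-deepgram wordcab}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; also execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

benchmark() {
  run rtasr download -d "$DATASET"
  # Smoke test first: one file per split for each provider.
  # $PROVIDERS is left unquoted on purpose so it splits into separate arguments.
  run rtasr transcription -d "$DATASET" -p $PROVIDERS --debug
  # Full run over all available splits.
  run rtasr transcription -d "$DATASET" -p $PROVIDERS
  run rtasr evaluation -m der -d "$DATASET"
  run rtasr plot -m der -d "$DATASET"
}

benchmark
```

Saved as, say, `benchmark.sh` (an example filename), it can be executed for real with `DRY_RUN=0 DATASET=ami PROVIDERS="assemblyai revai" bash benchmark.sh`. The `--debug` pass comes first so that a misconfigured provider fails on one file per split rather than on the whole dataset.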

## Contributing

Make sure [hatch](https://hatch.pypa.io/latest/install/) is installed.

### Quality

* Run quality checks: `hatch run quality:check`
* Run quality formatting: `hatch run quality:format`

### Testing

* Run tests: `hatch run tests:run`
