Metadata-Version: 2.1
Name: datanexus
Version: 0.0.3
Summary: A dataset module for your projects
Author: Ethan Barr
Author-email: ethanwbarr07@gmail.com
Description-Content-Type: text/markdown

# DataNexus

DataNexus is a simple-to-use Python module for fetching transcripts, datasets, and similar resources in your projects.
The module can also extract a character's lines from a transcript, which makes it easier to prepare training data, for example when fine-tuning a GPT-2 model.

## Key features
- Download datasets and transcripts
- Extract characters' lines from transcripts


## Installation
To get started:
```
pip install datanexus
```



## Usage

⚠️ | A link to the full documentation will be added in the future. The code is still in testing and may be unstable!

### Downloading of Datasets/Transcripts
```py
# Assumption: the package exposes a `datanexus` class, as the calls below imply
from datanexus import datanexus

nexus = datanexus('Models/')  # Insert the directory you would like to use

model = nexus.download_dataset(model='ironman')  # Choose a model
print(model)
```

### Extract characters from transcripts
```py
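# Assumption: the package exposes a `datanexus` class, as the calls below imply
from datanexus import datanexus

nexus = datanexus('Models/')  # Insert the directory you would like to use

# Save the lines spoken by the chosen character (note the trailing colon)
character = nexus.save_character(output_dir='Models', character='JARVIS:')
print(character)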
```
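
To illustrate what extracting a character's lines means, here is a minimal standalone sketch in plain Python. It does not use DataNexus and is not necessarily how the module works internally. It assumes the transcript puts one speaker per line in the form `NAME: text`. The function name `extract_character_lines` and the sample transcript are made up for this example.

```python
def extract_character_lines(transcript: str, character: str) -> list[str]:
    """Return the spoken text of every line that starts with `character`.

    `character` is expected to include the trailing colon, e.g. 'JARVIS:'.
    """
    spoken = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if line.startswith(character):
            # Drop the speaker prefix and keep only what was said
            spoken.append(line[len(character):].strip())
    return spoken


# Invented sample transcript, used only for illustration
sample = """TONY: Good morning.
JARVIS: Good morning, sir.
TONY: What is the weather like?
JARVIS: It is sunny today."""

jarvis_lines = extract_character_lines(sample, 'JARVIS:')
print(jarvis_lines)
```

The resulting list of lines could then be written to a text file and used as fine-tuning data.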


## Support
If you have any questions or run into any issues, feel free to create an issue on [GitHub](https://github.com/Ethan-Barr/DataNexus).

Feel free to join [The Workshop](https://discord.gg/jbVcf5yPyE) Discord server and send me a ping (`_Ethan_`).
