Metadata-Version: 2.1
Name: phoenix_datasets
Version: 0.0.1.dev0
Summary: PyTorch dataset wrappers for the PHOENIX-2014 and PHOENIX-2014-T sign language datasets.
Home-page: https://github.com/enhuiz/phoenix_datasets
Author: enhuiz
Author-email: niuzhe.nz@outlook.com
License: UNKNOWN
Description: # PHOENIX Datasets 🐦
        
        ## Introduction
        
        [PHOENIX-2014](https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX/) and [PHOENIX-2014-T](https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX-2014-T/) are popular large-scale German sign language datasets developed by the Human Language Technology & Pattern Recognition Group at RWTH Aachen University, Germany. This package provides PyTorch dataset wrappers for both datasets, making it easier to build PyTorch models on them.
        
        ## Installation
        
        ```bash
        pip install git+https://github.com/enhuiz/phoenix-datasets
        ```
        
        ## Example Usage
        
        ```python
        from phoenix_datasets import PhoenixVideoTextDataset
        
        from torch.utils.data import DataLoader
        
        dtrain = PhoenixVideoTextDataset(
            # Path to the dataset folder; download it from the official website first.
            root="data/phoenix-2014-multisigner",
            split="train",
            p_drop=0.5,
            random_drop=True,
        )
        
        vocab = dtrain.vocab
        
        print("Vocab", vocab)
        
        dl = DataLoader(dtrain, collate_fn=dtrain.collate_fn)
        
        for batch in dl:
            video = batch["video"]
            text = batch["text"]
        
            # Apply per-frame augmentation (e.g. normalization, cropping) here if needed.
            # kornia is a good tool for this.
            # video = augment(video)
        
            assert len(video) == len(text)
            print(len(video))
            print(video[0].shape)
            print(text[0].shape)
        
            break
        ```
        
        ## Supported Features
        
        - [x] Load the automatic alignments for PHOENIX-2014
        - [x] Random or even frame-dropping augmentation
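        
        To illustrate the idea behind frame dropping, here is a minimal sketch in plain Python. It is not the package's actual implementation: the helper `drop_frame_indices` is hypothetical, and the way it interprets `p_drop` (fraction of frames to discard) and `random_drop` (random versus evenly spaced selection) is an assumption based on the parameter names in the example above.
        
        ```python
        import random
        
        
        def drop_frame_indices(n_frames, p_drop, random_drop, rng=None):
            """Return the sorted indices of the frames to keep (hypothetical helper).
        
            Assumption: p_drop is the fraction of frames to discard, and
            random_drop switches between random and evenly spaced selection.
            """
            if not 0.0 <= p_drop < 1.0:
                raise ValueError("p_drop must be in [0, 1)")
            if n_frames == 0:
                return []
        
            # Always keep at least one frame so the clip never becomes empty.
            n_keep = max(1, round(n_frames * (1.0 - p_drop)))
        
            if random_drop:
                # Sample distinct frame indices, then restore temporal order.
                rng = rng or random.Random()
                return sorted(rng.sample(range(n_frames), n_keep))
        
            # Evenly spaced selection: take one frame per stride of n_frames / n_keep.
            step = n_frames / n_keep
            return [int(i * step) for i in range(n_keep)]
        
        
        even = drop_frame_indices(10, p_drop=0.5, random_drop=False)
        rand = drop_frame_indices(10, p_drop=0.5, random_drop=True, rng=random.Random(0))
        print(even)
        print(rand)
        ```
        
        Given a video tensor whose first dimension is time, indexing it with the returned list (for example `video[even]`) would select the kept frames. Random dropping varies the selection from epoch to epoch, while even dropping is deterministic and so may suit evaluation better.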
        
        ## TODOs
        
        - [ ] Implement Corpus for PHOENIX-2014-T
        - [ ] Evaluation Wrappers
        
Platform: UNKNOWN
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
