Metadata-Version: 2.1
Name: ml-datasets
Version: 0.0.2
Summary: Machine Learning dataset loaders
Home-page: https://github.com/explosion/ml-datasets
Author: Explosion
Author-email: contact@explosion.ai
License: MIT
Description: <a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
        
        # Machine learning dataset loaders
        
        Loaders for various machine learning datasets for testing and example scripts.
        Previously in `thinc.extra.datasets`.
        
        [![Current Release Version](https://img.shields.io/github/release/explosion/ml_datasets.svg?style=flat-square&logo=github)](https://github.com/explosion/ml_datasets/releases)
        [![PyPi Version](https://img.shields.io/pypi/v/ml-datasets.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.python.org/pypi/ml-datasets)
        
        ## Setup and installation
        
        The package can be installed via pip:
        
        ```bash
        pip install ml-datasets
        ```
        
        ## Loaders
        
        Loaders can be imported directly or used via their string name (which is useful if they're set via command line arguments). Some loaders may take arguments – see the source for details.
        
        ```python
        # Import directly
        from ml_datasets import imdb
        train_data, dev_data = imdb()
        ```
        
        ```python
        # Load via registry
        from ml_datasets import loaders
        imdb_loader = loaders.get("imdb")
        train_data, dev_data = imdb_loader()
        ```
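        
        As a rough sketch of the command-line use case mentioned above, a script can accept the loader name as an argument and look it up by string. The `--dataset` flag, `toy_loader`, `STAND_IN_LOADERS` and `get_loader` are assumptions made for illustration so the example runs without downloading anything; with the package installed, `get_loader` would presumably be replaced by `loaders.get`.
        
        ```python
        import argparse
        
        
        def toy_loader():
            # Hypothetical loader returning tiny (train, dev) splits
            return [("good film", 1)], [("bad film", 0)]
        
        
        # Stand-in for the real registry so the sketch runs offline
        STAND_IN_LOADERS = {"toy": toy_loader}
        
        
        def get_loader(name):
            # Look a loader up by its string name
            return STAND_IN_LOADERS[name]
        
        
        def load_from_args(argv):
            parser = argparse.ArgumentParser()
            parser.add_argument("--dataset", default="toy")
            args = parser.parse_args(argv)
            loader = get_loader(args.dataset)
            return loader()
        
        
        train_data, dev_data = load_from_args(["--dataset", "toy"])
        ```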
        
        ### Available loaders
        
        | ID / Function        | Description                                                 | From URL |
        | -------------------- | ----------------------------------------------------------- | :------: |
        | `imdb`               | IMDB sentiment dataset.                                     |    ✓     |
        | `mnist`              | MNIST data.                                                 |    ✓     |
        | `quora`              | Duplicate Quora questions dataset.                          |    ✓     |
        | `reuters`            | Reuters dataset.                                            |    ✓     |
        | `snli`               | Stanford Natural Language Inference corpus.                 |    ✓     |
        | `stack_exchange`     | Stack Exchange dataset.                                     |          |
        | `ud_ancora_pos_tags` | Universal Dependencies Spanish AnCora corpus (POS tagging). |    ✓     |
        | `ud_ewtb_pos_tags`   | Universal Dependencies English EWT corpus (POS tagging).    |    ✓     |
        | `wikiner`            | WikiNER data.                                               |          |
        
        ### Registering loaders
        
        Loaders can be registered externally using the `loaders` registry as a decorator. For example:
        
        ```python
        import ml_datasets
        
        @ml_datasets.loaders("my_custom_loader")
        def my_custom_loader():
            # load_some_data is a placeholder for your own loading logic
            return load_some_data()
        
        assert "my_custom_loader" in ml_datasets.loaders
        ```
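        
        To illustrate the decorator pattern used above, here is a minimal, self-contained sketch of a name-to-function registry. `ToyRegistry` and `toy_loaders` are hypothetical names and this is not the package's actual implementation; it only shows how calling a registry with a name can return a decorator that stores the function under that name.
        
        ```python
        class ToyRegistry:
            """Hypothetical registry mapping string names to functions."""
        
            def __init__(self):
                self._funcs = {}
        
            def __call__(self, name):
                # Calling the registry with a name returns a decorator
                def decorator(func):
                    self._funcs[name] = func
                    return func
        
                return decorator
        
            def __contains__(self, name):
                return name in self._funcs
        
            def get(self, name):
                return self._funcs[name]
        
        
        toy_loaders = ToyRegistry()
        
        
        @toy_loaders("my_custom_loader")
        def my_custom_loader():
            # Placeholder data standing in for a real dataset
            return [1, 2, 3]
        
        
        assert "my_custom_loader" in toy_loaders
        ```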
        
Platform: UNKNOWN
Requires-Python: >=3.6
Description-Content-Type: text/markdown
