Metadata-Version: 2.1
Name: ncvis
Version: 1.5.6
Summary: Noise contrastive data visualization
Home-page: https://github.com/stat-ml/ncvis
Maintainer: Aleksandr Artemenkov
Maintainer-email: alartum@gmail.com
License: MIT
Project-URL: Source Code, https://github.com/stat-ml/ncvis
Description: [![Conda](https://anaconda.org/alartum/ncvis/badges/version.svg)](https://anaconda.org/alartum/ncvis)
        [![PyPI](https://img.shields.io/pypi/v/ncvis.svg)](https://pypi.python.org/pypi/ncvis/)
        [![GitHub](https://img.shields.io/github/license/alartum/ncvis.svg)](https://github.com/alartum/ncvis/blob/master/LICENSE)
        [![Build Status](https://dev.azure.com/stat-ml/ncvis/_apis/build/status/stat-ml.ncvis?branchName=master)](https://dev.azure.com/stat-ml/ncvis/_build/latest?definitionId=1&branchName=master)
        
        # ncvis
        
        **NCVis** is an efficient solution for data visualization and dimensionality reduction. It uses [HNSW](https://github.com/nmslib/hnswlib) to quickly construct the nearest neighbors graph and a parallel (batched) approach to build its embedding. Efficient random sampling is achieved via [PCGRandom](https://github.com/imneme/pcg-cpp). Detailed application examples can be found [here](https://github.com/alartum/ncvis-examples).
        
        # Why NCVis?
        
        ## It is Fast
        
        We use preprocessed samples from the [News Headlines Of India dataset](https://www.kaggle.com/therohk/india-headlines-news-dataset) to perform the comparison. Test cases are generated by taking the first 1000, 2 · 1000, ..., 2¹⁰ · 1000 samples from the dataset. Given the same amount of time, **NCVis** processes more than twice as many samples as other methods, visualizing **10⁶** points in only **6** minutes (12 × Intel® Core™ i7-8700K CPU @ 3.70GHz, 64 GB RAM).
        
        <p align="center">
          <img width="400" alt="Speed Comparison" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/time_all.png?raw=true">
        </p>
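        
        The test-case scheme described above (prefixes of doubling size, each timed separately) can be sketched as follows. This is not the benchmark script used for the plot: the helper names `prefix_sizes`, `time_fit_transform` and `benchmark` are hypothetical, and the only assumption about the model is that it exposes `fit_transform`.
        
        ```python
        import time
        
        
        def prefix_sizes(base=1000, doublings=10):
            """Return [base, 2 * base, ..., 2**doublings * base]."""
            return [base * 2**k for k in range(doublings + 1)]
        
        
        def time_fit_transform(model, X):
            """Run model.fit_transform(X); return (embedding, elapsed seconds)."""
            start = time.perf_counter()
            Y = model.fit_transform(X)
            return Y, time.perf_counter() - start
        
        
        def benchmark(make_model, X, sizes):
            """Time a fresh model on each prefix X[:n].
        
            Stops at the first size that exceeds the number of samples in X.
            """
            results = {}
            for n in sizes:
                if n > len(X):
                    break
                _, seconds = time_fit_transform(make_model(), X[:n])
                results[n] = seconds
            return results
        ```
        
        With the package installed, `benchmark(ncvis.NCVis, X, prefix_sizes())` would time **NCVis** on each prefix of `X`. The largest size produced by `prefix_sizes()` is 1024000, which matches the roughly 10⁶ points quoted above.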
        
        ## It is Efficient
        
        One can define efficiency as the ratio of the time to execute the task on a single processor to the time on multiple processors. Ideally, the efficiency should be equal to the number of threads. **NCVis** does not achieve this limit but significantly outperforms other methods. We used 10000 samples from the [News Headlines Of India dataset](https://www.kaggle.com/therohk/india-headlines-news-dataset).
        
        <p align="center">
          <img width="400" alt="Efficiency Comparison" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/efficiency.png?raw=true">
        </p>
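        
        The ratio defined above can be computed directly from two wall-clock measurements. The sketch below is illustrative only: the function name `efficiency` is hypothetical, and the timings in the usage line are made-up numbers, not measurements from the benchmark.
        
        ```python
        def efficiency(t_single, t_multi):
            """Ratio of single-processor time to multi-processor time.
        
            With n threads the ideal value is n; real runs fall below it.
            """
            if t_single <= 0 or t_multi <= 0:
                raise ValueError("timings must be positive")
            return t_single / t_multi
        
        
        # Hypothetical timings: 12.0 s on one thread, 2.0 s on several threads.
        print(efficiency(12.0, 2.0))  # 6.0
        ```
        
        On a 12-thread machine, a value of 6.0 would mean the run reaches half of the ideal limit.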
        
        ## It is Predictable
        
        It is important that the proposed method behaves predictably on simple datasets. We used the [Optical Recognition of Handwritten Digits Data Set](https://archive.ics.uci.edu/ml/datasets/optical+recognition+of+handwritten+digits), which comprises 5620 preprocessed handwritten digits and thus has a simple structure that a visualization is expected to reveal. **NCVis** behaves consistently with classical methods like t-SNE while producing the visualization up to an order of magnitude faster.
        
        | t-SNE (29.5s)   |   FIt-SNE (17.4s) |
        :-------------------------:|:-------------------------:
        <img width="300" alt="t-SNE" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/t-SNE.png?raw=true"> | <img width="300" alt="FIt-SNE" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/FIt-SNE.png?raw=true">
        
        | Multicore t-SNE (14.3s) |  LargeVis (9.7s)|
        :-------------------------:|:-------------------------:
        <img width="300" alt="Multicore t-SNE" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/Multicore%20t-SNE.png?raw=true"> | <img width="300" alt="LargeVis" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/LargeVis.png?raw=true">
        
        | Umap (7.5s)  |  NCVis (0.9s)|
        :-------------------------:|:-------------------------:
        <img width="300" alt="Umap" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/Umap.png?raw=true"> | <img width="300" alt="NCVis" src="https://github.com/stat-ml/ncvis-examples/blob/master/img/NCVis.png?raw=true">
        
        # Using
        
        ```python
        import ncvis
        
        vis = ncvis.NCVis()
        Y = vis.fit_transform(X)
        ```
        
        More detailed examples can be found [here](https://github.com/alartum/ncvis-examples).
        
        # Installation
        
        ## Conda [recommended]
        
        You do not need to set up the environment if using *conda*: all dependencies are installed automatically.
        ```bash
        $ conda install alartum::ncvis 
        ```
        
        ## Pip [not recommended]
        
        **Important**: be sure to have a compiler with *OpenMP* support. *GCC* has it by default, which is not the case with *clang*. You may need to install the *llvm-openmp* library beforehand.
        
        1. Install **numpy** and **cython** packages (compile-time dependencies):
            ```bash
            $ pip install numpy cython
            ```
        2. Install **ncvis** package:
            ```bash
            $ pip install ncvis
            ```
        
        ## From source [not recommended]
        
        **Important**: be sure to have *OpenMP* available.
        
        First of all, download the *pcg-cpp* and *hnswlib* libraries:
        ```bash
        $ make libs
        ```
        
        ### Python Wrapper
        
        If a *conda* environment is used, it replaces the library search paths. To prevent compilation errors, you need to either use the compilers provided by *conda* or switch to *pip* and the system compilers.
        
        * Conda
            ```bash
            $ conda install conda-build numpy cython scipy
            $ conda install -c conda-forge cxx-compiler c-compiler
            $ conda-develop -bc .
            ``` 
        
        * Pip
            ```bash
            $ pip install numpy cython
            $ make wrapper
            ```
        
        You can then use *pytest* to run some basic checks:
        ```bash
        $ pytest -v recipe/test.py
        ```
        
        
        ### C++ Binary
        
        * Release
            ```bash
            $ make ncvis
            ```
        
        * Debug
            ```bash
            $ make debug
            ```
        
Platform: UNKNOWN
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: C++
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3
Description-Content-Type: text/markdown
