Metadata-Version: 1.1
Name: fancyimpute
Version: 0.0.13
Summary: Matrix completion and feature imputation algorithms
Home-page: https://github.com/iskandr/fancyimpute
Author: Alex Rubinsteyn and Sergey Feldman
Author-email: alex {dot} rubinsteyn {at} mssm {dot} edu
License: http://www.apache.org/licenses/LICENSE-2.0.html
Description: |Build Status| |Coverage Status| |DOI|
        
        fancyimpute
        ===========
        
        A variety of matrix completion and imputation algorithms implemented in
        Python.
        
        Usage
        -----
        
        .. code:: python
        
            from fancyimpute import BiScaler, KNN, NuclearNormMinimization, SoftImpute
        
            # X is the complete data matrix
            # X_incomplete has the same values as X except a subset have been replace with NaN
        
            # Use 3 nearest rows which have a feature to fill in each row's missing features
            X_filled_knn = KNN(k=3).complete(X_incomplete)
        
            # matrix completion using convex optimization to find low-rank solution
            # that still matches observed values. Slow!
            X_filled_nnm = NuclearNormMinimization().complete(X_incomplete)
        
            # Instead of solving the nuclear norm objective directly, instead
            # induce sparsity using singular value thresholding
            X_filled_softimpute = SoftImpute().complete(X_incomplete_normalized)
        
            # print mean squared error for the three imputation methods above
            nnm_mse = ((X_filled_nnm[missing_mask] - X[missing_mask]) ** 2).mean()
            print("Nuclear norm minimization MSE: %f" % nnm_mse)
        
            softImpute_mse = ((X_filled_softimpute[missing_mask] - X[missing_mask]) ** 2).mean()
            print("SoftImpute MSE: %f" % softImpute_mse)
        
            knn_mse = ((X_filled_knn[missing_mask] - X[missing_mask]) ** 2).mean()
            print("knnImpute MSE: %f" % knn_mse)
        
        Algorithms
        ----------
        
        -  ``SimpleFill``: Replaces missing entries with the mean or median of
           each column.
        
        -  ``KNN``: Nearest neighbor imputations which weights samples using the
           mean squared difference on features for which two rows both have
           observed data.
        
        -  ``SoftImpute``: Matrix completion by iterative soft thresholding of
           SVD decompositions. Inspired by the
           `softImpute <https://web.stanford.edu/~hastie/swData/softImpute/vignette.html>`__
           package for R, which is based on `Spectral Regularization Algorithms
           for Learning Large Incomplete
           Matrices <http://web.stanford.edu/~hastie/Papers/mazumder10a.pdf>`__
           by Mazumder et. al.
        
        -  ``IterativeSVD``: Matrix completion by iterative low-rank SVD
           decomposition. Should be similar to SVDimpute from `Missing value
           estimation methods for DNA
           microarrays <http://www.ncbi.nlm.nih.gov/pubmed/11395428>`__ by
           Troyanskaya et. al.
        
        -  ``MICE``: Reimplementation of `Multiple Imputation by Chained
           Equations <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074241/>`__.
        
        -  ``MatrixFactorization``: Direct factorization of the incomplete
           matrix into low-rank ``U`` and ``V``, with an L1 sparsity penalty on
           the elements of ``U`` and an L2 penalty on the elements of ``V``.
           Solved by gradient descent.
        
        -  ``NuclearNormMinimization``: Simple implementation of `Exact Matrix
           Completion via Convex
           Optimization <http://statweb.stanford.edu/~candes/papers/MatrixCompletion.pdf>`__
           by Emmanuel Candes and Benjamin Recht using
           `cvxpy <http://www.cvxpy.org/en/latest/>`__. Too slow for large
           matrices.
        
        -  ``BiScaler``: Iterative estimation of row/column means and standard
           deviations to get doubly normalized matrix. Not guaranteed to
           converge but works well in practice. Taken from `Matrix Completion
           and Low-Rank SVD via Fast Alternating Least
           Squares <http://arxiv.org/abs/1410.2596>`__.
        
        .. |Build Status| image:: https://travis-ci.org/hammerlab/fancyimpute.svg?branch=master
           :target: https://travis-ci.org/hammerlab/fancyimpute
        .. |Coverage Status| image:: https://coveralls.io/repos/github/hammerlab/fancyimpute/badge.svg?branch=master
           :target: https://coveralls.io/github/hammerlab/fancyimpute?branch=master
        .. |DOI| image:: https://zenodo.org/badge/18834/hammerlab/fancyimpute.svg
           :target: https://zenodo.org/badge/latestdoi/18834/hammerlab/fancyimpute
        
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
