Metadata-Version: 2.1
Name: gower
Version: 0.0.3
Summary: Python implementation of Gowers distance, pairwise between records in two data sets
Home-page: https://github.com/wwwjk366/gower
Author: Michael Yan
Author-email: tanbingy@gmail.com
License: MIT
Description: <!-- badges: start -->
        [![Build Status](https://travis-ci.com/wwwjk366/gower.svg?branch=master)](https://travis-ci.com/wwwjk366/gower)
        [![PyPI version](https://badge.fury.io/py/gower.svg)](https://pypi.org/project/gower/)
        <!-- badges: end -->
        
        # Introduction
        
        Gower's distance calculation in Python. Gower Distance is a distance measure that can be used to calculate distance between two entity whose attribute has a mixed of categorical and numerical values. [Gower (1971) A general coefficient of similarity and some of its properties.](https://www.jstor.org/stable/2528823?seq=1) Biometrics 27 857–874.
        
        
        # Examples
        
        ## Installation
        
        ```
        pip install gower
        ```
        
        ## Generate some data
        
        ```python
        import numpy as np
        import pandas as pd
        import gower
        
        Xd=pd.DataFrame({'age':[21,21,19, 30,21,21,19,30,None],
        'gender':['M','M','N','M','F','F','F','F',None],
        'civil_status':['MARRIED','SINGLE','SINGLE','SINGLE','MARRIED','SINGLE','WIDOW','DIVORCED',None],
        'salary':[3000.0,1200.0 ,32000.0,1800.0 ,2900.0 ,1100.0 ,10000.0,1500.0,None],
        'has_children':[1,0,1,1,1,0,0,1,None],
        'available_credit':[2200,100,22000,1100,2000,100,6000,2200,None]})
        Yd = Xd.iloc[1:3,:]
        X = np.asarray(Xd)
        Y = np.asarray(Yd)
        
        ```
        
        ## Find the distance matrix
        
        ```python
        gower.gower_matrix(X)
        ```
        
        
        
        
            array([[0.        , 0.3590238 , 0.6707398 , 0.31787416, 0.16872811,
                    0.52622986, 0.59697855, 0.47778758,        nan],
                   [0.3590238 , 0.        , 0.6964303 , 0.3138769 , 0.523629  ,
                    0.16720603, 0.45600235, 0.6539635 ,        nan],
                   [0.6707398 , 0.6964303 , 0.        , 0.6552807 , 0.6728013 ,
                    0.6969697 , 0.740428  , 0.8151941 ,        nan],
                   [0.31787416, 0.3138769 , 0.6552807 , 0.        , 0.4824794 ,
                    0.48108295, 0.74818605, 0.34332284,        nan],
                   [0.16872811, 0.523629  , 0.6728013 , 0.4824794 , 0.        ,
                    0.35750175, 0.43237334, 0.3121036 ,        nan],
                   [0.52622986, 0.16720603, 0.6969697 , 0.48108295, 0.35750175,
                    0.        , 0.2898751 , 0.4878362 ,        nan],
                   [0.59697855, 0.45600235, 0.740428  , 0.74818605, 0.43237334,
                    0.2898751 , 0.        , 0.57476616,        nan],
                   [0.47778758, 0.6539635 , 0.8151941 , 0.34332284, 0.3121036 ,
                    0.4878362 , 0.57476616, 0.        ,        nan],
                   [       nan,        nan,        nan,        nan,        nan,
                           nan,        nan,        nan,        nan]], dtype=float32)
        
        
        ## Find Top n results
        
        ```python
        gower.gower_topn(Xd.iloc[0:2,:], Xd.iloc[:,], n = 5)
        ```
        
        
        
        
            {'index': array([4, 3, 1, 7, 5]),
             'values': array([0.16872811, 0.31787416, 0.3590238 , 0.47778758, 0.52622986],
                   dtype=float32)}
        
        
        
Keywords: gower,distance,matrix
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
