Metadata-Version: 2.1
Name: missing_data_navkiran
Version: 1.0.0
Summary: Handle Missing Data By Either Dropping Rows/Columns, Forward/Backward Filling or Imputing with Mean, Median or Mode
Home-page: https://github.com/navkiran/Missing_data_navkiran.git
Author: Navkiran Singh
Author-email: nsingh2_be17@thapar.edu
License: UNKNOWN
Description: # Library for Handling Missing Data
        
        ```
        PROJECT 3, UCS633 - Data Analysis and Visualization
        Navkiran Singh  
        COE17
        Roll number: 101703365
        ```
        
        Takes two inputs - filename of input csv, intended filename of output csv. 
        
        Two optional arguments - which must be provided together or left out.
        
        
        Resulting csv is saved with the name you provide. 
        
        
        ## Installation
        `pip install missing_data_navkiran`
        
        *Recommended - test in a virtual environment.* 
        
        ## Use via command line
        
        Defaults are drop NaN with parameter along = 0 (drops rows containing NaN)
         
        `missing_data_navkiran_cli in.csv out.csv`
        
        Drop rows with NaN
        `missing_data_navkiran_cli in.csv out.csv DROP 0` 
        
        Drop columns with NaN
        `missing_data_navkiran_cli in.csv out.csv DROP 1`
        
        Forward filling
        `missing_data_navkiran_cli in.csv out.csv FILL 0`
        
        Backward filling
        `missing_data_navkiran_cli in.csv out.csv FILL 1`
        
        Imputing with mean
        `missing_data_navkiran_cli in.csv out.csv IMPUTE 0`
        
        Imputing with median
        `missing_data_navkiran_cli in.csv out.csv IMPUTE 1`
        
        Imputing with mode
        `missing_data_navkiran_cli in.csv out.csv IMPUTE 2`
        
        First argument after outcli is the input csv filename from which the dataset is extracted. The second argument is for storing the final dataset after processing.
        
        ## Use in .py script
        ```
        from missing_data_navkiran import dropval,filler,impute
        input_df = pd.read_csv('in.csv')
        output_df = dropval(input_df,along=0)
        # Or
        output_df = filler(input_df,1)
        # Or
        output_df = impute(input_df,1)
        ```
        
        There are also stand alone functions to fill numerical data and fill categorical data.
        
        ```
        from missing_data_navkiran import fill_numerical,fill_categorical
        fill_numerical(input_df,list_of_numerical_columns)
        fill_categorical(input_df,list_of_categorical_columns)
        ```
        
Keywords: command-line,Missing-Data
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
