Metadata-Version: 2.1
Name: EDAeasy
Version: 1.1.3
Summary: Functions and tools for making Exploratory Data Analysis easy!
License: MIT
Keywords: Exploratory Analysis,EDA
Author: Francisco Jesus Ocazionez Cardozo
Author-email: pach812@gmail.com
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: numpy (>=1.24.0,<2.0.0)
Requires-Dist: pandas (>=1.5.0)
Requires-Dist: pingouin (>=0.5.4,<0.6.0)
Requires-Dist: statsmodels (>=0.14.2,<0.15.0)
Description-Content-Type: text/markdown

# EDAeasy 😀
The package for quick exploratory data analysis


## Installation

`pip install EDAeasy`

## Usage
The **dataframe_summary** function returns a simple per-column summary of your DataFrame,
for a quick look at tabular data.

    Generate a summary DataFrame of the input DataFrame 'dataframe'.

    Parameters
    ----------
    dataframe : pandas.DataFrame
        The input DataFrame for which the summary needs to be generated.

    Returns
    -------
    pandas.DataFrame
        A DataFrame containing summary information for each column in 'dataframe':
        - Type: Data type of the column.
        - Min: Minimum value in the column.
        - Max: Maximum value in the column.
        - Nan %: Percentage of NaN values in the column.
        - # Unique Values: Total number of unique values in the column.
        - Unique values: List of unique values in the column.

    Example
    -------
    >>> import pandas as pd
    >>> data = {
            'age': ['[40-50)', '[60-70)', '[70-80)'],
            'time_in_hospital': [8, 3, 5],
            'n_lab_procedures': [72, 34, 45],
            ...
        }
    >>> dataframe = pd.DataFrame(data)
    >>> result = dataframe_summary(dataframe)
    >>> print(result)
                        Type      Min       Max  Nan %  # Unique Values  Unique values
    Variables
    age               object  [40-50)  [90-100)    0.0                3  ['[70-80)', '[50-60)', '[60-70)', '[40-50)', '[80-90)', ...
    time_in_hospital   int64        1        14    0.0                3  [8, 3, 5]
    n_lab_procedures   int64        1       113    0.0                3  [72, 34, 45]
    ...

    Note
    ----
    The function uses vectorized operations to improve performance and reduce memory usage.
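
To make the documented columns concrete, here is a minimal sketch that computes the same six fields with plain pandas. It is not the package's implementation: `summarize_columns` is a hypothetical stand-in name, it loops over columns for readability rather than using the vectorized approach mentioned in the note, and one sample value is replaced with `None` only to show a non-zero `Nan %`.

```python
import pandas as pd


def summarize_columns(dataframe: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in that mirrors the documented summary columns."""
    records = []
    for name in dataframe.columns:
        column = dataframe[name]
        values = column.dropna()  # ignore missing values for min/max/unique
        records.append(
            {
                "Type": column.dtype,
                "Min": values.min() if not values.empty else None,
                "Max": values.max() if not values.empty else None,
                "Nan %": column.isna().mean() * 100,
                "# Unique Values": column.nunique(),
                "Unique values": list(values.unique()),
            }
        )
    # One row per input column, indexed by the column name
    index = pd.Index(dataframe.columns, name="Variables")
    return pd.DataFrame(records, index=index)


# Sample data adapted from the example above; the None is an added assumption
data = {
    "age": ["[40-50)", "[60-70)", "[70-80)"],
    "time_in_hospital": [8, 3, 5],
    "n_lab_procedures": [72, 34, None],
}
summary = summarize_columns(pd.DataFrame(data))
print(summary)
```

Comparing this output with the result of `dataframe_summary` on the same data is a quick way to check how the package treats missing values and mixed types.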

