Metadata-Version: 2.1
Name: dateprepkit
Version: 0.2
Summary: DataPrepKit is a Python class for data preparation and analysis. It provides functionalities for reading various data formats, summarizing statistics, handling missing values, and encoding categorical data.
Home-page: https://github.com/ahmed-eldesoky284/dateprepkit
Author: Ahmed Eldesoky
Author-email: ahmedeldesoky284@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: pandas

# DataPrepKit

DataPrepKit is a Python utility class for simplifying common data preparation tasks such as reading data from different file formats, generating summary statistics, handling missing values, and encoding categorical data.

## Installation

You can install DataPrepKit using pip:

```
pip install dateprepkit
```

## Usage

To use DataPrepKit in your Python project, follow the steps below:

```
from data_prep_kit import DataPrepKit
import pandas as pd

# Load your data into a DataFrame
data = pd.read_csv('your_data.csv')

# Initialize DataPrepKit object
data_prep = DataPrepKit(data)
```
### 1. Reading Data

Use the `read_data` method to load data from file formats such as CSV, Excel, or JSON:

```
# Read data from a CSV file
data = data_prep.read_data('your_data.csv', format='csv')
```
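
For readers curious what format-based loading can look like, here is a minimal sketch in plain pandas. It is not DataPrepKit's implementation: the helper `read_any` and the `_READERS` table are hypothetical names chosen for illustration, and only the `'csv'` format string comes from the example above.

```python
import io
import pandas as pd

# Hypothetical lookup table: maps a format name to the matching pandas reader.
_READERS = {
    "csv": pd.read_csv,
    "excel": pd.read_excel,  # requires an Excel engine such as openpyxl at call time
    "json": pd.read_json,
}

def read_any(source, format="csv"):
    """Load `source` (a path or file-like object) with the reader for `format`."""
    try:
        reader = _READERS[format]
    except KeyError:
        raise ValueError(f"Unsupported format: {format!r}") from None
    return reader(source)

# In-memory CSV so the sketch runs without a file on disk.
csv_text = "name,score\nalice,90\nbob,85\n"
df = read_any(io.StringIO(csv_text), format="csv")
print(df)
```

Keeping the readers in a dictionary means an unsupported format fails with a clear `ValueError` instead of falling through silently.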

### 2. Generating Data Summary

Generate summary statistics for the loaded data using the `data_summary` method:

```
# Generate summary statistics
summary = data_prep.data_summary()
print(summary)
```
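
The exact contents of the summary are not documented here. As a point of comparison, the sketch below shows the kind of statistics plain pandas produces. The sample DataFrame is made up for illustration, and this is an assumption about what a summary might contain, not a description of `data_summary` itself.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "city": ["Cairo", "Giza", "Cairo", "Luxor"],
})

# describe() covers numeric columns by default; include="all" adds
# count/unique/top/freq for non-numeric columns as well.
numeric_summary = df.describe()
full_summary = df.describe(include="all")

# Count of missing values per column, a common companion to describe().
missing_per_column = df.isna().sum()

print(numeric_summary)
print(missing_per_column)
```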

### 3. Handling Missing Values

Use the `handle_missing_values` method to either remove or impute missing values in the DataFrame:

```
# Handle missing values by removing rows with missing values
cleaned_data = data_prep.handle_missing_values(strategy='remove')
```
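
Only the `'remove'` strategy appears in the example above; the name of the imputation strategy is not given here. To illustrate the difference between the two approaches, the following sketch uses plain pandas on made-up data. Mean imputation is an assumption chosen for the example, not necessarily what DataPrepKit does.

```python
import numpy as np
import pandas as pd

# Made-up sample data with one gap in each column.
df = pd.DataFrame({
    "height": [170.0, np.nan, 180.0],
    "weight": [65.0, 72.0, np.nan],
})

# "Remove": drop every row that has at least one missing value.
removed = df.dropna()

# "Impute": fill each gap with the mean of its own column.
imputed = df.fillna(df.mean(numeric_only=True))

print(removed)
print(imputed)
```

Removing rows is simple but discards data (two of the three rows here), while imputing keeps every row at the cost of introducing estimated values.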

### 4. Encoding Categorical Data

Encode categorical columns in the DataFrame using one-hot encoding with the `encode_categorical_data` method:

```
# Encode categorical columns
encoded_data = data_prep.encode_categorical_data(categorical_columns=['category'])
```
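
One-hot encoding replaces a categorical column with one indicator column per distinct value. The sketch below demonstrates the technique with `pandas.get_dummies` on made-up data; whether `encode_categorical_data` uses this function internally is an assumption.

```python
import pandas as pd

# Made-up sample data; "category" is the column to encode.
df = pd.DataFrame({
    "category": ["a", "b", "a"],
    "value": [1, 2, 3],
})

# Each distinct value of "category" becomes its own indicator column,
# named "<column>_<value>"; untouched columns are kept as they are.
encoded = pd.get_dummies(df, columns=["category"])

print(encoded.columns.tolist())
```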

## Example

Here's a complete example of how to use DataPrepKit:
```
from data_prep_kit import DataPrepKit
import pandas as pd

# Load your data into a DataFrame
data = pd.read_csv('your_data.csv')

# Initialize DataPrepKit object
data_prep = DataPrepKit(data)

# Read data from a CSV file
data = data_prep.read_data('your_data.csv', format='csv')

# Generate summary statistics
summary = data_prep.data_summary()
print(summary)

# Handle missing values by removing rows with missing values
cleaned_data = data_prep.handle_missing_values(strategy='remove')

# Encode categorical columns
encoded_data = data_prep.encode_categorical_data(categorical_columns=['category'])
```
