Metadata-Version: 2.1
Name: simager
Version: 0.1.2
Summary: Simple tools for auto classification and text preprocessing
Home-page: https://pypi.org/project/simager
Author: ulwan
Author-email: ulwan.nashihun@gmail.com
License: MIT
Keywords: nlp,text-processing,machine-learning,data-scientist,text-cleaner
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: emoji (==0.6.0)
Requires-Dist: beautifulsoup4 (==4.9.3)
Requires-Dist: scikit-learn (==0.24.2)
Requires-Dist: imbalanced-learn (==0.8.1)
Requires-Dist: lightgbm (==3.3.1)
Requires-Dist: xgboost (==1.5.0)
Requires-Dist: catboost (==1.0.3)
Requires-Dist: matplotlib (==3.3.4)
Requires-Dist: pandas (==1.1.5)
Requires-Dist: scikit-optimize (==0.9.0)
Requires-Dist: scipy (==1.5.4)

# simager
Tools for automated machine learning and text preprocessing.
End-to-end ML research (preprocessing, modelling, hyperparameter tuning) in just a few lines of code.

## Features
- Auto Classification
- Text Preprocessing

## Installation
```
pip install simager
```
## Getting Started
- Auto Classification
```
from simager.ml import ConfigData, ConfigPreprocess, ConfigModel, AutoClassifier

config_data = ConfigData(
    target="target",
    cat_features = ["column1", "column2"],
    num_features = ["column3","column4", "column5"]
)
config_preprocess = ConfigPreprocess(
    cat_imputer="SimpleImputer",
    num_imputer="SimpleImputer",
    scaler="RobustScaler",
    encoder="OneHotEncoder"
)
config_model = ConfigModel(algoritm=[
    "DecisionTreeClassifier",
    "KNeighborsClassifier",
    "LogisticRegression",
    "SVC",
    "RandomForestClassifier",
    "AdaBoostClassifier",
    "XGBClassifier",
    "LGBMClassifier",
    "CatBoostClassifier"
])

model = AutoClassifier(
    config_data=config_data,
    config_preprocess=config_preprocess,
    config_model=config_model
)

model.fit(df)

model.hp_tuning()
```
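
The classification example above passes a `df` that is never defined. As a minimal sketch, a pandas DataFrame with matching column names could be built as follows. The synthetic values, the row count, and the binary target are assumptions made purely for illustration; the README does not state which dtypes or label formats `AutoClassifier` expects.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; the values carry no meaning.
rng = np.random.default_rng(seed=0)
n_rows = 100

df = pd.DataFrame({
    # Categorical features, matching cat_features in ConfigData
    "column1": rng.choice(["a", "b", "c"], size=n_rows),
    "column2": rng.choice(["x", "y"], size=n_rows),
    # Numerical features, matching num_features in ConfigData
    "column3": rng.normal(size=n_rows),
    "column4": rng.uniform(0, 10, size=n_rows),
    "column5": rng.integers(0, 100, size=n_rows),
    # Binary label, matching target="target" (assumed label format)
    "target": rng.integers(0, 2, size=n_rows),
})
```

In practice `df` would more likely come from a file, for example via `pd.read_csv`, with the column names in `ConfigData` adjusted to match.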

- Text Preprocessing
```
from simager.preprocess import TextPreprocess

methods = [
    "rm_hastag",
    "rm_mention",
    "rm_nonascii",
    "rm_emoticons",
    "rm_html",
    "rm_url",
    "sparate_str_numb",
    "pad_punct",
    "rm_punct",
    "rm_repeat_char",
    "rm_repeat_word",
    "rm_numb",
    "rm_whitespace",
    "normalize",
    "stopwords"
]

cleaner = TextPreprocess(methods=methods)

cleaner("your text here")
```
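
To give a rough idea of what chaining cleaning steps like the ones listed above involves, here is a standalone sketch using only the standard `re` module. It is not simager's implementation: the function names (`strip_urls`, `strip_hashtags`, `strip_mentions`, `squeeze_whitespace`, `clean`) and the regular expressions are hypothetical, and the library's actual behavior may differ.

```python
import re

# Hypothetical helpers for illustration; not part of simager.
def strip_urls(text):
    # Drop http(s) links up to the next whitespace
    return re.sub(r"https?://\S+", " ", text)

def strip_hashtags(text):
    # Drop "#word" tokens
    return re.sub(r"#\w+", " ", text)

def strip_mentions(text):
    # Drop "@word" tokens
    return re.sub(r"@\w+", " ", text)

def squeeze_whitespace(text):
    # Collapse runs of whitespace and trim both ends
    return re.sub(r"\s+", " ", text).strip()

def clean(text, steps):
    # Apply each step in order, feeding the result into the next one
    for step in steps:
        text = step(text)
    return text

sample = "Check this out @user https://example.com #nlp   great"
cleaned = clean(sample, [strip_urls, strip_hashtags, strip_mentions, squeeze_whitespace])
print(cleaned)  # Check this out great
```

Order matters in a pipeline like this: whitespace is squeezed last so that the gaps left by removed tokens are collapsed.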

A full usage example is available [here](https://github.com/ulwan/simager/tree/master/simager/example).

