Metadata-Version: 2.1
Name: text-processor-nlp
Version: 0.1
Summary: Text preprocessing utilities (cleaning, normalization, tokenization) built on spaCy
Home-page: https://github.com/SiddiqueFarhan/pypi_demo
Author: Farhan Siddiqui
Author-email: Farhan.siddiqui1572@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown

# TextPreprocessor

`TextPreprocessor` is a Python package for text preprocessing. It provides methods for cleaning, normalizing, and tokenizing text data, and relies on the spaCy library for natural language processing tasks.

## Features

- **Text Cleaning:** Remove HTML tags, special characters, and extra whitespace.
- **Lowercasing:** Convert text to lowercase.
- **Tokenization:** Tokenize text into words and sentences.
- **Stop Words Removal:** Remove common stop words from tokenized text.
- **Normalization:** Lemmatize tokens to their base forms.
- **Punctuation Handling:** Remove punctuation from text.
- **Number Removal:** Remove numerical digits from text.

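The steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual API: the function names (`clean_text`, `remove_punctuation`, `remove_numbers`, `remove_stop_words`, `preprocess`) and the tiny `STOP_WORDS` set are hypothetical, and lemmatization is omitted because it needs a spaCy language model.

```python
import re
import string
from typing import List

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "and", "of", "to", "in"}


def clean_text(text: str) -> str:
    """Replace HTML tags with spaces, then collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def remove_punctuation(text: str) -> str:
    """Drop every ASCII punctuation character."""
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_numbers(text: str) -> str:
    """Drop runs of digits."""
    return re.sub(r"\d+", "", text)


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Keep only tokens that are not in the stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text: str) -> List[str]:
    """Clean, lowercase, strip punctuation and digits, tokenize, filter."""
    text = clean_text(text).lower()
    text = remove_numbers(remove_punctuation(text))
    # Whitespace splitting stands in for real tokenization here.
    return remove_stop_words(text.split())


if __name__ == "__main__":
    print(preprocess("<p>The 3 cats   are RUNNING, fast!</p>"))
```

A real pipeline would likely replace the whitespace split and the stop-word set with spaCy's tokenizer and vocabulary, which handle contractions, punctuation attached to words, and lemmas far better than these regular expressions.
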
## Installation

To use the `TextPreprocessor` package, install it together with its dependencies using pip:

```bash
pip install text-processor-nlp
