Metadata-Version: 2.1
Name: conv_html_to_markdown
Version: 0.1.0
Summary: Convert HTML to Markdown using regex and BeautifulSoup4, and filter out redundant content with Jina Embeddings.
Author-Email: Daethyra <109057945+Daethyra@users.noreply.github.com>
License: MIT
Requires-Python: >=3.10
Requires-Dist: beautifulsoup4>=4.12.2
Requires-Dist: markdownify>=0.11.6
Requires-Dist: transformers>=4.36.2
Requires-Dist: torch>=2.1.2
Description-Content-Type: text/markdown

# Convert and Format HTML to Markdown

## Purpose

This module converts HTML to Markdown and formats a dataset of HTML
content into structured Markdown. It also processes text embeddings to
identify and remove redundant content.
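
The embedding-based filtering step can be illustrated with a minimal sketch. It is not the module's actual implementation: the function names `cosine_similarity` and `drop_redundant`, the `0.95` threshold, and the two-dimensional toy vectors are all assumptions made for illustration. In practice the vectors would come from an embedding model rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def drop_redundant(chunks, embeddings, threshold=0.95):
    """Keep a chunk only if it is not too similar to any chunk already kept.

    `threshold` is a hypothetical cutoff chosen for this example.
    """
    kept_chunks = []
    kept_vectors = []
    for chunk, vec in zip(chunks, embeddings):
        if all(cosine_similarity(vec, k) < threshold for k in kept_vectors):
            kept_chunks.append(chunk)
            kept_vectors.append(vec)
    return kept_chunks


# Toy data: hand-written vectors stand in for real model embeddings.
chunks = [
    "# Title",
    "Subscribe to our newsletter",
    "Subscribe to our newsletter!",
]
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.01, 0.99]]

result = drop_redundant(chunks, toy_embeddings)
print(result)
```

The third chunk is dropped because its vector points in almost the same direction as the second one. The greedy, order-dependent pass keeps the first occurrence of each near-duplicate, which is simple but compares every chunk against every kept chunk.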

## Installation & Setup
* No API keys required