Metadata-Version: 2.1
Name: wordpress-recommender
Version: 0.1.3
Summary: An application to recommend articles from a Wordpress blog based on user's query using semantic search
Author: Anurag Chatterjee
Requires-Python: >=3.9,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: beautifulsoup4 (>=4.12.2)
Requires-Dist: chromadb (>=0.4.13)
Requires-Dist: langchain (>=0.0.262)
Requires-Dist: pandas (>=2.1.0)
Requires-Dist: sentence-transformers (>=2.2.2)
Requires-Dist: tiktoken (>=0.4.0)
Requires-Dist: typer[all] (>=0.9.0)
Description-Content-Type: text/markdown

# Wordpress articles recommender app using Python
An application to recommend articles from a Wordpress blog based on user's query using semantic search

## Pre-requisites:
- Adjust the below environment variables before running any of the below steps if required:
    - `DATA_LOC`: Location where all the application data will be stored. By default, it is stored in a `data` folder in the current directory. This folder will be created if it does not exist during creation.
    - HuggingFace key (free) to use their inference endpoints to generate embeddings using Sentence Transformer models. Export the key to an environment variable called `HF_KEY` before creating or querying the index as shown in the steps below.
    - Override the default HuggingFace embeddings model by specifying the name in the environment variable `HF_EMBEDDINGS_MODEL_NAME`. If not provided, the `sentence-transformers/all-MiniLM-l6-v2` model is used to generate embeddings.

## How to use the CLI after installing the package from PyPI:
1. Download content: `wordpress-recommender download-content "https://learnwoo.com/post-sitemap1.xml"`
2. Build index using the downloaded content: `wordpress-recommender generate-index "https://learnwoo.com/post-sitemap1.xml"`
3. Query index built in previous step: `wordpress-recommender query-index "https://learnwoo.com/post-sitemap1.xml" --query "How are the AI regulations different in different parts of the world?"`

## Release notes:

### 0.1.1
- Initial release

### 0.1.2
- Allow client to override the default HuggingFace inference API embeddings model via the environment variable `HF_EMBEDDINGS_MODEL_NAME`
- Allow client to rebuild index by providing the option `--rebuild_index` in the CLI
- *Breaking change* Return final output as a sorted pandas dataframe with all original columns present along with a new column for the score e.g. `cosine_distance` or `cosine_similarity`
- Allow client to override the number of results returned by passing an optional `--top` parameter with an integer value

### 0.1.3
- Minor updates to docs
- Add tests
- Add GitHub Actions to publish package


