Metadata-Version: 2.1
Name: document-embedding
Version: 0.0.1
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: numba
Requires-Dist: sentence-transformers
Requires-Dist: torch
Requires-Dist: mcerp
Requires-Dist: nltk
Provides-Extra: dev
Requires-Dist: isort; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: rope; extra == "dev"
Requires-Dist: toml; extra == "dev"
Requires-Dist: yapf; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest; extra == "test"

# Document Embeddings

The implemented methods are based on the results of https://arxiv.org/abs/2304.14796.

## Implemented Methods

- Average pooling, with an adjustable range of sentences to include.
- PERT-weighted average pooling.
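
The two pooling strategies listed above can be illustrated with a minimal NumPy sketch. This is not the package's implementation: the function names (`average_pool`, `pert_weights`, `pert_pool`), the `mode` and `lam` parameters, and the choice to evaluate a PERT density at normalized sentence positions are all assumptions made for illustration.

```python
import numpy as np


def average_pool(sentence_embeddings, start=0, end=None):
    """Mean of the sentence embeddings in rows [start, end)."""
    selected = np.asarray(sentence_embeddings, dtype=float)[start:end]
    if selected.shape[0] == 0:
        raise ValueError("the selected sentence range is empty")
    return selected.mean(axis=0)


def pert_weights(n_sentences, mode=0.5, lam=4.0):
    """Normalized PERT density on [0, 1], evaluated at sentence midpoints.

    Assumption: sentence i sits at position (i + 0.5) / n, and `mode` is the
    relative position in [0, 1] that should receive the most weight.
    """
    if n_sentences < 1:
        raise ValueError("need at least one sentence")
    # Midpoints keep every position strictly inside (0, 1).
    x = (np.arange(n_sentences) + 0.5) / n_sentences
    # Beta shape parameters of a PERT distribution with min=0 and max=1.
    alpha = 1.0 + lam * mode
    beta = 1.0 + lam * (1.0 - mode)
    density = x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0)
    return density / density.sum()


def pert_pool(sentence_embeddings, mode=0.5, lam=4.0):
    """Weighted average of sentence embeddings using PERT-shaped weights."""
    embeddings = np.asarray(sentence_embeddings, dtype=float)
    weights = pert_weights(embeddings.shape[0], mode=mode, lam=lam)
    return weights @ embeddings
```

In this sketch, `average_pool(embeddings, 0, 3)` averages only the first three sentences, while `pert_pool(embeddings, mode=0.0)` shifts the weight toward the beginning of the document.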


## Usage

Each document model wraps a sentence-embedding model, which provides the underlying embedding functionality:

```python
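from sentence_transformers import SentenceTransformer

# NOTE: the import path for AverageDocumentEmbedding is not stated in this
# README; the module name below is an assumption based on the package name.
from document_embedding import AverageDocumentEmbedding

# Sentence-level model whose embeddings are pooled per document.
sentence_model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Document-level wrapper around the sentence model.
document_model = AverageDocumentEmbedding(sentence_model, language='german')

doc1 = "Arbitrary text"
doc2 = "..."

# Encode a list of documents.
embeddings = document_model.encode([doc1, doc2])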
```
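
To make the wrapper pattern concrete, here is a rough, self-contained sketch of how a document model could delegate to a sentence model. Everything in it is hypothetical: `SketchDocumentEmbedding` and `ToySentenceModel` are illustrative names, the regex splitter is a naive stand-in for a real sentence tokenizer, and the toy encoder only mimics the shape of an `encode` call that returns a NumPy array. It does not reflect how `AverageDocumentEmbedding` is actually implemented.

```python
import re

import numpy as np


class ToySentenceModel:
    """Stand-in encoder; only mimics an encode(list_of_sentences) call."""

    def encode(self, sentences):
        # Two toy features per sentence: character count and word count.
        return np.array([[len(s), len(s.split())] for s in sentences], dtype=float)


class SketchDocumentEmbedding:
    """Hypothetical wrapper: split into sentences, encode, then mean-pool."""

    def __init__(self, sentence_model):
        self.sentence_model = sentence_model

    def _split(self, document):
        # Naive splitter: break after '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", document.strip())
        return [p for p in parts if p]

    def encode(self, documents):
        vectors = []
        for document in documents:
            # Fall back to the raw text if no sentence was found.
            sentences = self._split(document) or [document]
            vectors.append(self.sentence_model.encode(sentences).mean(axis=0))
        # One row per document.
        return np.vstack(vectors)


model = SketchDocumentEmbedding(ToySentenceModel())
vectors = model.encode(["Hi there. Bye.", "One two three."])
```

The design point is that the wrapper only relies on the sentence model exposing `encode`, so any encoder with a compatible method could be swapped in.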
