Metadata-Version: 2.1
Name: document-embedding
Version: 0.0.1
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: numba
Requires-Dist: sentence-transformers
Requires-Dist: torch
Requires-Dist: mcerp
Requires-Dist: nltk
Provides-Extra: dev
Requires-Dist: isort ; extra == 'dev'
Requires-Dist: pytest ; extra == 'dev'
Requires-Dist: rope ; extra == 'dev'
Requires-Dist: toml ; extra == 'dev'
Requires-Dist: yapf ; extra == 'dev'
Provides-Extra: test
Requires-Dist: pytest ; extra == 'test'

# Document Embeddings

Document embedding methods based on the results of: https://arxiv.org/abs/2304.14796

## Implemented Methods

- Average pooling of sentence embeddings, with an adjustable range of sentences used.
- PERT-weighted average pooling of sentence embeddings.
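
The two pooling methods listed above can be sketched with plain NumPy. This is a minimal illustration, not the package's implementation: the function names (`average_pool`, `pert_weights`, `pert_pool`), the `start`/`end`/`mode` parameters, and the choice to evaluate the PERT density at evenly spaced sentence positions are all assumptions made for this example.

```python
import numpy as np


def average_pool(sentence_embeddings, start=0, end=None):
    """Mean of the sentence vectors in rows [start, end) (hypothetical helper)."""
    selected = np.asarray(sentence_embeddings, dtype=float)[start:end]
    if selected.shape[0] == 0:
        raise ValueError("the selected sentence range is empty")
    return selected.mean(axis=0)


def pert_weights(n_sentences, mode=0.5):
    """Normalized PERT density at evenly spaced sentence positions.

    `mode` in [0, 1] is the relative position that receives the most weight.
    Assumption: the package may derive its weights differently.
    """
    if n_sentences < 1:
        raise ValueError("need at least one sentence")
    # Midpoints of n equal bins on (0, 1); this avoids the endpoints,
    # where the density can be exactly zero.
    positions = (np.arange(n_sentences) + 0.5) / n_sentences
    # Standard PERT shape parameters on [0, 1] with lambda = 4.
    alpha = 1.0 + 4.0 * mode
    beta = 1.0 + 4.0 * (1.0 - mode)
    density = positions ** (alpha - 1.0) * (1.0 - positions) ** (beta - 1.0)
    return density / density.sum()


def pert_pool(sentence_embeddings, mode=0.5):
    """Weighted average of sentence vectors using PERT weights."""
    vectors = np.asarray(sentence_embeddings, dtype=float)
    weights = pert_weights(vectors.shape[0], mode)
    return weights @ vectors


# Three toy "sentence embeddings" of dimension 2.
toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(average_pool(toy))         # mean over all sentences
print(average_pool(toy, 0, 2))   # mean over the first two sentences only
print(pert_pool(toy, mode=0.5))  # middle sentence weighted most heavily
```

With `mode=0.5` the weights are symmetric around the middle of the document; moving `mode` toward 0 or 1 shifts the emphasis toward the beginning or the end.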


## Usage

The document models wrap a Sentence-Transformers model, which provides the underlying sentence embeddings.

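```python
from sentence_transformers import SentenceTransformer

# Assumption: the import path is inferred from the package name and may differ.
from document_embedding import AverageDocumentEmbedding

# Sentence model that supplies the sentence embeddings.
sentence_model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Wrap the sentence model in a document-level model.
document_model = AverageDocumentEmbedding(sentence_model, language='german')

doc1 = "Arbitrary text"
doc2 = "..."

# Encode a list of documents; add further documents to the list as needed.
document_model.encode([doc1, doc2])
```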
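
To make the wrapper idea concrete, here is a rough sketch of how a document model could sit on top of a sentence model: split each document into sentences, encode the sentences, and average the vectors. Everything in it is hypothetical and for illustration only: `SketchAverageDocumentEmbedding` is not the package's class, `ToySentenceModel` stands in for a real `SentenceTransformer` so the example runs without downloading a model, and the regex splitter is a naive substitute for a proper sentence tokenizer such as the one in `nltk`.

```python
import re

import numpy as np


class SketchAverageDocumentEmbedding:
    """Hypothetical stand-in, not the package's actual class."""

    def __init__(self, sentence_model):
        # Any object with an `encode(list_of_str)` method that returns
        # one vector per sentence will do.
        self.sentence_model = sentence_model

    @staticmethod
    def split_sentences(document):
        # Naive splitter: break after '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", document.strip())
        return [part for part in parts if part]

    def encode(self, documents):
        pooled = []
        for document in documents:
            sentences = self.split_sentences(document)
            if not sentences:
                raise ValueError("cannot embed an empty document")
            vectors = np.asarray(self.sentence_model.encode(sentences), dtype=float)
            # Average pooling over all sentence vectors of this document.
            pooled.append(vectors.mean(axis=0))
        return np.stack(pooled)


class ToySentenceModel:
    """Toy encoder for demonstration: [character count, word count]."""

    def encode(self, sentences):
        return np.array([[len(s), len(s.split())] for s in sentences], dtype=float)


model = SketchAverageDocumentEmbedding(ToySentenceModel())
embeddings = model.encode(["Hello world. How are you?", "One sentence only."])
print(embeddings.shape)  # one row per document
```

Because the sketch only relies on an `encode` method, a real `SentenceTransformer` instance could be passed in place of the toy model.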
