Metadata-Version: 2.1
Name: vllm-haystack
Version: 0.1.1
Summary: A simple adapter to use vLLM in your Haystack pipelines. 
Project-URL: Documentation, https://github.com/LLukas22/vLLM-haystack-adapter#readme
Project-URL: Issues, https://github.com/LLukas22/vLLM-haystack-adapter/issues
Project-URL: Source, https://github.com/LLukas22/vLLM-haystack-adapter
Author-email: Lukas Kreussel <65088241+LLukas22@users.noreply.github.com>
License-Expression: MIT
License-File: LICENSE
Keywords: AI,Haystack,LLM,NLP
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.7
Requires-Dist: farm-haystack>=1.20.0
Requires-Dist: requests>=2.31.0
Requires-Dist: sseclient>=0.0.27
Requires-Dist: tokenizers>=0.13.3
Description-Content-Type: text/markdown

# vLLM-haystack-adapter
[![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)

Simply use [vLLM](https://github.com/vllm-project/vllm) in your Haystack pipelines to utilize fast, self-hosted LLMs.

<p align="center">
    <img alt="vLLM" src="https://raw.githubusercontent.com/vllm-project/vllm/main/docs/source/assets/logos/vllm-logo-text-light.png" width="45%" style="vertical-align: middle;">
    <a href="https://www.deepset.ai/haystack/">
        <img src="https://raw.githubusercontent.com/deepset-ai/haystack/main/docs/img/haystack_logo_colored.png" alt="Haystack" width="45%" style="vertical-align: middle;">
    </a>
</p>

## Installation
Install the wrapper via pip:  `pip install vllm-haystack`

## Usage
This integration provides two invocation layers:
- `vLLMInvocationLayer`: To use models hosted on a vLLM server (or any other OpenAI-compatible server)
- `vLLMLocalInvocationLayer`: To use locally hosted vLLM models

### Use a Model Hosted on a vLLM Server
To use a model hosted on a vLLM server, use the `vLLMInvocationLayer`.

Here is a simple example of how a `PromptNode` can be created with the wrapper.
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMInvocationLayer

# URL of your OpenAI-compatible vLLM server, e.g. "http://localhost:8000/v1"
API = "http://localhost:8000/v1"

model = PromptModel(
    model_name_or_path="",  # Left empty; the model name is inferred from the server
    invocation_layer_class=vLLMInvocationLayer,
    max_length=256,
    api_key="EMPTY",  # Placeholder; a vLLM server typically does not require a real key
    model_kwargs={
        "api_base": API,  # Replace with your API URL
        "maximum_context_length": 2048,
    },
)

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
The model name is inferred from the model served on the vLLM server, so `model_name_or_path` can be left empty.
For more configuration examples, take a look at the unit tests.
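
Once configured, the `PromptNode` can be called directly with a prompt. A minimal usage sketch (the prompt text is purely illustrative):
```python
results = prompt_node("What is the capital of Germany?")
print(results[0])
```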

#### Hosting a vLLM Server

To create an *OpenAI-Compatible Server* via vLLM you can follow the steps in the 
Quickstart section of their [documentation](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server).
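
At the time of writing, such a server can be started with vLLM's OpenAI-compatible entrypoint; the model name below is only a placeholder, so check the linked quickstart for the exact command and options for your vLLM version:
```bash
python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
```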

### Use a Model Hosted Locally
⚠️ To run `vLLM` locally, you need to have `vllm` installed and a supported GPU.

If you don't want to use an API server, this wrapper also provides a `vLLMLocalInvocationLayer`, which runs vLLM on the same machine Haystack is running on.

Here is a simple example of how a `PromptNode` can be created with the `vLLMLocalInvocationLayer`.
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMLocalInvocationLayer

# Hugging Face model ID (or local path) of a model supported by vLLM; replace as needed
MODEL = "facebook/opt-125m"

model = PromptModel(
    model_name_or_path=MODEL,
    invocation_layer_class=vLLMLocalInvocationLayer,
    max_length=256,
    model_kwargs={
        "maximum_context_length": 2048,
    },
)

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```


