Metadata-Version: 2.1
Name: rellm
Version: 0.0.1
Summary: Get exact structure out of any language models with regular expressions.
License: MIT
Keywords: llm,ai,prompt,large language models,gpt-3,chatgpt
Author: Matt Rickard
Author-email: pypi@matt-rickard.com
Maintainer: Matt Rickard
Maintainer-email: pypi@matt-rickard.com
Requires-Python: >=3.8,<4.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Linguistic
Requires-Dist: numpy (>=1.24.3,<2.0.0)
Requires-Dist: regex (>=2023.5.5,<2024.0.0)
Requires-Dist: transformers (>=4.28.1,<5.0.0)
Project-URL: repository, https://github.com/r2d4/rellm
Description-Content-Type: text/markdown

# ReLLM
Regular Expressions for Language Model Completions.

Get exact structure out of any language model completion with regular expressions.

Return specific syntactic structure (e.g. JSON or XML), or specific semantic structure (e.g. a date or a number), or even complete templates (e.g. a sentence with a blank to fill in).

How does it work? For each token, ReLLM tests every possible completion against a partial regex. For the potential completions that do not match the pattern, ReLLM masks the logits so that the language model does not generate them.

### Installation
```
pip install rellm
```

The preliminary results are interesting -- even for small models, constraining the token space with ReLLM can improve the quality of the completions. Not to mention the ability to more easily parse the output programmatically. Take a look at some of the examples below (you can run them with [example.py](example.py))

## Examples using GPT2 (124 million parameters)

**Prompt**: Return the first three letters of the alphabet in a json array:

**Pattern** \[\"[a-z]\", \"[a-z]\", \"[a-z]\"\]

**ReLLM**: ["a", "b", "c"]

**Without ReLLM**: { "index": 0, "id":"1", "description":"", "text": "[{ "id": 0, "name":
#
**Prompt**: Fill in the sentence with an interesting story about the dentist:

**Pattern**: Today I\'m going to the [a-z]+ to [a-z]+ because ([a-z]+ )*\.

**ReLLM**: Today I'm going to the dentist to see because it is a very important day for me

**Without ReLLM**: 'My family bought me an appointment with a dentist when I was 15. The dentist gave me one a year and then I was told on
#

**Prompt**: Is this a good demo?

**Pattern**: (Yes|No)

**ReLLM**: No.

**Without ReLLM**: I don't know, but this is amazing! Even more amazing is how the design can take place on a small stage that uses LEDs.
As

#

**Prompt**: Convert the date May 4, 2023 to the format mm/dd/yyyy:

**Pattern**: [0-9]{2}/[0-9]{2}/[0-9]{4}

**ReLLM**: 00/00/0045

**Without ReLLM**:  mm:ss

A-Z, Z-A, W-H (0-9:9:19)

Z-R

#

**Prompt**: Jeff Dean is a

**Pattern** (Programmer|Computer Scientist|AGI)

**ReLLM**: Computer Scientist

**Without ReLLM**: former national basketball champion and a former professional basketball player. He currently serves as general counsel for the NCAA Office of the Vice President for Academic Affairs.

#

**Prompt**: I can eat 

**Pattern**: [0-9]{1,10} [a-z]* of [a-z]*

**ReLLM**: 800 calories of coffee

**Without ReLLM**: iced coffee here on the west side and do this, so can you?"

"Why, I don't understand. What did you mean by

#

**Prompt**: ReLLM, the best way to get structured data out of LLMs, is an acronym for

**Patern**: Re[a-z]* L[a-z]* L[a-z]* M[a-z]*

**ReLLM**: Realized Logistic Logistics Model

**Without ReLLM**: Largest Largest Address Space (MELSP), which has its roots in the  Internet network, at least when compared

