Metadata-Version: 2.1
Name: tokflow
Version: 1.1.0
Summary: LLM utility of streaming token realtime replacement processing
Home-page: https://github.com/riversun/TokFlow
Author: Tom Misawa
Author-email: riversun.org@gmail.com
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE

# TokFlow

[日本語](https://github.com/riversun/TokFlow/blob/main/README_ja.md)



A utility that applies string replacements on the fly to the tokens streamed from a large language model (LLM) and outputs the result piece by piece.

## How it works

Tokens arrive one after another in small pieces, as shown below.

```python
["He","llo"," ","t","h","ere","!<","N","L>m","y ","nam","e"," ","is"," tokfl","ow.","<","N","L>N","ice"," to ","me","et you."]
```

The tokens are passed through to the output as they arrive, with every `<NL>` replaced by `\n`, even when `<NL>` is split across several tokens.


![tokflow](https://github.com/riversun/TokFlow/assets/11747460/85f497bd-cf51-41d9-aaf5-ad5420f42b6a)


You can specify any string to be replaced.
Moreover, you can specify multiple replacement targets.
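
As a point of reference, the concatenated output of a stream is expected to equal what you would get by joining every token and applying the replacements to the complete string. The sketch below shows that non-streaming baseline in plain Python. It does not use TokFlow, and the second pair (`"tokflow"` to `"TokFlow"`) is a hypothetical extra target added only to illustrate multiple replacements.

```python
# Non-streaming baseline: join all tokens, then replace on the whole string.
tokens = ["He", "llo", " ", "t", "h", "ere", "!<", "N", "L>m", "y ", "nam", "e",
          " ", "is", " tokfl", "ow.", "<", "N", "L>N", "ice", " to ", "me", "et you."]

# (search target string, replacement string) pairs.
# The second pair is a hypothetical example of an additional target.
replacements = [("<NL>", "\n"), ("tokflow", "TokFlow")]

text = "".join(tokens)
for target, replacement in replacements:
    text = text.replace(target, replacement)

print(text)
```

The difference in streaming is only timing: the same text has to be produced piece by piece, before the rest of the input is known.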

## What is this library for?

I developed this library to replace special tokens on the fly while a large language model generates text sequentially, but it can also be used for other kinds of string stream processing.

# Install

```
pip install tokflow
```

# Usage

```python
import time
from tokflow import TokFlow

TOKEN_GENERATOR_MOCK = ["He", "llo", " ", "t", "h", "ere", "!<", "N", "L>m", "y ", "nam", "e", " ", "is", " tokfl", "ow.",
                  "<", "N", "L>N", "ice", " to ", "me", "et you."]

# Replace "<NL>" with "\n". "<NL>" is called the "search target string".
# Multiple replacement pairs can be specified.
tokf = TokFlow([("<NL>", "\n")])

for input_token in TOKEN_GENERATOR_MOCK:

    # Feed the tokens in one at a time.
    # While the buffered text could still turn out to be a search target string,
    # it is held back, so output_token may be an empty string for a while.
    output_token = tokf.put(input_token)
    print(f"{output_token}", end="", flush=True)

    # Wait briefly to make the sequential generation visible.
    time.sleep(0.3)


# Remember to output whatever is left in the buffer at the very end.
# It may be an empty string.
print(f"{tokf.flush()}", end="", flush=True)

```


![tokflow](https://github.com/riversun/TokFlow/assets/11747460/85f497bd-cf51-41d9-aaf5-ad5420f42b6a)
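
The put-then-flush loop above can be wrapped in a generator so that callers never have to remember the final `flush()`. The following is a minimal sketch, not part of the library: `stream_replaced` and `PassThrough` are hypothetical names, and the only assumption about the replacer is that it offers the `put(token)` and `flush()` methods shown in the usage example.

```python
from typing import Iterable, Iterator


def stream_replaced(tokens: Iterable[str], replacer) -> Iterator[str]:
    """Yield non-empty output pieces, then whatever flush() returns."""
    for token in tokens:
        piece = replacer.put(token)
        if piece:  # put() may return "" while a possible match is buffered
            yield piece
    tail = replacer.flush()
    if tail:
        yield tail


class PassThrough:
    """Hypothetical stand-in so the sketch runs without tokflow installed."""

    def put(self, token: str) -> str:
        return token

    def flush(self) -> str:
        return ""


demo = list(stream_replaced(["a", "", "b"], PassThrough()))
print(demo)
```

With the library installed, an instance such as `TokFlow([("<NL>", "\n")])` would be passed as `replacer` in place of the stand-in.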

# Generation Options

The `put` method takes an optional second argument, `opts`, as in `put(text, opts)`.

`opts` specifies the input and output formats, for example `{"in_type": "spot", "out_type": "spot"}`.

The combinations behave as follows:

| in_type  | out_type | Description                                    |
| :------- | :------- | :---------------------------------------------- |
| spot     | spot     | A mode that incrementally sends tokens to the `put` method, and outputs generated segments each time. |
| spot     | full     | A mode that incrementally sends tokens to the `put` method, but outputs the full sentence. |
| full     | spot     | A mode that sends the full sentence to the `put` method at once, but outputs generated segments each time. |
| full     | full     | A mode that sends the full sentence to the `put` method at once, and outputs the full sentence. |

Notes:
- All text must be sent to the `put` method before the `flush` method is called. This matters especially for `full` input, where the input is sent as a whole.
- If the output type (`out_type`) is `full`, the `flush` method must be called to obtain the final result.
- Keep the call pattern consistent within a mode: use the same `opts` for the `put` calls and for the final `flush`.

**Code Example**

Define the options as a dict, such as `condition = {"in_type": "full", "out_type": "full"}`, and pass `condition` to both `put` and `flush`.

```python
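from tokflow import TokFlow

tokf = TokFlow([("<NL>", "\n")])

# Use the same options for every put() call and for the final flush().
condition = {"in_type": "full", "out_type": "full"}
prev_len = 0

# get_example_texts() is not defined in this snippet; it stands for your own
# source of input strings.
for input_token_base in get_example_texts():
    output_sentence = tokf.put(input_token_base, condition)

    print(f"output_sentence:{output_sentence}")

    # With "full" output, the returned sentence should never get shorter.
    if prev_len > len(output_sentence):
        raise ValueError("Length error")

    # The search target string must not survive in the output.
    if "<NL>" in output_sentence:
        raise Exception("Failure: unconverted search target string found.")

    prev_len = len(output_sentence)

# Call flush() to obtain the final result.
output_sentence = tokf.flush(condition)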
```

# Processing

## About internal processing

Tokens are read sequentially in real time. Each new token is appended to the tokens read so far, which are held in the "token buffer". When a pre-specified string (the "search target string") appears in the token buffer, it is replaced with another string (the "replacement string").

Because tokens arrive in pieces, at any intermediate point the token buffer holds either text unrelated to the search target string or text that may be the beginning of one. If the buffer content cannot become a search target string, it is returned from the method as soon as that is determined. If the buffer content could still become a search target string, the method returns an empty string until either the search target string appears or it becomes clear that it will not.

Buffering only while a match is still possible lets most tokens pass through unchanged and without delay, and holds output back only when a replacement might be needed. This is what makes stream processing possible.
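
The buffering idea described above can be sketched in a few lines of plain Python. This is an illustrative re-implementation under simplifying assumptions, not TokFlow's actual source: `StreamReplacer` is a hypothetical name, search target strings are assumed to be non-empty, and when several targets match at the same position the first one listed wins.

```python
class StreamReplacer:
    """Hypothetical sketch of buffered streaming replacement (not TokFlow's code)."""

    def __init__(self, pairs):
        # Keep only pairs with a non-empty search target string.
        self.pairs = [(target, repl) for target, repl in pairs if target]
        self.buffer = ""  # the "token buffer"

    def put(self, token):
        self.buffer += token
        out = []
        i = 0
        while i < len(self.buffer):
            rest = self.buffer[i:]
            # Case 1: a complete search target starts here, so emit its replacement.
            hit = next(((t, r) for t, r in self.pairs if rest.startswith(t)), None)
            if hit is not None:
                out.append(hit[1])
                i += len(hit[0])
                continue
            # Case 2: the rest could still grow into a target, so keep it buffered.
            if any(t.startswith(rest) for t, _ in self.pairs):
                break
            # Case 3: no target can start at this character, so emit it as is.
            out.append(self.buffer[i])
            i += 1
        self.buffer = self.buffer[i:]
        return "".join(out)

    def flush(self):
        # Return whatever is still held back and reset the buffer.
        rest, self.buffer = self.buffer, ""
        return rest


tokens = ["He", "llo", " ", "t", "h", "ere", "!<", "N", "L>m", "y ", "nam", "e",
          " ", "is", " tokfl", "ow.", "<", "N", "L>N", "ice", " to ", "me", "et you."]

replacer = StreamReplacer([("<NL>", "\n")])
pieces = [replacer.put(token) for token in tokens]
pieces.append(replacer.flush())
print("".join(pieces))
```

In this sketch the token `"!<"` yields only `"!"`, the following `"N"` yields an empty string because `"<N"` could still become `<NL>`, and `"L>m"` completes the match and yields `"\nm"`.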

