Metadata-Version: 2.1
Name: seema
Version: 0.2.1
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM

# Seema
Khmer, Lao, Myanmar, and Thai word segmentation/breaking library in Python written in Rust

# Example

```
>>> from seema import Seema
>>> Seema("words_th.txt", "thai_cluster_rules.txt")
>>> s.segment_into_strings("ไก่จิกเด็ก")
['ไก่', 'จิก', 'เด็ก']
```

words_th.txt and thai_cluster_rules.txt can be downloaded from https://codeberg.org/mekong-lang/chamkho/src/branch/main/data.

