Metadata-Version: 2.1
Name: khnlp
Version: 0.0.3
Summary: Khmer NLP Toolkits
Home-page: https://github.com/IDRI-LAB/Khmer-NLP-Tools/
Author: LEANG Sotheara
Author-email: leangsotheara@gmail.com
License: MIT
Keywords: ,,
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.6
Description-Content-Type: text/markdown

## khnlp: Khmer NLP Toolkits

khnlp is a library for **advanced Khmer Natural Language Processing** in Python.
It has been developed since mid-2021 by a research team at the Cambodia Academy of Digital Technology ([CADT](http://cadt.edu.kh/)). It builds on recent research and was designed from day one to be used in real products.
 
### Features
* [Word Segmentation](https://github.com/IDRI-LAB/Khmer-NLP-Tools/tree/main/khnlp/khnlp/segment) is the process of dividing a string of written language into its component words.
Example: "សួស្តីអ្នកទាំងអស់គ្នា" => "សួស្តី | អ្នក | ទាំងអស់ | គ្នា"
* [Name Romanization](https://github.com/IDRI-LAB/Khmer-NLP-Tools/tree/main/khnlp/khnlp/romanize) is the process of converting Khmer script into the Roman (Latin) alphabet.
Example:  "សួស្តី" => "suo sdei"
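
To make the word segmentation task concrete, here is a minimal sketch of a greedy longest-match segmenter. It is not khnlp's algorithm or API: the function `segment_longest_match` and the four-word toy lexicon are hypothetical and exist only for illustration. With that lexicon it reproduces the segmentation example above.

```python
from typing import List, Set


def segment_longest_match(text: str, lexicon: Set[str]) -> List[str]:
    """Greedy left-to-right longest-match segmentation (illustrative only)."""
    max_len = max((len(word) for word in lexicon), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a lexicon hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                break
        else:
            # No lexicon entry starts here: emit one code point as-is.
            candidate = text[i]
        tokens.append(candidate)
        i += len(candidate)
    return tokens


# Hypothetical toy lexicon holding only the words from the example above.
words = ["សួស្តី", "អ្នក", "ទាំងអស់", "គ្នា"]
lexicon = set(words)
text = "".join(words)  # the unsegmented example sentence

print(" | ".join(segment_longest_match(text, lexicon)))
```

A pure dictionary lookup like this cannot resolve ambiguous boundaries or handle words outside the lexicon, which is why practical Khmer segmenters typically rely on trained models.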

### Installation
System requirements:
* Operating system: macOS / OS X, Linux, Windows
* Python version: 3.6+
* Package managers: pip

**pip**

`pip install khnlp==0.0.3`
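
After installing, one way to confirm which version is present is to query the package metadata from Python. This is a small sketch using the standard-library `importlib.metadata` module, which assumes Python 3.8 or newer; the helper name `installed_version` is hypothetical and not part of khnlp.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed khnlp version, or None if khnlp is not installed.
print(installed_version("khnlp"))
```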

### Inference
* [Word Segmentation](https://github.com/IDRI-LAB/Khmer-NLP-Tools/tree/main/khnlp/inference/segment)
* [Name Romanization](https://github.com/IDRI-LAB/Khmer-NLP-Tools/tree/main/khnlp/inference/romanize)

### Contributors
* Vichet Chea
* [Sotheara Leang](mailto:leangsotheara@gmail.com)



