Metadata-Version: 2.1
Name: BashaFusion
Version: 0.0.2
Summary: A package to convert Indic-language text to IAST and IAST back to Indic scripts
Home-page: https://github.com/dankarthik25/BashaFusion
Author: Dan Karthik
Author-email: dankarthik25@gmail.com
License: MIT
Project-URL: Documentation, https://dankarthik25.github.io/BashaFusion
Project-URL: Source, https://github.com/dankarthik25/BashaFusion
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.9
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest >=7.0 ; extra == 'dev'
Requires-Dist: twine >=4.0.2 ; extra == 'dev'

# BashaFusion
A package to convert Indic-language text to `IAST` and `IAST` back to Indic scripts.

### Installation
```bash
pip install BashaFusion  # still in development
```




### Usage
```python
import sqlite3
import os
import sys

from BashaFusion import BashaFusion

# Create a BashaFusion object with the default settings
bshf = BashaFusion()

# Optional: customize the token database and table names
# bshf = BashaFusion(db_path='iast-token.db', table_name_alpha='IndianAlphabet', table_name_barakadi='Barakhadi')
```

### Converting Indic Languages (Hindi, Telugu, Kannada, Malayalam, Odia, Bengali–Assamese, Gujarati, Tamil) to IAST
**`InProgress`** Research and analysis are ongoing for the Tamil, Nastaliq, and Sinhala scripts.

```python
bshf.to_iast('''ଧୃତରାଷ୍ଟ୍ର ଉଵାଚ |\tধৃতরাষ্ট্র উবাচ |\tધૃતરાષ્ટ્ર ઉવાચ |\tத்றுதராஷ்ட்ர உவாச |''')
# >>> 
# dhr̥tarāṣṭra uvāca |	dhr̥tarāṣṭra ubāca |	dhr̥tarāṣṭra uvāca |	ta்ṟutarāṣa்ṭa்ra uvāca |
```
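As a rough illustration of what script-to-IAST conversion involves, the sketch below handles a handful of Devanagari characters. It is not BashaFusion's implementation: the function name `sketch_devanagari_to_iast` and the tiny lookup tables are assumptions made for this example, whereas the package appears to load its mappings from a database (see `db_path` above).

```python
# Illustrative sketch only; not BashaFusion's code.
# The tables cover just the characters needed for the sample phrase.
CONSONANTS = {"ध": "dh", "त": "t", "र": "r", "ष": "ṣ", "ट": "ṭ", "व": "v", "च": "c"}
VOWEL_SIGNS = {"ृ": "r̥", "ा": "ā"}  # dependent vowel signs
INDEPENDENT_VOWELS = {"उ": "u"}
VIRAMA = "्"  # suppresses the inherent 'a' of a consonant


def sketch_devanagari_to_iast(text):
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                # An explicit vowel sign replaces the inherent 'a'.
                out.append(VOWEL_SIGNS[nxt])
                i += 1
            elif nxt == VIRAMA:
                # Virama: bare consonant, no vowel.
                i += 1
            else:
                out.append("a")
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            # Pass through spaces, punctuation, and unmapped characters.
            out.append(ch)
        i += 1
    return "".join(out)


print(sketch_devanagari_to_iast("धृतराष्ट्र उवाच"))  # dhr̥tarāṣṭra uvāca
```

The key point is the inherent vowel: a consonant contributes `a` unless a vowel sign or a virama follows it.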


### Convert `IAST` to Indic Language
Currently this can convert `IAST` to Kannada, Hindi, Telugu, and Malayalam.

```python
word = 'kaṁ  itāḥ kiṁ  yuyutsavaḥ kl̥̄ kl̥ pāṇḍavānīkaṁ itāḥ kiṁ āṁ  īṁ   yuyutsuṁ  kiṁ rānsakhīṁstathā'
print(IAST.iast2tokens(word))
# >>> ['k', 'a', 'ṁ', '  ', 'i', 't', 'ā', 'ḥ', ' ', 'k', 'i', 'ṁ', '  ', 'y', 'u', 'y', 'u', 't', 's', 'a', 'v', 'aḥ', ' ', 'k', 'l̥̄', ' ', 'k', 'l̥', ' ', 'p', 'ā', 'ṇ', 'ḍ', 'a', 'v', 'ā', 'n', 'ī', 'k', 'a', 'ṁ', ' ', 'i', 't', 'ā', 'ḥ', ' ', 'k', 'i', 'ṁ', ' ', 'ā', 'ṁ', '  ', 'ī', 'ṁ', '   ', 'y', 'u', 'y', 'u', 't', 's', 'u', 'ṁ', '  ', 'k', 'i', 'ṁ', ' ', 'r', 'ā', 'n', 's', 'a', 'kh', 'ī', 'ṁ', 's', 't', 'a', 'th', 'ā']
indic_lang = 'Telugu'  # or 'Kannada', 'Odia', 'Gujarati', 'Bengali–Assamese'
# indic_lang='Devanagari'
# indic_lang='Kannada'
# indic_lang='Telugu'
# indic_lang='Odia'
# indic_lang='Bengali–Assamese'
# indic_lang='Tamil' # In development state

print(bshf.iast2indic(word, indic_lang))
# >>> కం  ఇతాః కిం  యుయుత్సవః  పాణ్డవానీకం ఇతాః కిం ఆం  ఈం  కిం యుయుత్సుం రాన్సఖీంస్తథా

# The call below needs `dict_tokene_list` and `halant`, which this snippet does not define.
# IAST.dict_tokens2indic(dict_tokene_list, halant)
```
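The token list above suggests that tokenization keeps multi-character units such as `kh` and `th` together and treats each whitespace run as one token. A minimal sketch of that idea follows; `sketch_tokenize` and the `ASPIRATES` inventory are hypothetical names for this example, and the real `iast2tokens` may apply different rules (for instance, the output above also contains an `aḥ` token).

```python
import unicodedata

# Hypothetical inventory: aspirated consonants written as two letters in IAST.
ASPIRATES = tuple(
    unicodedata.normalize("NFC", unit)
    for unit in ("kh", "gh", "ch", "jh", "ṭh", "ḍh", "th", "dh", "ph", "bh")
)


def sketch_tokenize(text):
    # Normalize so that letters like ā and ṁ are single code points.
    text = unicodedata.normalize("NFC", text)
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            # A whole run of whitespace becomes one token.
            j = i
            while j < len(text) and text[j].isspace():
                j += 1
        elif text.startswith(ASPIRATES, i):
            j = i + 2
        else:
            j = i + 1
        # Keep trailing combining marks with their base letter (e.g. l̥̄).
        while j < len(text) and unicodedata.combining(text[j]):
            j += 1
        tokens.append(text[i:j])
        i = j
    return tokens


print(sketch_tokenize("khaṁ  tathā"))
```

Matching the two-letter units before single letters is what prevents `th` from being split into `t` and `h`.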


### Phonetic Hash for Phonetic Search 
```python
search_word = 'dhr̥tarāṣṭra uvāca'
search_word = search_word.strip().lower()



# Convert to IAST first. For text that is already IAST this is effectively idempotent:
# no information is lost as long as ':' is not present.
search_iast = bshf.to_iast(search_word)
print(search_iast)
# >>> dhr̥tarāṣṭra uvāca

print('# Original Text:', search_word)
print('BASIC HASHING:', IAST.basic_hash(search_iast))
print('NORMAL HASHING:', IAST.normal_hash(search_iast))

# >>> # Original Text: dhr̥tarāṣṭra uvāca
# >>> BASIC HASHING: drtrstr vc
# >>> NORMAL HASHING: drtarastra uvaca

```
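The expected outputs above are consistent with a hash that removes diacritics and aspiration (normal) and additionally drops vowels (basic). The sketch below reproduces those two outputs for this one phrase using only the standard library. It is an assumption about the general idea, not the package's `basic_hash` / `normal_hash` code, and the `sketch_*` names are made up for this example.

```python
import re
import unicodedata


def sketch_normal_hash(iast_text):
    # Decompose, then drop combining marks: ā -> a, ṣ -> s, r̥ -> r.
    decomposed = unicodedata.normalize("NFD", iast_text.lower())
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop the 'h' of aspirated consonants: dh -> d, kh -> k.
    return re.sub(r"(?<=[bcdgjkpt])h", "", plain)


def sketch_basic_hash(iast_text):
    # Keep only the consonant skeleton of the normal hash.
    return re.sub(r"[aeiou]", "", sketch_normal_hash(iast_text))


phrase = "dhr̥tarāṣṭra uvāca"
print(sketch_normal_hash(phrase))  # drtarastra uvaca
print(sketch_basic_hash(phrase))   # drtrstr vc
```

With hashes like these, a query typed without diacritics or aspiration, such as `dhrtarastra uvaca`, yields the same normal hash as the fully marked text, which is what makes hash-based phonetic search possible.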




### Contribute
**`InProgress`** Research and analysis are ongoing for the Tamil, Nastaliq (Urdu, Arabic), and Sinhala (Sri Lanka) scripts. <br>
**`Not Started`** Marathi support can easily be added to this package.


### Issue
Please [open an issue](https://github.com/dankarthik25/BashaFusion/issues "open an issue") if you encounter a bug.<br>
Email: dankarthik25@gmail.com
