A modern Pythonic implementation of the popular Bengali phonetic-typing software Avro Phonetic.
avro.py provides a fully fledged, batteries-included text parser that can parse, reverse, and convert Roman-script (English) text into its phonetic Bengali equivalent in Unicode. At its core, it implements an extensively modified version of the Avro Phonetic Dictionary Search Library by Mehdi Hasan Khan.
Important
Update: As of October 2024, Python 3.8 has reached its end of life (EOL). To keep
this project up to date, the minimum required version is now Python 3.9. It is
strongly suggested that you migrate your project for better compatibility.
This package is inspired by Rifat Nabi's jsAvroPhonetic library and derives from Kaustav Das Modak's pyAvroPhonetic.
This package requires Python 3.9 or higher to be used inside your development environment.
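If you are unsure whether an interpreter meets this requirement, a small check such as the sketch below can be run first. It relies only on the standard library; the helper name `python_is_supported` is made up for illustration and is not part of avro.py.

```python
import sys

# Minimum version stated in this README.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = sys.version.split()[0]
    if not python_is_supported():
        raise SystemExit("Python 3.9 or higher is required, found " + current)
    print("Python version OK:", current)
```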
# Install / upgrade.
$ pip install avro.py
avnie is a newly developed CLI tool that uses avro.py under the hood. You can install it using:
# Install / upgrade avnie.
$ pip install avnie
This short tour describes how you can use avro.py back and forth to operate (cutlery!) on Bengali text. You can also check the examples directory to see these snippets in action, along with other use cases.
1. parse()
Let's assume you want to parse a single English string to Bengali, for example "ami banglay gan gai.". You can convert it like this:
# Import the package.
import avro
# Our dummy text.
dummy = 'ami banglay gan gai.'
# Parse a single string.
parsed = avro.parse(dummy)
print(parsed) # Output: আমি বাংলায় গান গাই।
1.a parse_iter()
If you have multiple strings, use parse_iter() to get a list of parsed results:
texts = ['ami banglay gan gai.', 'tumi kothay jao?']
parsed_list = avro.parse_iter(texts)
print(parsed_list) # Output: ['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?']
2. parse(bijoy=True)
Alternatively, you can parse the same string directly into Bijoy Keyboard format:
bijoy_output = avro.parse(dummy, bijoy=True)
print(bijoy_output) # Output: Avwg evsjvh় Mvb MvB।
3. to_bijoy()
To convert a single Bengali string (Avro/Unicode) to Bijoy ASCII format:
bijoy_text = avro.to_bijoy(parsed)
print(bijoy_text) # Output: Avwg evsjvh় Mvb MvB।
3.a to_bijoy_iter()
To convert multiple strings at once, use to_bijoy_iter():
bijoy_list = avro.to_bijoy_iter(['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?'])
print(bijoy_list) # Output: ['Avwg evsjvh় Mvb MvB।', 'tvmf wkrwb‡¶ jd?']
4. to_unicode()
To convert a single Bijoy ASCII string back to Unicode Bengali:
unicode_text = avro.to_unicode(bijoy_text)
print(unicode_text) # Output: আমি বাংলায় গান গাই।
4.a to_unicode_iter()
For multiple strings, use to_unicode_iter():
unicode_list = avro.to_unicode_iter(['Avwg evsjvh় Mvb MvB।', 'tvmf wkrwb‡¶ jd?'])
print(unicode_list) # Output: ['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?']
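Because to_unicode() is described as the way back from to_bijoy(), a round-trip check can serve as a cheap sanity test on your own text. The sketch below is one possible way to write it. `find_round_trip_mismatches` and the stand-in converters are hypothetical names used for illustration and are not part of avro.py; the converters are passed in as arguments so the block runs even without the package installed.

```python
from typing import Callable, Iterable, List


def find_round_trip_mismatches(
    texts: Iterable[str],
    encode: Callable[[str], str],
    decode: Callable[[str], str],
) -> List[str]:
    """Return the inputs for which decode(encode(text)) differs from text."""
    return [text for text in texts if decode(encode(text)) != text]


# Stand-in converters (string reversal) so the sketch is self-contained.
# They are NOT real Bijoy conversions.
def fake_encode(text: str) -> str:
    return text[::-1]


def fake_decode(text: str) -> str:
    return text[::-1]


print(find_round_trip_mismatches(["abc", "xyz"], fake_encode, fake_decode))  # prints []
```

With avro.py installed, the same helper could be called as `find_round_trip_mismatches(samples, avro.to_bijoy, avro.to_unicode)`, where `samples` is your own list of Unicode Bengali strings. Any string it returns did not survive the round trip unchanged and is worth inspecting by hand.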
5. reverse()
To reverse a single Unicode Bengali string back to Roman script:
reversed_text = avro.reverse(unicode_text)
print(reversed_text) # Output: ami banglay gan gai.
5.a reverse_iter()
To reverse multiple strings at once, use reverse_iter():
rev_list = avro.reverse_iter(['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?'])
print(rev_list) # Output: ['ami banglay gan gai.', 'tumi kothay jao?']
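The four single-string operations covered in this tour (parse, reverse, to_bijoy, to_unicode) all take one string and return one string, so they can be selected by name when building a small tool around them. The following is only a sketch of that idea. The `convert` helper and the stub backend are assumptions made for illustration, and the stub uses unrelated string methods so the block runs without avro.py installed.

```python
from types import SimpleNamespace

# Operation names taken from the sections of this tour.
OPERATIONS = ("parse", "reverse", "to_bijoy", "to_unicode")


def convert(text: str, operation: str, backend) -> str:
    """Look up `operation` on `backend` by name and apply it to `text`."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    return getattr(backend, operation)(text)


# Stub backend with placeholder behaviour (NOT real conversions).
stub = SimpleNamespace(
    parse=str.upper,
    reverse=str.lower,
    to_bijoy=str.title,
    to_unicode=str.swapcase,
)

print(convert("ami", "parse", stub))  # prints AMI
```

After `import avro`, passing the module itself as the backend, for example `convert('ami banglay gan gai.', 'parse', avro)`, would route the call to avro.parse().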
Since version 2024.12.5, the package now supports async/await syntax for all the functions.
Note
Unless you have a very specific use, the asynchronous functions only provide slight performance improvements and are not necessary for most use cases, so their usage is optional.
Please have a look at the examples for a more thorough understanding of how to use the package in both synchronous and asynchronous contexts.
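This README does not list the names or signatures of the package's own asynchronous functions, so the sketch below does not use them. Instead, it shows a generic way to call any synchronous converter from async code with `asyncio.to_thread` (available from Python 3.9, the minimum version for this package). `convert_concurrently` and `shout` are hypothetical names used for illustration.

```python
import asyncio
from typing import Callable, Iterable, List


async def convert_concurrently(
    texts: Iterable[str], convert: Callable[[str], str]
) -> List[str]:
    """Run a synchronous converter on each text in worker threads."""
    # asyncio.to_thread keeps the event loop free while each blocking
    # call runs in the default thread pool.
    tasks = [asyncio.to_thread(convert, text) for text in texts]
    # gather() returns the results in the same order as the inputs.
    return await asyncio.gather(*tasks)


def shout(text: str) -> str:
    """Stand-in converter so the sketch runs without avro.py installed."""
    return text.upper()


if __name__ == "__main__":
    results = asyncio.run(convert_concurrently(["ami", "tumi"], shout))
    print(results)  # prints ['AMI', 'TUMI']
```

With avro.py installed, a synchronous function such as avro.parse could be passed in place of `shout`. This mainly keeps the event loop responsive rather than making the conversion itself faster, which matches the note above about only slight performance gains.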
"Fork -> Do your changes -> Send a Pull Request, it's that easy!"
This project uses the uv package manager by Astral. To set up and update the environment automatically, you can run the following commands:
# (Optional) Install recommended Python version: (also sets up the virtual environment)
$ uv python install && uv venv
$ source .venv/bin/activate
# Install the project:
$ uv sync --all-extras --dev
# Build the project:
$ uv build --verbose
In order to run the tests, you can use the following command:
# Run unit tests:
$ uv run pytest .
If you come across any kind of bug or want to request a feature, please let us know by opening an issue here. We do need more ideas to keep the project alive and running, don't we? :P
- Mehdi Hasan Khan for originally developing and maintaining Avro Phonetic.
- Rifat Nabi for porting it to JavaScript.
- Sarim Khan for writing ibus-avro which helped to clarify my concepts further.
- Kaustav Das Modak for porting Rifat Nabi's JavaScript iteration to Python 2.
- Md Enzam Hossain for helping him understand the ins and outs of the Avro dictionary and the way it works.
Licensed under the MIT License.