|
1 |
| -# NLP TTS System in Hindi |
2 | 1 |
|
3 |
| -This is a concatenative text-to-speech engine implemented in Python. |
4 |
| - |
5 |
| -I started this project in January 2015 because I was intrigued by text-to-speech software and wanted to learn more about how it worked. |
6 |
| - |
7 |
| -A **concatenative** text-to-speech engine creates an audio representation of text by pasting together a bunch of small audio files to form the whole of the output. |
| 2 | +* **Text-to-words**, where raw input text is tokenized into a list of words. This also generally includes converting numerical digits into their word equivalents (e.g., turning "5" into "five"). |
| 3 | +* **Words-to-phonemes**, where the list of words is converted into phonemes. Phonemes are the individual sounds in a language. Hindi has a large phonetic inventory, and the pronunciation of a single letter can change the pronunciation of the whole word. The system already maps each Hindi phonetic sound to its letter, so whenever a letter is detected, the system maps it to its audio file and returns that file's number. The output is a list of numbers, each corresponding to one of the 44 Hindi phonemes. |
| 4 | +* **Phonemes-to-sounds**, where each phoneme is paired with an audio file. This is the point where the actual audio is stitched together. It would also be in this step that the correct voice for the audio is selected, assuming multiple voices are supported. |
8 | 5 |
|
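The first two steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `DIGIT_WORDS` table, and the `PHONEME_IDS` numbering are all hypothetical, and the per-letter lookup ignores vowel signs and conjuncts that a real Hindi front end would need to handle.

```python
import re

# Hypothetical digit-to-word table; the real project may handle numbers differently.
DIGIT_WORDS = {
    "0": "शून्य", "1": "एक", "2": "दो", "3": "तीन", "4": "चार",
    "5": "पाँच", "6": "छह", "7": "सात", "8": "आठ", "9": "नौ",
}

# Hypothetical letter-to-phoneme-number table covering only a few letters.
PHONEME_IDS = {"क": 1, "म": 2, "ल": 3, "न": 4, "अ": 5}

# A token is either one ASCII digit or a run of Devanagari letters and signs.
# The ranges skip the danda punctuation marks and the Devanagari digits.
TOKEN_PATTERN = re.compile(r"[0-9]|[\u0900-\u0963\u0970-\u097F]+")


def text_to_words(text):
    """Tokenize text into words, spelling out each digit as a word.

    Simplification: multi-digit numbers are read digit by digit.
    """
    tokens = TOKEN_PATTERN.findall(text)
    return [DIGIT_WORDS.get(token, token) for token in tokens]


def words_to_phonemes(words):
    """Map each known letter to its phoneme number, skipping unknown characters."""
    phonemes = []
    for word in words:
        for letter in word:
            if letter in PHONEME_IDS:
                phonemes.append(PHONEME_IDS[letter])
    return phonemes


if __name__ == "__main__":
    words = text_to_words("कमल 5")
    print(words)
    print(words_to_phonemes(words))
```

An explicit Devanagari character range is used here instead of `\w` because Python's `re` module does not treat Devanagari vowel signs as word characters, so `\w+` would split words at every vowel sign.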
9 |
| -There are three steps, including: |
| 6 | +## Future Work |
| 7 | +Right now the system has only one voice, which belongs to me and my project buddy [@Raj2503][1]. We would love to add a few more voices. |
| 8 | +Also, the phoneme database is limited to 44 audio files (at the time of writing), covering the major and most frequently used Hindi letters. |
| 9 | +So we need to build a larger phonetic voice database. |
10 | 10 |
|
11 |
| -* **Text-to-words**, where raw input text is converted into an array of words. This also generally includes converting numerical digits into their word equivalents (ex: turn "5" into "five"). |
12 |
| -* **Words-to-phonemes**, where the array of words is converted into phonemes. Phonemes are the individual sounds in a language. As English is not a phonetic language, the pronunciation of a word can vary drastically from its pronunciation. This problem is solved by looking up the pronunciation for a word in a CSV file. The output is an array of numbers that each correspond to one of the 44 English phonemes. |
13 |
| -* **Phonemes-to-sounds**, where each phoneme is paired with an audio file. This is the point where the actual audio is stitched together. It would also be in this step that the correct voice for the audio is selected, assuming multiple voices are supported. |
| 11 | +Please contribute if you can. It will really help us and help this project grow. |
14 | 12 |
|
15 | 13 | ## Dependencies
|
16 |
| -This project relies on Python 3.x. |
| 14 | +This project relies on: |
| 15 | +* Python 3.x |
| 16 | +* re (for tokenization) |
| 17 | +* wave and os (for stitching together the audio files) |
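To illustrate how `wave` and `os` can be combined to stitch audio files together, here is a minimal sketch. It assumes each phoneme is stored as `<number>.wav` in a single directory and that every clip shares the same channel count, sample width, and frame rate. The function name `stitch_phonemes`, the file naming, and the directory layout are assumptions made for this example, not the project's confirmed structure.

```python
import os
import wave


def stitch_phonemes(phoneme_ids, sounds_dir, out_path="output.wav"):
    """Concatenate one WAV clip per phoneme number into a single WAV file."""
    if not phoneme_ids:
        raise ValueError("phoneme_ids must not be empty")

    audio_format = None  # (channels, sample width, frame rate) of the first clip
    frames = []
    for phoneme_id in phoneme_ids:
        # Assumed layout: sounds_dir/1.wav, sounds_dir/2.wav, ...
        path = os.path.join(sounds_dir, f"{phoneme_id}.wav")
        with wave.open(path, "rb") as clip:
            clip_format = clip.getparams()[:3]
            if audio_format is None:
                audio_format = clip_format
            elif clip_format != audio_format:
                raise ValueError(f"{path} has a different audio format")
            frames.append(clip.readframes(clip.getnframes()))

    # Write all collected frames back to back using the shared format.
    with wave.open(out_path, "wb") as out:
        out.setnchannels(audio_format[0])
        out.setsampwidth(audio_format[1])
        out.setframerate(audio_format[2])
        out.writeframes(b"".join(frames))
    return out_path
```

Checking that every clip has the same format before concatenating matters because a WAV file carries a single header: frames recorded at a different rate or width would play back distorted.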
17 | 18 |
|
18 | 19 | ## Installation
|
19 | 20 | Follow the steps below to try the speech synthesizer out.
|
20 | 21 |
|
21 | 22 | 1. Make sure that all dependencies are installed.
|
22 |
| -2. Clone or download this repository to your local machine. |
23 |
| -3. Open a terminal and navigate to the cloned directory. |
24 |
| -4. Run: `pip install -r requirements.txt` |
25 |
| -5. Run the command `python3 synthme.py` |
26 |
| -6. You will be prompted for a message. Enter what you want the engine to say! |
27 |
| -7. The program will end. Open output.wav to hear the result. |
| 23 | +2. Open a terminal and navigate to the cloned directory. |
| 24 | +3. Run: `pip install -r requirements.txt` |
| 25 | +4. Run the command `python3 SpeechSynthesis.py` |
| 26 | +5. You will be prompted to enter a message. Enter what you want the system to say for you! |
| 27 | +6. The program will write the output to a .wav file and exit. Open `output.wav` to hear the result. |
28 | 28 |
|
| 29 | +[1]:https://github.com/Raj2503 |