Fluent Edge is a Flask-based web application that records live audio, transcribes it using Vosk, restores punctuation, checks grammar using LanguageTool, and calculates an accuracy score. The transcription and analysis results are displayed in real time on the frontend.
- Installation
- Project Structure
- Dependencies
- Usage
- Running the Application
- Testing
- Folder and File Descriptions
To get started with Fluent Edge, follow these steps:
- Clone this repository to your local machine:
  git clone <repository_url>
  cd Fluent_Edge
- Set up a virtual environment:
  python -m venv venv
- Activate the virtual environment:
  - On Windows:
    venv\Scripts\activate
  - On macOS/Linux:
    source venv/bin/activate
- Install the required dependencies:
  pip install -r requirements.txt
- Download the required Vosk models for transcription:
  - Vosk Model (US English)
  - Vosk Model (Indian English)
- Place these models in the following directories:
  model/US English/
  model/Indian English/
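One way to confirm that the models are in place before starting the app is a small check like the one below. This is only a sketch: the directory names come from the installation steps, while the function name `missing_model_dirs` and the idea of treating an empty directory as missing are assumptions made for illustration, not part of the project.

```python
from pathlib import Path

# Directory names taken from the installation steps; the project root
# argument and the function name are assumptions made for this sketch.
EXPECTED_MODEL_DIRS = ("model/US English", "model/Indian English")


def missing_model_dirs(project_root="."):
    """Return the expected model directories that are absent or empty."""
    root = Path(project_root)
    missing = []
    for relative in EXPECTED_MODEL_DIRS:
        path = root / relative
        # A directory with no entries means no model was unpacked into it.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(relative)
    return missing


if __name__ == "__main__":
    for name in missing_model_dirs():
        print(f"Missing or empty model directory: {name}")
```

Run from the project root, the script prints nothing when both directories contain files.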
Here is a breakdown of the directory structure:
Fluent_Edge/
│
├── __pycache__/ # Compiled Python files (auto-generated)
│
├── fluent_edge_core/ # Core logic of the application
│ ├── __init__.py # Initialization for the core module
│ ├── accuracy_checker.py # Logic for calculating accuracy of transcription
│ ├── audio_handler.py # Handles audio recording and processing
│ ├── grammar_checker.py # Grammar checking functionality using LanguageTool
│ ├── punctuation_restorer.py # Restores punctuation in transcribed text
│ ├── speech_recognizer.py # Handles speech recognition using Vosk
│ └── startup_checker.py # Verifies dependencies during startup
│
├── model/ # Vosk speech recognition models
│ ├── Indian English/
│ ├── US English/
│ └── model_description.txt # Description of the models
│
├── static/ # Static files for frontend (JS, CSS)
│ ├── script.js # JavaScript file for frontend behavior
│ └── style.css # Stylesheet for the frontend
│
├── templates/ # HTML templates for the frontend
│ └── index.html # Main HTML template for the app
│
├── Testing/ # Unit tests for the application
│ ├── accuracy_test.py # Tests for the accuracy calculation
│ ├── app_test.py # Tests for the Flask app's functionality
│ ├── audio_test.py # Tests for audio handling (e.g., Vosk integration)
│ ├── error_logging_test.py # Tests for logging errors and warnings
│ ├── grammar_test.py # Tests for grammar checking
│ ├── integration_test.py # Tests for the integration of components
│ ├── mic_test.py # Tests for microphone input handling
│ ├── punctuation_test.py # Tests for punctuation restoration
│ └── vosk_test.py # Tests for the Vosk speech recognition models
│
├── venv/ # Virtual environment
├── .gitattributes # Git configuration
├── .gitignore # Git ignore file
├── app.py # Main Flask app entry point
├── requirements.txt # List of project dependencies
└── README.md # Project documentation (this file)
The project has the following dependencies:
- Flask: A web framework used for the frontend and backend integration.
- language-tool-python: A library for grammar checking using LanguageTool.
- numpy: A numerical computation library used in audio processing.
- sounddevice: A library used for audio recording.
- vosk: A speech recognition toolkit for transcribing audio.
- pytest: A testing framework to run unit tests.
These dependencies are listed in the requirements.txt file and can be installed with:
pip install -r requirements.txt
To start the Flask web server, run the following command:
python app.py
By default, the server runs at http://127.0.0.1:5000/.
Once the app is running, click the "Start" button to begin recording audio. The app transcribes the speech, restores punctuation, and checks grammar.
The transcription, the punctuation-restored text, and any grammar errors are displayed on the page in real time.
The app calculates the accuracy of the transcription by comparing the detected grammar errors with the total number of words in the transcription.
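The description above does not give the exact formula, so the following is only a sketch of one plausible reading: the score is the share of words not flagged by the grammar check. The function name `estimate_accuracy`, the whitespace-based word count, and the clamping to the 0 to 100 range are all assumptions, not the project's confirmed implementation.

```python
def estimate_accuracy(text, error_count):
    """Return a 0-100 score from a grammar-error count and the word count.

    Assumed formula (not confirmed by the project):
        accuracy = (1 - errors / words) * 100, clamped to [0, 100].
    """
    word_count = len(text.split())
    if word_count == 0:
        return 0.0  # Nothing was transcribed, so there is nothing to score.
    # Cap the ratio so more errors than words cannot give a negative score.
    error_ratio = min(error_count / word_count, 1.0)
    return round((1.0 - error_ratio) * 100, 2)


if __name__ == "__main__":
    print(estimate_accuracy("she go to school every day", 1))
```

Under this assumed formula, a six-word transcription with one detected error scores 83.33.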
- Navigate to the project directory:
  cd Fluent_Edge
- Start the Flask application:
  python app.py
- Open a web browser and go to http://127.0.0.1:5000/ to interact with the app.
To run the unit tests for the application, use the pytest framework. The following command runs all the tests in the Testing/ folder:
pytest
You can run specific tests as well, for example:
pytest Testing/accuracy_test.py
- accuracy_test.py: Tests the accuracy calculation logic.
- app_test.py: Tests the Flask app's routes and functionality.
- audio_test.py: Tests the audio handling functionality (e.g., recording and Vosk integration).
- error_logging_test.py: Tests the logging functionality during errors.
- grammar_test.py: Tests the grammar checker functionality.
- integration_test.py: Tests the integration of all components.
- mic_test.py: Tests the microphone input handling.
- punctuation_test.py: Tests punctuation restoration.
- vosk_test.py: Tests Vosk speech recognition models.
fluent_edge_core/ contains the core logic of the application:
- accuracy_checker.py: Contains the logic for calculating transcription accuracy.
- audio_handler.py: Handles audio recording and processing.
- grammar_checker.py: Integrates with LanguageTool to check grammar.
- punctuation_restorer.py: Restores punctuation in transcribed text.
- speech_recognizer.py: Handles the Vosk speech recognition model.
- startup_checker.py: Checks if all necessary dependencies are available during startup.
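A startup dependency check of the kind described for startup_checker.py could look roughly like the sketch below. It is not the project's actual code: the function name `missing_modules` is hypothetical, and the module list is simply the import names of the runtime packages listed under the dependencies (language-tool-python is imported as `language_tool_python`).

```python
from importlib.util import find_spec

# Import names of the runtime packages from requirements.txt.
# This list and the function name are assumptions made for this sketch.
REQUIRED_MODULES = ("flask", "language_tool_python", "numpy", "sounddevice", "vosk")


def missing_modules(module_names=REQUIRED_MODULES):
    """Return the top-level modules that cannot be located in this environment."""
    # find_spec locates a module without importing it and returns None if absent.
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Because `find_spec` only locates modules, this check is cheap, but it does not prove that a package imports cleanly or that the Vosk models are present.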
model/ contains the Vosk models for the supported English variants (US English and Indian English). Make sure to download the models and place them in the correct directories.
static/ contains the JavaScript and CSS files used by the frontend.
templates/ contains the index.html file, which is the main template for the web interface.
Testing/ contains unit tests for the individual components of the project.
app.py is the entry point for the Flask web application. It runs the Flask server and integrates with the backend logic.
requirements.txt lists the project's dependencies. Install them by running pip install -r requirements.txt.
.gitattributes is a Git configuration file that specifies how Git handles certain files in the repository.
.gitignore specifies which files and directories Git should ignore. Common entries include __pycache__/ and venv/.
This project is licensed under the MIT License. Feel free to fork and customize, but credit is appreciated.