🎓 ElaMath – Visual & Vocal Math Assistant

ElaMath is an intelligent, voice-enabled multimodal math assistant designed to help students solve and understand math problems. Simply speak your math question and optionally upload an image—such as a diagram, handwritten notes, or textbook screenshot—and ElaMath will provide clear, step-by-step explanations both in text and spoken audio

📸 Demo

🧠 AI Models Used

Multimodal LLM: meta-llama/llama-4-scout-17b-16e-instruct via Groq API
Speech Recognition: Whisper (OpenAI)
Text-to-Speech: gTTS & ElevenLabs

🛠️ Tech Stack

⚡ Groq API – High-speed inference for LLaMA models
🧠 Meta LLaMA-4 Vision Model – Multimodal math question answering
🎙️ Whisper – Converts voice to text
🔊 gTTS & ElevenLabs – Generate natural audio output
🧪 Gradio – Fast web interface for interaction

🔍 Key Features

🎙️ Voice Input: Speak your math question naturally
🖼️ Image Analysis: Upload diagrams, equations, or handwritten problems
🧠 Multimodal Reasoning: Combines voice + image to give intelligent, relevant answers
💬 Text + Voice Output: See and hear the explanation clearly
🌐 Web Interface: Lightweight, user-friendly frontend powered by Gradio

📁 Project Structure

ElaMath/ ├── gradio_app.py # Main Gradio interface ├── brain_of_the_elaMath.py # Handles multimodal LLM image + text analysis ├── voice_of_the_user.py # Speech-to-text logic using Whisper ├── voice_of_the_math_instructor.py # Text-to-speech logic using ElevenLabs/gTTS ├── .env # API keys stored securely here

⚙️ Setup Instructions

1. Clone the Repository

git clone https://github.com/iamafridi/elaMath.git cd elaMath

2. Add Environment Variables

Create a .env file and add the following:

GROQ_API_KEY=your_groq_api_key ELEVENLABS_API_KEY=your_elevenlabs_api_key

Run the Application python gradio_app.py

🧪 Example Use Case

Upload an image of a geometry problem and ask:

🗣️ “What’s the area of this triangle?”

📢 ElaMath will analyze the image and your voice, then respond with a step-by-step explanation in both text and audio.

📜 License

MIT License

👤 Author

Afridi Akbar Ifty

GitHub: @iamafridi

Portfolio: iamafrididev.netlify.app

LinkedIn: linkedin.com/in/iamafridi

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
__pycache__		__pycache__
.gitignore		.gitignore
Pipfile		Pipfile
Pipfile.lock		Pipfile.lock
README.md		README.md
brain_of_the_elaMath.py		brain_of_the_elaMath.py
elevenlabs_testing.mp3		elevenlabs_testing.mp3
elevenlabs_testing.wav		elevenlabs_testing.wav
elevenlabs_testing_autoplay.mp3		elevenlabs_testing_autoplay.mp3
final.mp3		final.mp3
final.wav		final.wav
gradio_app.py		gradio_app.py
gtts_testing.mp3		gtts_testing.mp3
gtts_testing.wav		gtts_testing.wav
gtts_testing_autoplay.mp3		gtts_testing_autoplay.mp3
requirement.txt		requirement.txt
test1.png		test1.png
user_voice_test_for_user.mp3		user_voice_test_for_user.mp3
voice_of_the_math_instructor.py		voice_of_the_math_instructor.py
voice_of_the_user.py		voice_of_the_user.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🎓 ElaMath – Visual & Vocal Math Assistant

📸 Demo

🧠 AI Models Used

🛠️ Tech Stack

🔍 Key Features

📁 Project Structure

⚙️ Setup Instructions

1. Clone the Repository

2. Add Environment Variables

🧪 Example Use Case

📜 License

👤 Author

About

Uh oh!

Releases

Packages

Uh oh!

Languages

iamafridi/elaMath

Folders and files

Latest commit

History

Repository files navigation

🎓 ElaMath – Visual & Vocal Math Assistant

📸 Demo

🧠 AI Models Used

🛠️ Tech Stack

🔍 Key Features

📁 Project Structure

⚙️ Setup Instructions

1. Clone the Repository

2. Add Environment Variables

🧪 Example Use Case

📜 License

👤 Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages