Effortless dictation powered by Gemini. Turn rambling voice recordings into perfectly transcribed and polished notes.
AudioWrite is a client-side web application that leverages Google's Gemini AI to transform your voice recordings into accurate transcriptions and then refines them into well-structured, polished notes.
- 🗣️ Voice Recording & Dictation: Record audio directly in your browser.
- 🧠 AI-Powered Transcription: Fast and accurate speech-to-text using Gemini.
- 📝 AI-Powered Note Polishing: Gemini refines raw transcriptions into clean, Markdown-formatted notes.
- 🌐 Multi-language Output: Select the output language for polished notes.
- ✨ Customizable Polishing Prompts: Guide the AI with specific instructions.
- 📋 Copy to Clipboard: Easily copy raw or polished notes.
- 👁️ Live Audio Waveform: Visual feedback during recording.
- 🎯 Focus Mode: Minimalist UI to help you concentrate.
- 💾 Local Storage: Notes are saved persistently in your browser.
- 🔑 Session-Based API Key: Your API key is kept in the browser's session storage and cleared when you close the tab or window.
- 🎨 Dark & Light Themes: Switch to your preferred mode.
- 🗂️ Note Archive: Manage, load, re-polish, and delete notes.
- 📱 Responsive Design: Works on desktop and mobile.
- Progressive Web App (PWA): Installable for an app-like experience, with offline asset caching.
- Frontend: HTML5, CSS3, TypeScript
- AI: Google Gemini API (`@google/genai`)
- Markdown Rendering: `marked`
- PWA: `vite-plugin-pwa` for Service Worker generation and manifest handling
- Storage: Browser Local Storage (notes) & Session Storage (API key)
- Build Tool: Vite
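As an illustration of the Local Storage side of that split, here is a minimal sketch of how notes might be persisted as JSON. The `Note` shape, the `audiowrite.notes` key, and the function names are assumptions made for this example, not taken from the AudioWrite source. The in-memory class only stands in for `window.localStorage` so the sketch also runs outside a browser.

```typescript
// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage (for Node or tests).
class InMemoryStorage implements StorageLike {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

// Hypothetical note shape; the real app may store different fields.
interface Note {
  id: string;
  title: string;
  rawTranscription: string;
  polishedNote: string;
  updatedAt: number;
}

// Hypothetical storage key.
const NOTES_KEY = "audiowrite.notes";

// Read the archive, falling back to an empty list if nothing usable is stored.
function loadNotes(storage: StorageLike): Note[] {
  const json = storage.getItem(NOTES_KEY);
  if (json === null) return [];
  try {
    const parsed: unknown = JSON.parse(json);
    return Array.isArray(parsed) ? (parsed as Note[]) : [];
  } catch {
    return []; // corrupted data: start from an empty archive
  }
}

// Insert or update a note, keeping the most recently saved note first.
function saveNote(storage: StorageLike, note: Note): Note[] {
  const others = loadNotes(storage).filter((n) => n.id !== note.id);
  const notes = [note, ...others];
  storage.setItem(NOTES_KEY, JSON.stringify(notes));
  return notes;
}

// Remove a note by id and persist the remaining archive.
function deleteNote(storage: StorageLike, id: string): Note[] {
  const notes = loadNotes(storage).filter((n) => n.id !== id);
  storage.setItem(NOTES_KEY, JSON.stringify(notes));
  return notes;
}
```

In a browser, passing `window.localStorage` instead of `new InMemoryStorage()` would make the archive survive page reloads.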
- A modern web browser (e.g., Chrome, Firefox, Safari, Edge).
- Your own Google Gemini API Key. You can obtain one from Google AI Studio.
To run a local copy:
- Clone the repository and enter its directory: `git clone https://github.com/hoomanick/AudioWrite.git`, then `cd AudioWrite`.
- Install dependencies: `npm install`
- Run the development server: `npm run dev`. The app is then usually available in your browser at `http://localhost:5173` (or a similar port).
- Set your API key:
  - Click the Settings icon (🔑) in the app.
  - Enter your Gemini API Key and click "Save & Apply Key".
  - This key is required for all AI features. It is stored in `sessionStorage` and cleared when you close the browser tab or window.
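The session-scoped key handling described above could look roughly like the sketch below. The storage key name and the `saveApiKey` / `loadApiKey` helpers are hypothetical and not taken from the AudioWrite source. The `MemoryStore` class is a stand-in for `window.sessionStorage` so the example also runs outside a browser.

```typescript
// Minimal subset of the Web Storage API; window.sessionStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in so the sketch also runs in Node.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
  removeItem(key: string): void {
    this.items.delete(key);
  }
}

// Hypothetical storage key; the real app may use a different name.
const API_KEY_STORAGE_KEY = "audiowrite.geminiApiKey";

// Store a trimmed key; a blank input clears any stored key and returns false.
function saveApiKey(store: KeyValueStore, rawKey: string): boolean {
  const key = rawKey.trim();
  if (key.length === 0) {
    store.removeItem(API_KEY_STORAGE_KEY);
    return false;
  }
  store.setItem(API_KEY_STORAGE_KEY, key);
  return true;
}

// Return the stored key, or null if none has been saved this session.
function loadApiKey(store: KeyValueStore): string | null {
  return store.getItem(API_KEY_STORAGE_KEY);
}

// In the browser, pass window.sessionStorage instead of a MemoryStore.
const sessionStore: KeyValueStore = new MemoryStore();
saveApiKey(sessionStore, "  my-gemini-key  ");
console.log(loadApiKey(sessionStore)); // prints "my-gemini-key"
```

Because `sessionStorage` is scoped to a single tab, a key saved this way has to be re-entered in each new tab or browser session.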
- Set API Key: If not already done, provide your Gemini API Key via Settings (🔑).
- Record: Click the microphone button (🎙️). Grant microphone permission if prompted.
- Speak: Dictate your content.
- Stop: Click the stop button (⏹️).
- Review & Edit:
- The Polished Note is shown by default. Use the "Copy Polished" button to copy its content.
- Switch to Raw Transcription using the tabs. Use the "Copy Raw" button for its content.
- Edit the note title, raw transcription, or polished content directly. Changes save automatically.
- Customize (Optional):
- Output Language: Select your desired language for the polished note.
- Custom Prompt (✨): Provide specific instructions to the AI for note polishing (e.g., "Summarize in 3 bullet points," "Adopt a formal tone").
- Manage Notes:
- New Note (📄): Creates a blank note.
- Archive (🗄️): View, load, re-polish, or delete saved notes.
- Theme (☀️/🌙): Toggle light/dark mode.
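To make the Customize options more concrete, the sketch below shows one way the output language and an optional custom prompt could be combined into the instructions sent for polishing. The wording of the prompt, the `PolishOptions` type, and the `buildPolishPrompt` function are assumptions for illustration only. The prompt AudioWrite actually sends may differ.

```typescript
// Hypothetical options mirroring the Output Language and Custom Prompt settings.
interface PolishOptions {
  outputLanguage: string; // e.g. "English"
  customInstructions?: string; // optional text from the Custom Prompt field
}

// Assemble a single polishing prompt from the raw transcription and options.
function buildPolishPrompt(rawTranscription: string, options: PolishOptions): string {
  const lines = [
    "Rewrite the raw transcription below as a clean, well-structured Markdown note.",
    `Write the note in ${options.outputLanguage}.`,
  ];
  const extra = options.customInstructions?.trim();
  if (extra) {
    // Only mention custom instructions when the user supplied some.
    lines.push(`Additional instructions: ${extra}`);
  }
  lines.push("", "Raw transcription:", rawTranscription.trim());
  return lines.join("\n");
}

const examplePrompt = buildPolishPrompt("um so the meeting is uh moved to friday", {
  outputLanguage: "English",
  customInstructions: "Summarize in 3 bullet points",
});
console.log(examplePrompt);
```

Keeping the prompt assembly in a pure function like this makes it easy to re-polish an archived note with different settings, since only the options change.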
- Installable: On supported devices, install AudioWrite for an app-like experience.
- Offline Access: The app shell and previously saved notes (from local storage) are accessible offline. Core app assets are cached by the service worker generated by `vite-plugin-pwa`.
- Note: AI features require an active internet connection and a valid API key for the session.
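For readers curious how the installable and offline behavior is typically wired up, here is a minimal `vite.config.ts` sketch using `vite-plugin-pwa`. The manifest values and glob patterns are placeholder assumptions, not AudioWrite's actual configuration.

```typescript
// vite.config.ts (illustrative sketch; values are assumptions)
import { defineConfig } from "vite";
import { VitePWA } from "vite-plugin-pwa";

export default defineConfig({
  plugins: [
    VitePWA({
      // Let the generated service worker update itself when a new build is deployed.
      registerType: "autoUpdate",
      // Web app manifest fields used by browsers when installing the app.
      manifest: {
        name: "AudioWrite",
        short_name: "AudioWrite",
        display: "standalone",
        theme_color: "#ffffff", // placeholder color
      },
      // Precache the built app shell so it loads without a network connection.
      workbox: {
        globPatterns: ["**/*.{js,css,html,ico,png,svg}"],
      },
    }),
  ],
});
```

With a setup along these lines, only static assets are cached. Requests to the Gemini API still need a network connection, which matches the note above.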
AudioWrite is built using Vite and deployed as a static website to GitHub Pages.
Contributions are welcome! Please feel free to fork the project, create a feature branch, commit your changes, and open a Pull Request.
This project is licensed under the Apache License 2.0. See the SPDX-License-Identifier in `index.tsx` or the Apache License 2.0 text.
- Created by Hooman Nick.
- Powered by the Google Gemini API.
- Uses Marked.js for Markdown rendering and Font Awesome for icons.
- Built with Vite and `vite-plugin-pwa`.