AudioWrite: Effortless voice dictation powered by Google's Gemini API. Record, transcribe, and transform rambling audio into polished, multi-language notes. PWA ready.

hoomanick/AudioWrite

AudioWrite ✨

Effortless dictation powered by Gemini. Turn rambling voice recordings into perfectly transcribed and polished notes.

AudioWrite is a client-side web application that uses Google's Gemini API to transcribe your voice recordings and then refine the transcriptions into well-structured, polished notes.

🚀 Live Demo

Try AudioWrite Live!

🌟 Key Features

  • 🗣️ Voice Recording & Dictation: Record audio directly in your browser.
  • 🧠 AI-Powered Transcription: Fast and accurate speech-to-text using Gemini.
  • 📝 AI-Powered Note Polishing: Gemini refines raw transcriptions into clean, Markdown-formatted notes.
  • 🌐 Multi-language Output: Select the output language for polished notes.
  • Customizable Polishing Prompts: Guide the AI with specific instructions.
  • 📋 Copy to Clipboard: Easily copy raw or polished notes.
  • 👁️ Live Audio Waveform: Visual feedback during recording.
  • 🎯 Focus Mode: Minimalist UI to help you concentrate.
  • 💾 Local Storage: Notes are saved persistently in your browser.
  • 🔑 Session-Based API Key: Your API key is kept in sessionStorage and cleared when you close the browser tab or window.
  • 🎨 Dark & Light Themes: Switch to your preferred mode.
  • 🗂️ Note Archive: Manage, load, re-polish, and delete notes.
  • 📱 Responsive Design: Works on desktop and mobile.
  • Progressive Web App (PWA): Installable for an app-like experience with offline asset caching.
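
The README does not say how recorded audio reaches Gemini. As a rough sketch only, one common approach is to base64-encode the recorded bytes and wrap them in an inline-data part. The helper names `bytesToBase64` and `toInlineAudioPart`, the `audio/webm` MIME type, and the assumption that the app sends audio inline at all are illustrative, not taken from the project.

```typescript
// Hypothetical helpers; the real app may encode and send audio differently.
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode raw bytes as standard base64 (with "=" padding).
function bytesToBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasB1 = i + 1 < bytes.length;
    const hasB2 = i + 2 < bytes.length;
    const b0 = bytes[i] as number;
    const b1 = hasB1 ? (bytes[i + 1] as number) : 0;
    const b2 = hasB2 ? (bytes[i + 2] as number) : 0;
    out += BASE64_ALPHABET.charAt(b0 >> 2);
    out += BASE64_ALPHABET.charAt(((b0 & 3) << 4) | (b1 >> 4));
    out += hasB1 ? BASE64_ALPHABET.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += hasB2 ? BASE64_ALPHABET.charAt(b2 & 63) : "=";
  }
  return out;
}

// Assumed request-part shape: base64 data plus its MIME type.
interface InlineAudioPart {
  inlineData: { mimeType: string; data: string };
}

function toInlineAudioPart(bytes: Uint8Array, mimeType: string): InlineAudioPart {
  return { inlineData: { mimeType, data: bytesToBase64(bytes) } };
}
```

In a browser, the bytes would typically come from the recorded Blob (for example via `new Uint8Array(await blob.arrayBuffer())`), and the resulting part would be sent alongside a text instruction asking for a transcription.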

🛠️ Core Technologies

  • Frontend: HTML5, CSS3, TypeScript
  • AI: Google Gemini API (@google/genai)
  • Markdown Rendering: marked
  • PWA: vite-plugin-pwa for Service Worker generation and manifest handling.
  • Storage: Browser Local Storage (notes) & Session Storage (API key)
  • Build Tool: Vite
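
To illustrate the storage split above, here is a minimal sketch of a note archive persisted as JSON in Local Storage. The `Note` fields, the `audiowrite.notes` key, and all function names are assumptions made for this example; the project's actual data model may differ. The store is passed in as a parameter so that `window.localStorage` or an in-memory stand-in can be used interchangeably.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical note shape; the real app may store different fields.
interface Note {
  id: string;
  title: string;
  rawTranscription: string;
  polishedNote: string;
  updatedAt: number;
}

const NOTES_KEY = "audiowrite.notes"; // assumed key name

function loadNotes(store: KeyValueStore): Note[] {
  const raw = store.getItem(NOTES_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as Note[]) : [];
  } catch {
    return []; // corrupted data: fall back to an empty archive
  }
}

// Insert a new note, or replace the existing note with the same id.
function saveNote(store: KeyValueStore, note: Note): void {
  const others = loadNotes(store).filter((n) => n.id !== note.id);
  store.setItem(NOTES_KEY, JSON.stringify([...others, note]));
}

function deleteNote(store: KeyValueStore, id: string): void {
  const remaining = loadNotes(store).filter((n) => n.id !== id);
  store.setItem(NOTES_KEY, JSON.stringify(remaining));
}

// In-memory stand-in for localStorage, handy outside a browser.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In the browser, `saveNote(window.localStorage, note)` would persist across sessions, which matches the behavior described for saved notes.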

⚙️ Getting Started

Prerequisites

  • A modern web browser (e.g., Chrome, Firefox, Safari, Edge).
  • Your own Google Gemini API Key. You can obtain one from Google AI Studio.

Running Locally

To run a local copy of the app:

  1. Clone the repository:

    git clone https://github.com/hoomanick/AudioWrite.git
    cd AudioWrite

  2. Install dependencies:

    npm install

  3. Run the development server:

    npm run dev

    This starts the Vite dev server, usually at http://localhost:5173 (or a similar port if that one is in use). Open the printed URL in your browser.

  4. Set Your API Key:

    • Click the Settings icon (🔑) in the app.
    • Enter your Gemini API Key and click "Save & Apply Key".
    • This key is required for all AI features. It's stored in sessionStorage (cleared when you close the browser tab/window).
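
The session-scoped key handling described in step 4 could look roughly like the sketch below. The slot name `geminiApiKey` and the function names are hypothetical, and the app's real implementation may differ. In the browser you would pass `window.sessionStorage`, which the browser clears when the tab or window closes.

```typescript
// Minimal subset of the Web Storage API; window.sessionStorage satisfies it.
interface SessionStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const API_KEY_SLOT = "geminiApiKey"; // assumed slot name

// Store the key for this session; reject blank input.
function saveApiKey(store: SessionStore, key: string): boolean {
  const trimmed = key.trim();
  if (trimmed === "") return false;
  store.setItem(API_KEY_SLOT, trimmed);
  return true;
}

// Returns null when no key has been set in this session.
function getApiKey(store: SessionStore): string | null {
  return store.getItem(API_KEY_SLOT);
}

function clearApiKey(store: SessionStore): void {
  store.removeItem(API_KEY_SLOT);
}

// In-memory stand-in for sessionStorage, handy outside a browser.
function createFakeSessionStore(): SessionStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

AI features would then check `getApiKey(...)` before each request and prompt for a key when it returns `null`.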

📖 How to Use

  1. Set API Key: If not already done, provide your Gemini API Key via Settings (🔑).
  2. Record: Click the microphone button (🎙️). Grant microphone permission if prompted.
  3. Speak: Dictate your content.
  4. Stop: Click the stop button (⏹️).
  5. Review & Edit:
    • The Polished Note is shown by default. Use the "Copy Polished" button to copy its content.
    • Switch to Raw Transcription using the tabs. Use the "Copy Raw" button for its content.
    • Edit the note title, raw transcription, or polished content directly. Changes save automatically.
  6. Customize (Optional):
    • Output Language: Select your desired language for the polished note.
    • Custom Prompt (✨): Provide specific instructions to the AI for note polishing (e.g., "Summarize in 3 bullet points," "Adopt a formal tone").
  7. Manage Notes:
    • New Note (📄): Creates a blank note.
    • Archive (🗄️): View, load, re-polish, or delete saved notes.
    • Theme (☀️/🌙): Toggle light/dark mode.
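
To make the Customize step more concrete, the sketch below shows one way the output language and an optional custom prompt might be combined into a single polishing instruction. The prompt wording, the `PolishOptions` type, and `buildPolishPrompt` are invented for illustration; the prompts AudioWrite actually sends are not documented here.

```typescript
// Hypothetical options mirroring the two settings in the Customize step.
interface PolishOptions {
  outputLanguage: string;
  customPrompt?: string;
}

// Build a plain-text instruction for the polishing request.
function buildPolishPrompt(rawTranscription: string, options: PolishOptions): string {
  const lines: string[] = [
    "Rewrite the raw transcription below as a clean, well-structured Markdown note.",
    `Write the note in ${options.outputLanguage}.`,
  ];
  const custom = options.customPrompt?.trim();
  if (custom) {
    // Only add the extra line when the user actually provided instructions.
    lines.push(`Additional instructions: ${custom}`);
  }
  lines.push("", "Raw transcription:", rawTranscription);
  return lines.join("\n");
}
```

For example, `buildPolishPrompt(text, { outputLanguage: "French", customPrompt: "Summarize in 3 bullet points" })` yields an instruction that asks for a French note and appends the user's guidance.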

🌍 PWA (Progressive Web App)

  • Installable: On supported devices, install AudioWrite for an app-like experience.
  • Offline Access: The app shell and previously saved notes (from local storage) are accessible offline. Core app assets are cached by the service worker generated by vite-plugin-pwa.
  • Note: AI features require an active internet connection and a valid API key for the session.
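
For orientation, a vite-plugin-pwa setup along these lines would produce the service worker and manifest described above. This is a sketch under assumptions, not the project's actual `vite.config.ts`: the manifest values, the `autoUpdate` choice, and the glob pattern are illustrative, and a real installable manifest would also need icons.

```typescript
// vite.config.ts (illustrative sketch, not the project's real config)
import { defineConfig } from "vite";
import { VitePWA } from "vite-plugin-pwa";

export default defineConfig({
  plugins: [
    VitePWA({
      // Assumed: let the generated service worker update itself automatically.
      registerType: "autoUpdate",
      manifest: {
        name: "AudioWrite",
        short_name: "AudioWrite",
        display: "standalone",
      },
      workbox: {
        // Assumed: precache the built app shell so it loads offline.
        globPatterns: ["**/*.{js,css,html,svg,png,ico}"],
      },
    }),
  ],
});
```

Precaching only covers static assets; requests to the Gemini API still need a network connection, which is why AI features are unavailable offline.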

🚀 Deployment

AudioWrite is built using Vite and deployed as a static website to GitHub Pages.
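
When a Vite app is served from a GitHub Pages project site, asset URLs usually need the repository name as a base path. The fragment below is a hedged sketch: the `/AudioWrite/` base assumes the site is served under that path, which this README does not confirm, and the setting would normally sit alongside the project's other Vite options.

```typescript
// vite.config.ts fragment (illustrative; the base path is an assumption)
import { defineConfig } from "vite";

export default defineConfig({
  // Project sites on GitHub Pages are typically served from /<repo-name>/.
  base: "/AudioWrite/",
});
```

Running `npm run build` then emits the static site (into `dist` by default), which is what gets published.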

🙌 Contributing

Contributions are welcome! Please feel free to fork the project, create a feature branch, commit your changes, and open a Pull Request.

📜 License

This project is licensed under the Apache License 2.0. See the SPDX-License-Identifier in index.tsx or visit Apache License 2.0.

🙏 Acknowledgements

  • Created by Hooman Nick.
  • Powered by the Google Gemini API.
  • Uses Marked.js for Markdown rendering and Font Awesome for icons.
  • Built with Vite and vite-plugin-pwa.
