A web application that allows users to visualize and split audio files based on speaker segments. It provides an interactive waveform visualization and enables downloading individual speaker segments.
- 🎵 Interactive waveform visualization
- 👥 Speaker-based audio segmentation
- 🎨 Unique color coding for each speaker
- ⬇️ Download individual speaker segments
- 🌓 Dark/Light theme support
- ⏯️ Click-to-play segments
- 📱 Responsive design
- Next.js 14 with App Router
- TypeScript
- Tailwind CSS
- WaveSurfer.js
- Material Design Color System
- FastAPI
- Python 3.8+
- pydub for audio processing
- Pydantic for data validation
You can run the application either with Docker or by setting up a local development environment.
- Clone the repository:
git clone https://github.com/yourusername/audio-splitter.git
cd audio-splitter
- Make sure Docker and Docker Compose are installed on your system
- Start the application:
docker-compose up
The application will be available at:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
Frontend:
- Node.js 18.17 or later
- npm or yarn
Backend:
- Python 3.8 or later
- pip
- FFmpeg (for audio processing)
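The backend prerequisites above can be checked with a short script before installing anything. This is a minimal sketch using only the standard library; the function name `check_backend_prerequisites` and its parameters are hypothetical and not part of the project.

```python
import shutil
import sys


def check_backend_prerequisites(min_python=(3, 8), tools=("ffmpeg",)):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration, not part of the repository.
    """
    problems = []

    # The README asks for Python 3.8 or later.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or later is required")

    # FFmpeg must be discoverable on PATH for audio processing.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_backend_prerequisites():
        print("Missing prerequisite:", problem)
```

Running the script prints one line per missing prerequisite and nothing when the environment looks ready.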
- Navigate to the frontend directory:
cd frontend
- Install frontend dependencies:
npm install
- Create a .env.local file:
NEXT_PUBLIC_API_URL=http://localhost:8000
- Run the development server:
npm run dev
- Navigate to the backend directory:
cd backend
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install fastapi uvicorn pydub requests python-multipart
- Run the backend server:
uvicorn main:app --reload
audio-splitter/
├── frontend/
│ ├── src/
│ │ ├── app/
│ │ │ └── page.tsx
│ │ ├── components/
│ │ │ ├── AudioSplitter.tsx
│ │ │ └── WaveformPlayer.tsx
│ │ └── contexts/
│ │     └── ThemeContext.tsx
│ └── public/
│     └── screenshot.png
│
├── backend/
│ ├── main.py
│ ├── audio_processor.py
│ └── models.py
│
└── README.md
from pydantic import BaseModel
from typing import List

class Segment(BaseModel):
    start: float
    end: float
    speaker: str
    text: str

class TranscriptionRequest(BaseModel):
    audio_url: str
    segments: List[Segment]
Handles audio file processing:
- Downloading audio from URL
- Splitting audio based on speaker segments
- Combining segments per speaker
- Converting to MP3 format
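The splitting and combining steps listed above can be sketched as a grouping pass over the segments. This is an illustrative sketch, not the contents of audio_processor.py: the function name `speaker_ranges_ms` is hypothetical, and it assumes `start` and `end` are given in seconds.

```python
from collections import defaultdict


def speaker_ranges_ms(segments):
    """Group segments by speaker as (start_ms, end_ms) tuples in time order.

    Hypothetical helper; assumes timestamps are in seconds.
    """
    ranges = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s["start"]):
        start_ms = round(seg["start"] * 1000)
        end_ms = round(seg["end"] * 1000)
        # Skip empty or inverted ranges rather than producing silent clips.
        if end_ms > start_ms:
            ranges[seg["speaker"]].append((start_ms, end_ms))
    return dict(ranges)


if __name__ == "__main__":
    example = [
        {"start": 0.0, "end": 2.5, "speaker": "A", "text": "Hello, how are you?"},
        {"start": 2.5, "end": 5.0, "speaker": "B", "text": "I'm doing well, thank you!"},
    ]
    print(speaker_ranges_ms(example))
```

Millisecond ranges are a convenient intermediate form because pydub slices an `AudioSegment` by milliseconds (`audio[start_ms:end_ms]`), and the resulting slices can be concatenated with `+` and written out with `export(..., format="mp3")`.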
Provides the REST API endpoints:
- POST /split-audio/{speaker} - Splits audio by speaker
POST /split-audio/{speaker}
Content-Type: application/json
{
  "audio_url": "https://example.com/audio.mp3",
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "speaker": "A",
      "text": "Hello, how are you?"
    }
  ]
}
Response: MP3 file containing the speaker's segments
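The request above could be issued from a script roughly as follows. This is a sketch using only the standard library; the helper names `build_split_request` and `download_speaker_audio` are hypothetical, and the base URL assumes the backend is running locally on port 8000 as in the setup instructions.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

API_URL = "http://localhost:8000"  # assumed local backend address


def build_split_request(speaker, audio_url, segments, api_url=API_URL):
    """Build (but do not send) the POST request for one speaker."""
    payload = json.dumps({"audio_url": audio_url, "segments": segments})
    # Percent-encode the speaker label so names with spaces stay valid in the path.
    url = f"{api_url}/split-audio/{quote(speaker, safe='')}"
    return Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def download_speaker_audio(request, out_path):
    """Send the request and write the returned MP3 bytes to out_path."""
    with urlopen(request) as response, open(out_path, "wb") as out_file:
        out_file.write(response.read())
```

For example, `download_speaker_audio(build_split_request("A", audio_url, segments), "speaker_A.mp3")` would save speaker A's audio, provided the backend is reachable.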
- Enter an audio URL in the input field
- Paste the JSON transcription data with speaker segments
- The waveform will display with color-coded regions for each speaker
- Click on segments to play specific portions
- Use the download button to get individual speaker audio files
The transcription data should follow this format:
[
  {
    "start": 0.0,
    "end": 2.5,
    "speaker": "A",
    "text": "Hello, how are you?"
  },
  {
    "start": 2.5,
    "end": 5.0,
    "speaker": "B",
    "text": "I'm doing well, thank you!"
  }
]
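Transcription data in this format can be checked before it is pasted into the app. The following is a minimal sketch, not the project's own validation (the backend uses Pydantic models for that); `validate_transcription` is a hypothetical helper, and the optional `audio_duration` argument assumes timestamps are in seconds.

```python
import json

# Expected fields and their accepted Python types after JSON decoding.
REQUIRED_FIELDS = {
    "start": (int, float),
    "end": (int, float),
    "speaker": str,
    "text": str,
}


def validate_transcription(raw_json, audio_duration=None):
    """Return a list of error messages; an empty list means the data looks valid."""
    try:
        segments = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(segments, list):
        return ["top-level value must be a list of segments"]

    errors = []
    for i, seg in enumerate(segments):
        if not isinstance(seg, dict):
            errors.append(f"segment {i}: must be an object")
            continue
        bad = [name for name, types in REQUIRED_FIELDS.items()
               if not isinstance(seg.get(name), types)]
        if bad:
            errors.append(f"segment {i}: missing or mistyped fields: {', '.join(bad)}")
            continue
        if not 0 <= seg["start"] < seg["end"]:
            errors.append(f"segment {i}: start must be >= 0 and less than end")
        elif audio_duration is not None and seg["end"] > audio_duration:
            errors.append(f"segment {i}: end exceeds the audio duration")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every faulty segment at once.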
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Audio processing fails
  - Ensure FFmpeg is installed and accessible in your PATH
  - Verify that the audio URL is accessible
  - Check that the audio format is supported
- CORS errors
  - Verify that the frontend URL is listed in the backend's CORS configuration
  - Check that credentials are properly handled
- JSON parsing errors
  - Ensure the transcription JSON matches the expected format
  - Verify that the timestamps fall within the audio duration
- Docker-related issues
  - Ensure both Docker and Docker Compose are installed and up to date
  - Check that ports 3000 and 8000 are available on your system
  - If volumes aren't updating, try rebuilding the containers:
    docker-compose down
    docker-compose up --build
  - For Windows users, ensure Docker Desktop is running with the WSL 2 backend
This project is licensed under the MIT License - see the LICENSE file for details.