AI-powered video clipping platform that automatically transforms long-form videos into engaging short clips optimized for social media platforms.
Vinci Clips is an open-source platform that leverages artificial intelligence to analyze video content, generate accurate transcriptions, and automatically identify the most engaging segments for creating viral short-form content. The platform streamlines the content creation workflow for creators, marketers, and businesses looking to maximize their video content's reach across multiple social media platforms.
- Intelligent Video Analysis: AI-powered content analysis using Google Gemini API
- Automatic Transcription: Speaker diarization with precise timestamp alignment
- Smart Clip Generation: AI suggests optimal clip segments based on content analysis
- Multi-Format Support: Support for major video formats with automatic conversion
- Cloud Integration: Seamless Google Cloud Storage integration for scalability
- Video-to-Audio Conversion: High-quality audio extraction using FFmpeg
- Thumbnail Generation: Automatic video thumbnail creation for quick preview
- Status Tracking: Real-time processing status with comprehensive error handling
- Batch Processing: Support for multiple video uploads with queue management
- Intuitive Dashboard: Clean, responsive interface built with Next.js and Tailwind CSS
- Drag-and-Drop Upload: Simple file upload with progress tracking (up to 2GB)
- Video Playback: Integrated video player with transcript synchronization
- Mobile Responsive: Optimized experience across desktop and mobile devices
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │    Backend      │    │   External      │
│   Next.js       │◄──►│   Express API   │◄──►│   Services      │
│   React/TS      │    │   Node.js       │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         │                      ▼                      │
         │             ┌─────────────────┐             │
         │             │    Database     │             │
         │             │    MongoDB      │             │
         │             └─────────────────┘             │
         │                                             │
         │             ┌─────────────────┐             │
         └─────────────│  File Storage   │             │
                       │  Google Cloud   │             │
                       └─────────────────┘             │
                                                       │
                       ┌─────────────────┐             │
                       │  AI Services    │◄────────────┘
                       │  Gemini API     │
                       └─────────────────┘
Technology Stack:
- Frontend: Next.js 15, React, TypeScript, Tailwind CSS, Shadcn/ui
- Backend: Node.js, Express.js, MongoDB with Mongoose
- AI/ML: Google Gemini API for transcription and analysis
- Media Processing: FFmpeg for video/audio conversion and manipulation
- Cloud Storage: Google Cloud Storage with signed URL access
- Infrastructure: Docker-ready with environment-based configuration
Before running Vinci Clips, ensure you have the following installed:
- Node.js (version 18.0.0 or higher)
- FFmpeg (installed and available in your system PATH)
- MongoDB (local installation or cloud instance)
Additionally, you'll need accounts and API keys for:
- Google Cloud Platform (for storage and service account)
- Google Gemini API (for AI transcription services)
- Clone the repository
      git clone https://github.com/tryvinci/vinci-clips.git
      cd vinci-clips
- Install dependencies
      # Install dependencies for both frontend and backend
      npm run install:all
- Configure environment variables
  Create a .env file in the backend directory:
      cd backend
      cp .env.example .env
  Edit backend/.env with your actual values:
      # Server Configuration
      PORT=8080
      # Database
      DB_URL=mongodb://localhost:27017/vinci-clips
      # Google Cloud Storage
      GCP_BUCKET_NAME=your-bucket-name
      GCP_SERVICE_ACCOUNT_PATH=./gcp-service-account.json
      # AI Services
      GEMINI_API_KEY=your-gemini-api-key
  Note: For Docker deployment, see docker-setup.md for different environment configuration.
- Set up Google Cloud Storage
  - Create a Google Cloud Storage bucket
  - Download service account credentials JSON file
  - Place the file in your backend directory
  - Update GCP_SERVICE_ACCOUNT_PATH in your .env file
- Start the application
      # Start both frontend and backend
      npm start
      # Or start individually:
      npm run start:backend   # Backend on port 8080
      npm run start:frontend  # Frontend on port 3000
- Access the application
  Open your browser and navigate to http://localhost:3000
- Upload Video: Drag and drop a video file (up to 2GB) onto the upload interface
- Processing: The system automatically:
  - Converts video to audio format
  - Uploads files to cloud storage
  - Generates video thumbnails
  - Creates AI-powered transcription with speaker identification
- Review Transcript: View the generated transcript with timestamp alignment
- Generate Clips: Use AI-suggested segments or manually select time ranges for clip creation
- Download Results: Access generated clips from cloud storage with direct download links
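The audio conversion and thumbnail steps in the workflow above can be sketched as FFmpeg argument lists. This is a minimal illustration, not the project's actual processing code: the function names `audioExtractionArgs` and `thumbnailArgs`, the MP3 output format, the quality setting, and the one-second thumbnail offset are all assumptions. The FFmpeg options themselves (`-vn`, `-c:a`, `-q:a`, `-ss`, `-frames:v`, `-y`) are standard.

```typescript
// Hypothetical helpers: names, output formats, and defaults are assumptions,
// not taken from the Vinci Clips source code.

// Build FFmpeg arguments that drop the video stream and encode audio as MP3.
function audioExtractionArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",              // overwrite the output file if it already exists
    "-i", inputPath,   // input video
    "-vn",             // disable video in the output
    "-c:a", "libmp3lame", // assumed audio codec (MP3)
    "-q:a", "2",       // assumed VBR quality level
    outputPath,
  ];
}

// Build FFmpeg arguments that grab a single frame as a thumbnail image.
function thumbnailArgs(
  inputPath: string,
  outputPath: string,
  atSeconds: number = 1, // assumed default offset into the video
): string[] {
  return [
    "-y",
    "-ss", String(atSeconds), // seek before decoding the input
    "-i", inputPath,
    "-frames:v", "1",         // write exactly one video frame
    outputPath,
  ];
}
```

In a Node.js backend, each array could be passed to `spawn("ffmpeg", args)` from `node:child_process`; keeping the argument construction separate from process spawning makes it easy to unit test.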
The platform provides a RESTful API for programmatic access:
// Upload a video
POST /api/upload
Content-Type: multipart/form-data
// Get transcript status
GET /api/transcripts/:id
// Generate clip
POST /api/clips/generate
{
"transcriptId": "...",
"startTime": 30,
"endTime": 90
}
For detailed API documentation, see API Reference.
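As a rough sketch of calling the clip generation endpoint shown above from TypeScript, the helper below validates a time range and prepares the request. Only the path `/api/clips/generate` and the body fields `transcriptId`, `startTime`, and `endTime` come from this README; the helper name `buildClipRequest`, the `PreparedRequest` shape, and the assumption that times are in seconds are illustrative. The response format is not documented here, so it is not modeled.

```typescript
// Hypothetical client helper: only the endpoint path and body field names
// come from the README; everything else is an illustrative assumption.
interface ClipRequest {
  transcriptId: string;
  startTime: number; // assumed to be seconds
  endTime: number;   // assumed to be seconds
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildClipRequest(baseUrl: string, clip: ClipRequest): PreparedRequest {
  if (clip.transcriptId.trim() === "") {
    throw new Error("transcriptId is required");
  }
  if (!Number.isFinite(clip.startTime) || !Number.isFinite(clip.endTime)) {
    throw new Error("startTime and endTime must be finite numbers");
  }
  if (clip.startTime < 0 || clip.endTime <= clip.startTime) {
    throw new Error("expected 0 <= startTime < endTime");
  }
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/clips/generate`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(clip),
  };
}
```

With the backend on its default port, `buildClipRequest("http://localhost:8080", { transcriptId: "...", startTime: 30, endTime: 90 })` yields an object whose `method`, `headers`, and `body` can be passed as the options argument to `fetch`, with `url` as the first argument.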
vinci-clips/
├── backend/ # Express.js API server
│ ├── src/
│ │ ├── models/ # MongoDB schemas
│ │ ├── routes/ # API endpoints
│ │ └── index.js # Server entry point
│ └── package.json
├── frontend/ # Next.js application
│ ├── src/
│ │ ├── app/ # App router pages
│ │ ├── components/ # React components
│ │ └── lib/ # Utility functions
│ └── package.json
├── package.json # Root package.json for scripts
└── README.md
# Development
npm run dev # Start both services in development mode
npm run start:backend # Start backend only
npm run start:frontend # Start frontend only
# Production
npm run build # Build both applications
npm start # Start both services in production mode
# Testing
npm test # Run test suites
npm run lint # Run ESLint checks
# Backend tests
cd backend && npm test
# Frontend tests
cd frontend && npm test
# End-to-end tests
npm run test:e2e
# Build and run with Docker Compose
docker-compose up --build
# Production deployment
docker-compose -f docker-compose.prod.yml up -d
For production deployment, ensure all environment variables are properly configured:
Variable | Description | Required |
---|---|---|
PORT | Backend server port | No (default: 8080) |
DB_URL | MongoDB connection string | Yes |
GCP_BUCKET_NAME | Google Cloud Storage bucket | Yes |
GCP_SERVICE_ACCOUNT_PATH | Path to GCS service account JSON | Yes |
GEMINI_API_KEY | Google Gemini API key | Yes |
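One way to catch a misconfigured deployment early is to validate these variables at startup. The sketch below is an assumption about how that could look, not the backend's actual code: `loadConfig` and `AppConfig` are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical startup check: variable names and the PORT default come from
// the table above; the function and type names are illustrative.
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  dbUrl: string;
  gcpBucketName: string;
  gcpServiceAccountPath: string;
  geminiApiKey: string;
}

const REQUIRED_VARS = [
  "DB_URL",
  "GCP_BUCKET_NAME",
  "GCP_SERVICE_ACCOUNT_PATH",
  "GEMINI_API_KEY",
];

function loadConfig(env: Env): AppConfig {
  // Report every missing variable at once rather than failing one at a time.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // PORT is optional and defaults to 8080.
  const port = Number(env["PORT"] ?? "8080");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }

  return {
    port,
    dbUrl: env["DB_URL"] as string,
    gcpBucketName: env["GCP_BUCKET_NAME"] as string,
    gcpServiceAccountPath: env["GCP_SERVICE_ACCOUNT_PATH"] as string,
    geminiApiKey: env["GEMINI_API_KEY"] as string,
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)` before the server starts listening.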
We welcome contributions to Vinci Clips! Please see our Contributing Guidelines for details on:
- Code of conduct
- Development workflow
- Pull request process
- Issue reporting guidelines
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Add tests for new functionality
- Ensure all tests pass (npm test)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Video upload with drag-and-drop interface (2GB limit)
- FFmpeg-based video processing and thumbnail generation
- Google Cloud Storage integration with signed URLs
- AI transcription using Google Gemini API with speaker diarization
- MongoDB data persistence with comprehensive status tracking
- React/Next.js frontend with responsive design
- Basic clip generation from transcript segments
- TikTok/Reels style caption generation with 5 popular styles
- SRT-based FFmpeg subtitle rendering
- Word-level timestamp conversion from segment data
- Caption preview integration in reframe workflow
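To illustrate the word-level timestamp conversion and SRT rendering listed above, here is a minimal sketch. It is not the project's implementation: it assumes each transcript segment has `start` and `end` times in seconds plus its text, and it estimates word timings by splitting the segment duration in proportion to word length, which is only a heuristic. The names `Segment`, `WordTiming`, `segmentToWords`, `toSrtTimestamp`, and `wordsToSrt` are hypothetical. The timestamp layout `HH:MM:SS,mmm` and the `-->` separator follow the SRT format.

```typescript
// Hypothetical types: the real transcript schema is not shown in this README.
interface Segment {
  start: number; // seconds (assumed)
  end: number;   // seconds (assumed)
  text: string;
}

interface WordTiming {
  word: string;
  start: number;
  end: number;
}

// Estimate per-word timings by sharing the segment duration across words
// in proportion to their character counts (a heuristic, not forced alignment).
function segmentToWords(segment: Segment): WordTiming[] {
  const words = segment.text.split(/\s+/).filter((w) => w.length > 0);
  const totalChars = words.reduce((sum, w) => sum + w.length, 0);
  const duration = segment.end - segment.start;
  let elapsedChars = 0;
  return words.map((word) => {
    const start = segment.start + (elapsedChars / totalChars) * duration;
    elapsedChars += word.length;
    const end = segment.start + (elapsedChars / totalChars) * duration;
    return { word, start, end };
  });
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Emit one SRT cue per word: index, time range, text, separated by blank lines.
function wordsToSrt(words: WordTiming[]): string {
  return words
    .map((w, i) => `${i + 1}\n${toSrtTimestamp(w.start)} --> ${toSrtTimestamp(w.end)}\n${w.word}\n`)
    .join("\n");
}
```

The resulting SRT text could then be written to a file and rendered onto the video with FFmpeg's subtitle support. Proportional splitting keeps the sketch dependency-free, but it drifts from real speech timing on long segments, which is consistent with the roadmap item on improving word-level precision.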
High Priority
- Enhanced word-level timestamp precision (Issue #19)
- Advanced caption styles based on social media research (Issue #20)
- Real-time caption preview with video overlay (Issue #21)
Medium Priority
- Intelligent reframing with subject detection (Issue #22)
- Smooth camera movement for reframed videos (Issue #23)
- Smart fallback mechanisms for complex scenarios (Issue #24)
Future Enhancements
- Speaker-aware caption positioning (Issue #25)
- LLM-enhanced clip suggestion engine (Issue #26)
- Performance caching for transcripts and ML models (Issue #27)
- Modular caption style plugin system (Issue #28)
See GitHub Issues for detailed technical specifications and implementation plans.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the LICENSE file for details.
The AGPL-3.0 license ensures that any modifications or derivatives of this software, including those running on servers, must also be made available under the same license terms.
- GitHub Issues: Report bugs or request features
- Discussions: Join community discussions
- Documentation: Read the full documentation
For enterprise deployments, custom development, or commercial licensing options, please contact us at support@tryvinci.com.
- Google Gemini API for powerful AI transcription capabilities
- FFmpeg for reliable video processing
- Next.js and Vercel for an excellent development experience
- MongoDB for flexible data storage
- Open Source Community for inspiration and contributions
Built by the Vinci team. Made possible by the open source community.