A retrieval-augmented generation (RAG) API for querying codebases using natural language. Weave ingests code repositories, creates vector embeddings, and provides an API to ask questions about the codebase with AI-powered answers.
Setup → Ingest → Query
- Start the services:

  ```shell
  docker-compose up
  ```

- Ingest your code:

  ```shell
  # Ingest any codebase
  uv run -m src.ingest.cli /path/to/your/repo
  ```

- Query your code:

  ```shell
  curl -X POST "http://localhost:8000/api/v1/query" \
    -H "Content-Type: application/json" \
    -d '{"question": "How is Strava'\''s API used?", "limit": 5}'
  ```
📋 API docs: http://localhost:8000/docs
Example response:

```json
{
  "answer": "The Strava API is used through a repository pattern with OAuth authentication flow. The StravaTokenRepo handles token refresh, and StravaActivitiesRepo fetches activities by ID or year with pagination... [truncated]",
  "sources": [
    "application/usecases/_update_summary.py",
    "adapters/strava/__init__.py",
    "domain.py",
    "adapters/strava/_repositories.py"
  ],
  "chunks": [
    {
      "content": "    def refresh(self) -> StravaTokenSet:\n        payload = {\n            \"client_id\": self._tokens.client_id,\n            ...\n        }\n        resp = requests.post(url=self._url, data=payload, timeout=10)\n        ... [truncated]",
      "file_path": "adapters/strava/_repositories.py",
      "chunk_type": "function",
      "start_line": 23,
      "end_line": 42,
      "function_name": "refresh",
      "class_name": null,
      "distance": 0.58
    }
  ],
  "processing_time_ms": 11784.0
}
```
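The same query can also be issued from Python. A minimal sketch using the `requests` library; the `ask` helper is illustrative, not part of Weave itself:

```python
import requests

def ask(question: str, limit: int = 5,
        url: str = "http://localhost:8000/api/v1/query") -> dict:
    """POST a natural-language question to the Weave query endpoint."""
    resp = requests.post(url, json={"question": question, "limit": limit}, timeout=60)
    resp.raise_for_status()  # surface HTTP errors early
    return resp.json()

if __name__ == "__main__":
    result = ask("How is Strava's API used?")
    print(result["answer"])
    print("sources:", ", ".join(result["sources"]))
```

The response dict has the same shape as the JSON example above (`answer`, `sources`, `chunks`, `processing_time_ms`).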
- `GET /health` - Check API health and database connection
- `GET /stats` - Get codebase ingestion statistics
- uv - Python package manager
- Docker and Docker Compose
- Python 3.11+
⚠️ Required: You need OpenAI and Anthropic API keys
- Setup environment:

  ```shell
  cp .env.example .env
  # Add your API keys to .env file
  ```

- Install & start:

  ```shell
  uv sync --dev && make start
  ```
Environment variables and their usage:
| Variable | Description | Required | Default |
|---|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | Yes | Set by docker-compose |
| `OPENAI_API_KEY` | OpenAI API key for text embeddings | Yes | - |
| `OPENAI_PROJECT_ID` | OpenAI project identifier | Yes | - |
| `ANTHROPIC_API_KEY` | Anthropic API key for generating responses | Yes | - |
| `CORS_ORIGINS` | Comma-separated allowed CORS origins | No | `http://localhost:3000,http://127.0.0.1:3000` |
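A `.env` based on the table above might look like this (all values are placeholders; replace them with your own keys):

```shell
# .env - placeholder values only
OPENAI_API_KEY=sk-your-openai-key
OPENAI_PROJECT_ID=proj_your_project_id
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
# DATABASE_URL is set by docker-compose; only override it for local runs
```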
API Usage Notes:
- OpenAI: Used for generating text embeddings during ingestion and querying
- Anthropic: Used for generating natural language responses to user questions
- Rate Limits: Consider API rate limits when ingesting large codebases
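One common way to stay under embedding rate limits during large ingestions is to batch requests and back off exponentially on failures. A minimal sketch; `embed_batch` and the function below are illustrative, not Weave's actual ingestion code:

```python
import time

def embed_with_backoff(texts, embed_batch, batch_size=64, max_retries=5):
    """Embed texts in batches, retrying with exponential backoff on failure.

    `embed_batch` is any callable mapping a list of strings to vectors
    (e.g. a wrapper around an embeddings API call).
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                vectors.extend(embed_batch(batch))
                break
            except Exception:  # in real code, catch the API's rate-limit error
                if attempt == max_retries - 1:
                    raise
                time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ...
    return vectors
```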
```shell
# Essential commands
make start        # Start services
make test-local   # Run tests with coverage
make lint         # Check code quality

# Run API directly (for development)
uv run -m src.api.cli --reload

# All commands: build stop logs clean format typecheck test
```
Uses domain-driven architecture - code organized around business domains rather than technical layers.
- `src/api/` - FastAPI application and routes
- `src/weave/` - Core business logic
  - `core/` - Database and client utilities
  - `embedding/` - Text embedding services
  - `ingestion/` - Code parsing and ingestion
  - `query/` - RAG query processing
  - `vector/` - Vector database operations
- `flyway/` - Database migrations
- `tests/` - Unit tests with pytest
- `docs/` - Additional documentation
Run tests with coverage reporting:

```shell
make test-local
```

This generates:

- Terminal coverage report showing missing lines
- HTML coverage report in the `htmlcov/` directory
Key patterns:
- Mock external dependencies (database, APIs) with `@patch`
- Test both success and error cases
- Use descriptive assertion messages to explain expected behavior
- Group related assertions for the same operation
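In practice those patterns might look like the following. The `fetch_health` helper is illustrative, not Weave's real code; the test mocks `requests.get` so no network call is made:

```python
from unittest.mock import patch

import requests  # patched in the test below, so no server is needed

def fetch_health(url="http://localhost:8000/health"):
    """Tiny helper under test (illustrative only)."""
    return requests.get(url, timeout=5).json()

@patch("requests.get")
def test_fetch_health_reports_ok(mock_get):
    # Mock the external dependency so the test never touches the network
    mock_get.return_value.json.return_value = {"status": "ok"}

    result = fetch_health()

    # Success case: the helper passes the parsed JSON body through
    assert result == {"status": "ok"}, "expected the health payload unchanged"
    mock_get.assert_called_once_with("http://localhost:8000/health", timeout=5)
```

An error case (e.g. the mock raising `requests.ConnectionError`) would follow the same shape in a second test function.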