A FastAPI application that demonstrates Retrieval-Augmented Generation (RAG) concepts, combining:
- Audio/video transcription via AssemblyAI
- Semantic search over text sections with FAISS
- Optional LLaMA-based embeddings, or a fake embeddings class for testing
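
To make the search component concrete, here is a minimal, self-contained sketch of semantic search with FAISS. The vectors and section names are made up for illustration; in the application itself they come from the configured embedding provider.

```python
import faiss
import numpy as np

# Toy corpus: three text sections with hand-written 4-d "embeddings".
sections = ["intro to RAG", "FAISS indexing", "AssemblyAI transcription"]
vectors = np.array([
    [0.1, 0.9, 0.0, 0.2],
    [0.8, 0.1, 0.3, 0.0],
    [0.0, 0.2, 0.9, 0.4],
], dtype="float32")

index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2-distance index
index.add(vectors)

# Embed the query the same way, then retrieve the two nearest sections.
query = np.array([[0.7, 0.2, 0.2, 0.1]], dtype="float32")
distances, ids = index.search(query, 2)
print([sections[i] for i in ids[0]])  # -> ['FAISS indexing', 'intro to RAG']
```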

- Upload or record audio/video, transcribe with AssemblyAI, and get SRT/VTT subtitles.
- Semantic Search over the transcribed text or your own documents (markdown-based).
- Modular design following clean code principles, with separate classes for embeddings, search, and server.
- Flexible Embeddings with support for llama.cpp or custom embedding providers.
- Clone this repository:
  ```bash
  git clone https://github.com/boringresearch/rag-demo.git
  cd rag-demo
  ```
- Create and activate a virtual environment (recommended):
  ```bash
  python -m venv venv
  source venv/bin/activate    # on Linux/Mac
  .\venv\Scripts\activate     # on Windows
  ```
- Install the requirements (Python <= 3.11):
  ```bash
  pip install -r requirements.txt
  ```
- Configure the environment (a sample `.env` is sketched after these setup steps):
  - Copy `.env.example` to `.env`
  - Insert your `ASSEMBLYAI_API_KEY` inside `.env`
  - Configure your embedding API URL in `.env` if using a custom embedding service
- Run the application:
  ```bash
  python src/main.py
  ```
  The server listens on http://localhost:8002.
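
For reference, a filled-in `.env` might look like the sketch below. Only `ASSEMBLYAI_API_KEY` and `EMBEDDING_PROVIDER` are named in this README, so the other key name here is an assumption; treat `.env.example` as authoritative.

```
# Hypothetical .env -- confirm key names and values against .env.example
ASSEMBLYAI_API_KEY=your-assemblyai-key

# Assumed key name for the embedding service URL (llama.cpp default address)
EMBEDDING_API_URL=http://localhost:8080

# "fake" selects the FakeEmbeddings provider for offline testing;
# see .env.example for the default provider name
EMBEDDING_PROVIDER=fake
```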
- Open your browser at http://localhost:8002.
- Upload an audio/video file or choose an example to see transcription.
- Type a query in the search box to perform semantic search over the transcribed content or custom text.
This project uses llama.cpp for generating embeddings by default. To set up the embedding server:
- Install llama.cpp following the instructions in their repository
- Download a compatible model (e.g., a GGUF-format model)
- Run llama-server with embeddings enabled (`-c` sets the context size and `-ngl` the number of model layers to offload to the GPU):
  ```bash
  ./llama-server -m model-f16.gguf --embeddings -c 512 -ngl 99 --host 0.0.0.0
  ```
- Update your `.env` file with the correct embedding API URL (default: `http://localhost:8080`)
The llama.cpp server expects embedding requests in the following format:
```
POST /embedding
{
  "content": "text to embed"
}
```
The response will contain the embedding vector:
```
{
  "embedding": [0.123, 0.456, ...]
}
```
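
As a quick sanity check, you can exercise this endpoint from Python. A minimal sketch using the `requests` library, assuming the default server address above and the request/response format just described:

```python
import numpy as np
import requests

EMBEDDING_API_URL = "http://localhost:8080"  # default from this README

def embed(text: str) -> np.ndarray:
    """Fetch an embedding vector from the llama.cpp server."""
    resp = requests.post(f"{EMBEDDING_API_URL}/embedding",
                         json={"content": text}, timeout=30)
    resp.raise_for_status()
    return np.array(resp.json()["embedding"], dtype="float32")

# Similar sentences should land closer together in embedding space.
a, b = embed("a cat sat on the mat"), embed("a feline rested on the rug")
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.3f}")
```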
If you encounter issues with the embedding API:
- Check that the llama-server is running with the `--embeddings` flag
- Verify that the API URL in your `.env` file matches the server address
- Test the API directly using curl:
  ```bash
  curl -X POST http://localhost:8080/embedding \
    -H "Content-Type: application/json" \
    -d '{"content":"test text"}'
  ```
- Check the server logs for any error messages
- Try the FakeEmbeddings provider for testing by setting `EMBEDDING_PROVIDER=fake` in your `.env` file
The project is designed to make it easy to switch between different embedding providers:
- Create a new class that implements the `EmbeddingsBase` interface in `src/embeddings/` (see the sketch after this list)
- Update the `TermsSearchEngine` initialization in `src/server/app.py` to use your custom embeddings class
- Alternatively, set the `EMBEDDING_PROVIDER` environment variable to switch between implemented providers
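
The exact methods required by `EmbeddingsBase` are defined in `src/embeddings/base.py`; the sketch below assumes a single `embed(text)` method and uses a hypothetical class name, so adapt both to the real interface. It implements a deterministic pseudo-embedding, similar in spirit to the bundled FakeEmbeddings provider:

```python
import hashlib
import numpy as np

class HashEmbeddings:  # hypothetical; subclass EmbeddingsBase in the real project
    """Deterministic pseudo-embeddings: the same text always yields the same vector."""

    def __init__(self, dim: int = 384):
        self.dim = dim

    def embed(self, text: str) -> np.ndarray:  # assumed interface method
        # Seed a PRNG from the text's hash so results are reproducible.
        seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
        rng = np.random.default_rng(seed)
        vec = rng.standard_normal(self.dim).astype("float32")
        return vec / np.linalg.norm(vec)  # unit norm, so dot product = cosine
```

Deterministic vectors keep test runs reproducible without a model server running; wire the class in through the `TermsSearchEngine` initialization noted above.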
```
rag-demo/
├── LICENSE
├── README.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── requirements.txt
├── .env.example
├── src/
│   ├── main.py
│   ├── server/
│   │   ├── __init__.py
│   │   └── app.py
│   ├── embeddings/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── fake.py
│   │   └── llama.py
│   ├── search/
│   │   ├── __init__.py
│   │   └── terms_search_engine.py
│   └── templates/
│       └── index.html
├── static/
├── cache/
├── examples/
└── uploads/
```
This project is licensed under the MIT License.