LLM Movie Review Q&A

YouTube Demo

Watch on YouTube

A fully local retrieval-augmented generation (RAG) pipeline that answers natural language questions about movie reviews.
Runs entirely offline using:

  • Ollama for local embeddings & LLM inference
  • Chroma for the vector database
  • LangChain for retrieval + prompt orchestration
  • Rich for clean terminal output

Features

✅ Loads movie reviews from a CSV file into a local Chroma vector store, using Ollama to create embeddings.

✅ Dynamically decides how many reviews to pull based on your question (a rough heuristic is sketched after this list):

  • Small, direct questions pull just 3 documents.
  • Broad or comparative questions pull up to 10.
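
The retrieval-depth logic might look something like the following minimal sketch; the keyword hints, word-count threshold, and function name are illustrative assumptions, not the repository's actual values.

```python
# Hypothetical sketch of the dynamic retrieval depth described above.
# The hint words and thresholds are assumptions, not taken from the repo.
BROAD_HINTS = ("compare", "overall", "most", "best", "worst", "all", "which movies")

def choose_k(question: str, small_k: int = 3, broad_k: int = 10) -> int:
    """Pick how many reviews to retrieve based on how broad the question is."""
    q = question.lower()
    if any(hint in q for hint in BROAD_HINTS) or len(q.split()) > 15:
        return broad_k  # broad or comparative question
    return small_k      # small, direct question
```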

✅ Summarizes the retrieved reviews in bullet points, then answers your question with a local language model.

✅ Displays everything in a clean, color-coded terminal interface.

✅ Shows the most frequent words from the matched reviews to give quick analytic insights (see the keyword sketch below).
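
A rough sketch of that keyword-frequency step, assuming nltk's English stopword list has already been downloaded with nltk.download("stopwords"); the function name and token pattern are illustrative:

```python
# Count the most frequent non-stopword tokens across the matched reviews.
from collections import Counter
import re

from nltk.corpus import stopwords

def top_keywords(texts: list[str], n: int = 10) -> list[tuple[str, int]]:
    """Return the n most common words, ignoring English stopwords."""
    stop = set(stopwords.words("english"))
    words = []
    for text in texts:
        words += [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stop]
    return Counter(words).most_common(n)
```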


How it works

vector.py loads movie_reviews.csv, embeds each review with Ollama, builds a Chroma vector database, and exposes a retriever that other modules can import (sketched below).
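
A minimal sketch of what vector.py likely does, based on the description above; the embedding model name, CSV column names, and persistence path are assumptions, not values taken from the repository:

```python
import pandas as pd
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_ollama import OllamaEmbeddings

df = pd.read_csv("movie_reviews.csv")
embeddings = OllamaEmbeddings(model="mxbai-embed-large")  # local embeddings via Ollama

# Wrap each review row as a LangChain Document before indexing.
docs = [
    Document(page_content=row["review"], metadata={"title": row["title"]})
    for _, row in df.iterrows()
]

vector_store = Chroma(
    collection_name="movie_reviews",
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)
vector_store.add_documents(documents=docs, ids=[str(i) for i in range(len(docs))])

# Exported so main.py (or anything else) can import a ready-to-use retriever.
retriever = vector_store.as_retriever(search_kwargs={"k": 5})
```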

main.py sets up a multi-step pipeline with LangChain (see the sketch after this list):

Retrieves relevant reviews

Summarizes them

Answers your question based on that summary

Highlights top keywords with nltk

Rich renders the output in a polished, color-coded terminal interface.
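
Putting those steps together, a minimal sketch of the chain might look like this, assuming the retriever exported by vector.py and a locally pulled Ollama model such as llama3; the prompt wording, model name, and example question are illustrative, not the repository's:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaLLM

from vector import retriever  # built by vector.py

model = OllamaLLM(model="llama3")

summarize_prompt = ChatPromptTemplate.from_template(
    "Summarize these movie reviews as bullet points:\n\n{reviews}"
)
answer_prompt = ChatPromptTemplate.from_template(
    "Using this summary of movie reviews:\n{summary}\n\nAnswer the question: {question}"
)

def answer(question: str) -> str:
    # 1. Retrieve relevant reviews from the Chroma store.
    docs = retriever.invoke(question)
    reviews = "\n\n".join(d.page_content for d in docs)
    # 2. Summarize the retrieved reviews in bullet points.
    summary = (summarize_prompt | model).invoke({"reviews": reviews})
    # 3. Answer the question based on that summary.
    return (answer_prompt | model).invoke({"summary": summary, "question": question})

print(answer("Which reviews mention the soundtrack?"))
```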

Install & run

```bash
pip install -r requirements.txt
python vector.py   # Builds your local vector DB
python main.py     # Run the Q&A app
```

Why this matters

This small project shows how to build a full local RAG system from scratch, combining embedding generation, semantic retrieval, prompt chaining, and lightweight data analysis, all without relying on external APIs.
