A showcase implementation demonstrating how to build an intelligent Q&A system using LangGraph and Neo4j Graph Database. This project combines the power of Large Language Models (LLMs) with graph database capabilities to answer questions about movie data.
- Intelligent Movie Database Q&A: Ask natural language questions about movies, actors, directors, and genres
- Graph-based Knowledge Representation: Leverages Neo4j's graph capabilities for complex relationship queries
- Multi-stage Processing Pipeline: Uses LangGraph for dynamic orchestration of the Q&A process
- Query Validation and Self-correction: Intelligent validation and correction of generated Cypher queries
- Semantic Similarity: Uses example-based learning for improved query generation
- Smart Guardrails: Ensures the system only answers movie-related questions
- Python 3.8+
- Neo4j Database (accessible via connection string)
- OpenAI API Key
- LangSmith API Key (for tracing)
- Clone the repository:
      git clone https://github.com/yourusername/langgraph_qa_with_graph_db_showscase.git
      cd langgraph_qa_with_graph_db_showscase
- Install dependencies:
      pip install -r requirements.txt
- Set up environment variables:
      # Create a .env file with the following variables
      OPENAI_API_KEY=your_openai_api_key
      LANGSMITH_API_KEY=your_langsmith_api_key
      LANGSMITH_TRACING=true
      LANGSMITH_PROJECT=GRAPH-QA
      NEO4J_URI=neo4j://localhost:7687 # Adjust as needed
      NEO4J_USERNAME=neo4j
      NEO4J_PASSWORD=your_password
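Before running the application, it can help to confirm that the variables from the .env example are actually set. The following is a minimal, stdlib-only sketch of such a check. The helper names `parse_env_text` and `missing_vars` are hypothetical and are not part of this project, which may load its configuration differently.

```python
import os

# Variable names taken from the .env example in this README.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "LANGSMITH_API_KEY",
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comment lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as " # Adjust as needed".
        # Simplification: a value that itself contains " #" would be cut short.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_vars(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    # Check the current process environment and report what is missing.
    missing = missing_vars(os.environ)
    print("Missing variables:", ", ".join(missing) or "none")
```

`parse_env_text` is only a simplified reader for the format shown here. It does not handle quoting or multi-line values.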
Run the application:
python graph_qa.py
You can ask questions like:
- "Which actors played in the movie Casino?"
- "How many movies has Tom Hanks acted in?"
- "List all the genres of the movie Schindler's List"
- "Which actors have worked in movies from both the comedy and action genres?"
This system follows a multi-stage process to answer questions:
- Guardrails: Determines if the question is movie-related
- Cypher Generation: Transforms the natural language question into a Cypher query
- Validation: Checks the Cypher query for errors
- Correction: Fixes any identified errors in the query
- Execution: Runs the query against the Neo4j database
- Answer Generation: Creates a natural language answer based on database results
Input Question → Guardrails → Generate Cypher → Validate → Correct → Execute → Generate Answer
Branches:
- Guardrails → End (when the question is not movie-related)
- Validation → Correction (when the query contains errors)
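The control flow of the stages above can be illustrated in plain Python. This is a stdlib-only sketch and does not use LangGraph or reproduce the project's code. The stage callables, the retry limit `MAX_CORRECTIONS`, the two messages, and the toy stand-ins in `demo_stages` are all assumptions made for illustration.

```python
MAX_CORRECTIONS = 3  # Assumed retry limit; the README does not specify one.
OFF_TOPIC_MESSAGE = "Sorry, I can only answer movie-related questions."
INVALID_QUERY_MESSAGE = "Sorry, I could not produce a valid query."


def answer_question(question, stages):
    """Run the stages in order, looping between validation and correction.

    `stages` is a dict of hypothetical callables:
      is_movie_related(question) -> bool
      generate_cypher(question) -> str
      validate(cypher) -> list of error strings (empty when valid)
      correct(cypher, errors) -> str
      execute(cypher) -> list of row dicts
      generate_answer(question, rows) -> str
    """
    # Guardrails: stop early when the question is off-topic.
    if not stages["is_movie_related"](question):
        return OFF_TOPIC_MESSAGE

    cypher = stages["generate_cypher"](question)

    # Validate, and re-validate after every correction attempt.
    errors = stages["validate"](cypher)
    attempts = 0
    while errors and attempts < MAX_CORRECTIONS:
        cypher = stages["correct"](cypher, errors)
        errors = stages["validate"](cypher)
        attempts += 1
    if errors:
        return INVALID_QUERY_MESSAGE

    rows = stages["execute"](cypher)
    return stages["generate_answer"](question, rows)


def demo_stages():
    """Toy stand-ins so the flow runs without an LLM or a database."""
    return {
        "is_movie_related": lambda q: "movie" in q.lower(),
        # Deliberately misspelled label so the correction step is exercised.
        "generate_cypher": lambda q: "MATCH (m:Movi) RETURN count(m) AS n",
        "validate": lambda c: ["Unknown label: Movi"] if ":Movi)" in c else [],
        "correct": lambda c, errors: c.replace(":Movi)", ":Movie)"),
        "execute": lambda c: [{"n": 3}],  # Made-up result row, not real data.
        "generate_answer": lambda q, rows: f"There are {rows[0]['n']} movies.",
    }


if __name__ == "__main__":
    print(answer_question("How many movies are there?", demo_stages()))
```

In a graph-based orchestrator, each stage would typically become a node and the two `if` decisions would become conditional edges.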
The movie database includes:
- Movie nodes: With properties such as title, release date, and IMDb rating
- Person nodes: Representing actors and directors
- Genre nodes: Different movie genres
- Relationships: ACTED_IN, DIRECTED, IN_GENRE
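To make the schema concrete, here is a sketch of Cypher queries that could correspond to some of the example questions listed earlier. It assumes details the README does not state: the relationship directions, and the property names `title` (on Movie) and `name` (on Person and Genre). The helper `relationship_types` is hypothetical and only checks that a query uses relationship types from the schema.

```python
import re

# Relationship types listed in the schema description.
ALLOWED_RELATIONSHIPS = {"ACTED_IN", "DIRECTED", "IN_GENRE"}

# Assumed directions (not stated in the README):
#   (:Person)-[:ACTED_IN]->(:Movie)
#   (:Person)-[:DIRECTED]->(:Movie)
#   (:Movie)-[:IN_GENRE]->(:Genre)
# Each entry maps a question to a (query, parameters) pair.
EXAMPLE_QUERIES = {
    "Which actors played in the movie Casino?": (
        "MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title: $title}) "
        "RETURN p.name AS actor",
        {"title": "Casino"},
    ),
    "How many movies has Tom Hanks acted in?": (
        "MATCH (p:Person {name: $name})-[:ACTED_IN]->(m:Movie) "
        "RETURN count(m) AS movie_count",
        {"name": "Tom Hanks"},
    ),
    "List all the genres of the movie Schindler's List": (
        "MATCH (m:Movie {title: $title})-[:IN_GENRE]->(g:Genre) "
        "RETURN g.name AS genre",
        {"title": "Schindler's List"},
    ),
}


def relationship_types(cypher):
    """Return the set of relationship types written as [:TYPE] in a query."""
    return set(re.findall(r"\[:(\w+)\]", cypher))


def uses_only_known_relationships(cypher):
    """Check that every relationship type in the query is in the schema."""
    return relationship_types(cypher) <= ALLOWED_RELATIONSHIPS


if __name__ == "__main__":
    for question, (query, params) in EXAMPLE_QUERIES.items():
        print(question, "->", query, params)
```

Passing values such as `$title` as query parameters, rather than pasting them into the query text, avoids quoting problems with titles like "Schindler's List".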
Developed by Extrawest, a software development company.