A question-answering system that uses Cognee's knowledge graph technology to answer questions about the Harry Potter universe. This application processes the text from the Harry Potter books, builds a knowledge graph, and allows for semantic search and question-answering about the content.
graph TD
A[User Interface] -->|HTTP Requests| B[FastAPI Server]
B --> C[Knowledge Graph Manager]
C --> D[Cognee Engine]
D --> E[Vector Database]
B --> F[OpenAI API]
subgraph "Client-Side"
A
end
subgraph "Server-Side"
B
C
D
E
F
end
style A fill:#f9f,stroke:#333,stroke-width:2px
style B fill:#bbf,stroke:#333,stroke-width:2px
style C fill:#f96,stroke:#333,stroke-width:2px
style D fill:#6f9,stroke:#333,stroke-width:2px
style E fill:#9cf,stroke:#333,stroke-width:2px
style F fill:#f99,stroke:#333,stroke-width:2px
- User Interface
  - Built with HTML, CSS, and JavaScript
  - Handles user interactions and displays responses
  - Communicates with the backend via REST API
- FastAPI Server
  - Handles HTTP requests and responses
  - Routes requests to appropriate handlers
  - Manages authentication and API key validation
- Knowledge Graph Manager
  - Orchestrates interactions with the knowledge graph
  - Processes natural language queries
  - Formats responses for the client
- Cognee Engine
  - Processes and indexes text data
  - Builds and maintains the knowledge graph
  - Handles semantic search operations
- Vector Database
  - Stores vector embeddings of text chunks
  - Enables efficient similarity search
  - Maintains relationships between entities
- OpenAI API
  - Provides language model capabilities
  - Generates embeddings for text
  - Assists in natural language understanding and generation
sequenceDiagram
participant User
participant UI as User Interface
participant API as FastAPI Server
participant KGM as Knowledge Graph Manager
participant CE as Cognee Engine
participant DB as Vector Database
participant OA as OpenAI API
%% User Submission
User->>UI: Enters question and submits
activate UI
UI->>API: POST /api/ask {question: "..."}
activate API
%% Knowledge Graph Processing
API->>KGM: process_question(question)
activate KGM
%% Context Retrieval
KGM->>CE: find_relevant_context(question)
activate CE
CE->>DB: search_similar_embeddings(question_embedding)
DB-->>CE: Return top N relevant chunks
CE-->>KGM: Return context snippets
deactivate CE
%% Answer Generation
KGM->>OA: generate_answer(question, context)
activate OA
OA-->>KGM: Generated answer
deactivate OA
%% Response Formulation
KGM-->>API: {"answer": "...", "sources": [...]}
deactivate KGM
%% Response Delivery
API-->>UI: 200 OK (JSON response)
deactivate API
%% UI Update
UI->>UI: Update chat interface
UI-->>User: Display answer with sources
deactivate UI
- User Submission
  - User enters a question in the web interface
  - UI sends an HTTP POST request to the FastAPI server
- Request Processing
  - FastAPI validates the request and extracts the question
  - Request is forwarded to the Knowledge Graph Manager
- Context Retrieval
  - Knowledge Graph Manager uses Cognee Engine to find relevant context
  - Cognee queries the Vector Database for similar text chunks
  - Most relevant context snippets are returned
- Answer Generation
  - Context and question are sent to OpenAI API
  - OpenAI generates a natural language answer
- Response Formulation
  - Answer is formatted with source references
  - Response is sent back through the chain
- UI Update
  - Web interface updates to show the answer
  - Sources are displayed for reference
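The steps above can be sketched as one small orchestration function. This is a minimal illustration, not the project's code: `process_question` borrows its name from the sequence diagram, while `fake_retrieve` and `fake_generate` are hypothetical stand-ins for the Cognee search and the OpenAI call, and the `source` key inside each source entry is an assumed field name.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins: the real project would call Cognee and OpenAI here.
Retriever = Callable[[str], List[Dict[str, str]]]
Generator = Callable[[str, List[str]], str]


def process_question(question: str, retrieve: Retriever, generate: Generator) -> Dict[str, Any]:
    """Run retrieval, then generation, and package the result for the client."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")

    # Context retrieval: each snippet carries its text and a source label.
    snippets = retrieve(question)

    # Answer generation: the model sees the question plus the retrieved text.
    answer = generate(question, [s["text"] for s in snippets])

    # Response formulation: the {"answer", "sources"} shape shown in the diagram.
    return {"answer": answer, "sources": [{"source": s["source"]} for s in snippets]}


def fake_retrieve(question: str) -> List[Dict[str, str]]:
    # Assumed snippet format; returns one canned snippet for the demo.
    return [{"text": "Hedwig is Harry's snowy owl.", "source": "book-1"}]


def fake_generate(question: str, context: List[str]) -> str:
    # A real generator would send the question and context to a language model.
    return f"Based on {len(context)} snippet(s): {context[0]}"


result = process_question("Who is Hedwig?", fake_retrieve, fake_generate)
print(result)
```

Passing the retriever and generator in as callables keeps the flow testable without a running knowledge graph or an API key.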
- Processes and indexes Harry Potter book text
- Builds a knowledge graph of characters, locations, and events
- Enables semantic search across the book content
- Provides a simple interface for asking questions about the Harry Potter universe
The application uses the complete text of all seven Harry Potter books as its knowledge base. The text is processed and indexed with Cognee, an open-source framework for building and managing knowledge graphs, to build a knowledge graph of the Harry Potter universe. For more technical detail on the underlying technology, see the Cognee research paper.
- Python 3.8 or higher
- Node.js (for development)
- Git
- OpenAI API key (get one from OpenAI)
- Clone the repository
      git clone https://github.com/hithesh-mr/harry-potter-qna-with-cognee.git
      cd harry-potter-qna-with-cognee
- Set up Python environment
      # Create and activate virtual environment
      python -m venv .venv
      # On Windows:
      .venv\Scripts\activate
      # On macOS/Linux:
      # source .venv/bin/activate
- Install dependencies
      # Install core requirements
      pip install -r requirements.txt
      # Install Cognee SDK (this might take a few minutes)
      pip install cognee==0.1.39
      # Install additional required packages
      pip install openai python-dotenv fastapi uvicorn
      # If you encounter SSL errors, install certifi:
      # pip install certifi
- Configure environment variables
  Create a .env file in the root directory with your OpenAI API key:
      LLM_API_KEY=your_openai_api_key_here
- Run the application
      # Start the FastAPI server
      uvicorn server.app:app --reload
  The server will start on http://127.0.0.1:8000
- Initialize the knowledge graph
  In a new terminal, run:
      curl -X POST http://127.0.0.1:8000/api/initialize
  This will start building the knowledge graph from the Harry Potter books. The first-time initialization may take several minutes.
- Access the application
  Open your web browser and navigate to http://localhost:8000
  - The API documentation will be available at http://localhost:8000/docs
  - The knowledge graph visualization will be available at http://localhost:8000/graph
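A missing or placeholder key is an easy mistake in the environment step, so a quick check before starting the server may help. The sketch below uses only the standard library and is an illustration: the project installs python-dotenv, which would normally load the file, and the helper names `read_env_file` and `has_llm_key` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_llm_key(path: Path) -> bool:
    """True when LLM_API_KEY is set and is not the placeholder value."""
    key = read_env_file(path).get("LLM_API_KEY", "")
    return bool(key) and key != "your_openai_api_key_here"


# Demo against a throwaway file rather than the real .env.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# local settings\nLLM_API_KEY=sk-example\n", encoding="utf-8")
    demo_ok = has_llm_key(env_path)
    env_path.write_text("LLM_API_KEY=your_openai_api_key_here\n", encoding="utf-8")
    demo_placeholder = has_llm_key(env_path)

print(demo_ok, demo_placeholder)
```

To check the real file, call `has_llm_key(Path(".env"))` from the repository root.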
classDiagram
class FastAPI_App {
+FastAPI app
+APIRouter api_router
+add_middleware()
+include_router()
+startup_event()
}
class KnowledgeGraphManager {
+bool is_initialized
+bool is_initializing
+int initialization_progress
+load_and_cognify(Path data_dir) Dict~str, Any~
+get_status() Dict~str, Any~
+search_knowledge_graph(str question) Dict~str, Any~
}
class QuestionRequest {
+str question
}
class AnswerResponse {
+str answer
+List~Dict~str, str~~ sources
}
class AskRouter {
+APIRouter router
+ask_question(QuestionRequest request) Dict~str, Any~
}
class CogneeIntegration {
+search(query_type, query_text) Any
+process_text(str text) Any
}
class OpenAI_Client {
+generate_completion(str prompt) str
+create_embeddings(str text) List~float~
}
FastAPI_App --> KnowledgeGraphManager : manages
FastAPI_App --> AskRouter : includes
AskRouter --> QuestionRequest : uses
AskRouter --> AnswerResponse : returns
AskRouter --> KnowledgeGraphManager : queries
KnowledgeGraphManager --> CogneeIntegration : uses
CogneeIntegration --> OpenAI_Client : depends on
- FastAPI_App
  - Main application entry point
  - Configures middleware and routes
  - Manages application lifecycle
- KnowledgeGraphManager
  - Manages the knowledge graph lifecycle
  - Handles initialization and status checks
  - Coordinates search operations
- QuestionRequest/AnswerResponse
  - Data models for API request/response
  - Ensure type safety and validation
- AskRouter
  - Handles question-answering endpoint
  - Manages request/response flow
  - Integrates with KnowledgeGraphManager
- CogneeIntegration
  - Wraps Cognee functionality
  - Handles text processing and search
  - Manages vector database interactions
- OpenAI_Client
  - Handles communication with OpenAI API
  - Manages API key and rate limiting
  - Processes text generation requests
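To make the request and response models concrete, here is a standard-library sketch of `QuestionRequest` and `AnswerResponse` with the fields from the class diagram. It uses dataclasses so it runs without extra packages; how the project actually defines these models is not shown here, and the validation rule and the `title` key in the sample source are assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class QuestionRequest:
    question: str

    def __post_init__(self) -> None:
        # Assumed rule: reject non-string or blank questions early.
        if not isinstance(self.question, str) or not self.question.strip():
            raise ValueError("question must be a non-empty string")


@dataclass
class AnswerResponse:
    answer: str
    sources: List[Dict[str, str]] = field(default_factory=list)


request = QuestionRequest(question="Who teaches Potions?")
response = AnswerResponse(answer="...", sources=[{"title": "example"}])

# asdict() yields the JSON-ready shape {"answer": ..., "sources": [...]}.
payload = asdict(response)
print(payload)
```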
- Data Ingestion: The system processes the complete text of all seven Harry Potter books
- Text Processing: Text is cleaned, tokenized, and split into meaningful chunks
- Embedding Generation: Text chunks are converted to vector embeddings using OpenAI's API
- Graph Construction: Relationships between entities are established to form a knowledge graph
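The chunking step can be illustrated with a simple word-window splitter. This is a sketch under stated assumptions, not Cognee's chunker: Cognee handles chunking internally, and the function name `chunk_words`, the window size, and the overlap are all hypothetical choices.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = "one two three four five six seven eight nine ten"
chunks = chunk_words(sample, chunk_size=4, overlap=1)
print(chunks)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval.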
- The user submits a question through the web interface
- The question is sent to the FastAPI backend
- The system searches the knowledge graph for relevant context
- The context and question are sent to OpenAI's API to generate an answer
- The answer is formatted and returned to the user
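The context search in the third step comes down to ranking stored chunk embeddings by similarity to the question embedding. The sketch below shows that idea with cosine similarity over a toy in-memory index; it is an assumption-laden illustration (three-dimensional vectors, hypothetical `top_n` helper), not how Cognee or its vector database implements search.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(query: Sequence[float], index: Dict[str, Sequence[float]], n: int = 2) -> List[Tuple[str, float]]:
    """Return the n chunk ids most similar to the query, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


# Toy embeddings; real ones would come from an embedding model.
toy_index = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.9, 0.1, 0.0],
}
hits = top_n([1.0, 0.0, 0.0], toy_index, n=2)
print(hits)
```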
harry-potter-qna-with-cognee/
├── client/                 # Frontend code
│   ├── index.html          # Main HTML file
│   ├── styles.css          # CSS styles
│   ├── scripts.js          # Frontend JavaScript
│   └── logos/              # Image assets
├── server/                 # Backend code
│   ├── app.py              # FastAPI application
│   ├── ask.py              # Question handling
│   ├── knowledge_graph.py  # Knowledge graph management
│   └── __init__.py
├── data/                   # Data files
│   ├── combined_harry_potter.txt
│   └── original/           # Original book texts
├── requirements.txt        # Python dependencies
└── README.md               # This file
- GET /api/status: Check the status of the knowledge graph
- POST /api/ask: Submit a question
  - Request body: {"question": "Your question here"}
  - Response: {"answer": "...", "sources": [...]}
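A client for the ask endpoint can be sketched with the standard library alone. The example builds the POST request and parses a response of the documented shape; `build_ask_request` and `parse_answer` are hypothetical helper names, and the base URL assumes the local server address from the setup steps.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://127.0.0.1:8000"  # assumed local address from the setup steps


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/ask request with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_answer(raw: bytes) -> Dict[str, Any]:
    """Decode a response body into the answer text and its sources."""
    payload = json.loads(raw.decode("utf-8"))
    return {"answer": payload["answer"], "sources": payload.get("sources", [])}


req = build_ask_request("Who is the Half-Blood Prince?")
parsed = parse_answer(b'{"answer": "...", "sources": []}')

# To send the request against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(parse_answer(resp.read()))
```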
- Initialization: The knowledge graph takes time to initialize (5-10 minutes) as it processes all seven books
- Response Time: Typical response time is 2-5 seconds depending on query complexity
- Caching: Recent queries are cached to improve performance
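One way to cache recent queries is a small least-recently-used map keyed on the normalized question text. The sketch below is an assumption about how such a cache could look, not the project's implementation; `QueryCache`, its size limit, and the normalization rule are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict


class QueryCache:
    """Least-recently-used cache keyed on the normalized question text."""

    def __init__(self, max_entries: int = 128) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    @staticmethod
    def _key(question: str) -> str:
        # Treat questions that differ only in case or spacing as the same.
        return " ".join(question.lower().split())

    def get_or_compute(self, question: str, compute: Callable[[str], Any]) -> Any:
        key = self._key(question)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = compute(question)
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry
        return value


calls = []


def slow_answer(question: str) -> Dict[str, Any]:
    # Stand-in for the full retrieval and generation round trip.
    calls.append(question)
    return {"answer": f"answer to: {question}", "sources": []}


cache = QueryCache(max_entries=2)
first = cache.get_or_compute("Who is Dobby?", slow_answer)
second = cache.get_or_compute("  who is   DOBBY? ", slow_answer)
print(len(calls))
```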
- API keys are never exposed to the client
- All communications are encrypted (HTTPS)
- Rate limiting is implemented to prevent abuse
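Rate limiting can take many forms; a token bucket is one common choice. The sketch below is a generic illustration and makes no claim about which scheme this project uses. `TokenBucket` and its parameters are hypothetical, and the clock is injectable so the behavior can be demonstrated without waiting.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Add tokens for the time elapsed since the last call, up to capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


# A fake clock makes the demo deterministic.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_per_second=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]
fake_now[0] = 1.0
after_wait = bucket.allow()
print(burst, after_wait)
```

In a server, one bucket per client address or API key would be kept, and a request that gets `False` would typically receive an HTTP 429 response.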
This project is licensed under the MIT License - see the LICENSE file for details.
- J.K. Rowling for the Harry Potter series
- OpenAI for their powerful language models
- The Cognee team for their knowledge graph technology
- Ensure your virtual environment is activated
- Run the application:
      python server/app.py
- Access the web interface at http://localhost:8000
- data/: Contains the Harry Potter book text files
- server/: Backend code for processing and querying the knowledge graph
- client/: Frontend web interface (if applicable)
- playbook/: Jupyter notebooks for experimentation and development
- If you encounter disk I/O errors, try deleting the .cognee_system directory and restarting the application
- Ensure your OpenAI API key has sufficient credits and is properly set in the .env file
- Check the logs for specific error messages if the application fails to start