The Context-Aware Knowledge Extraction (CAKE) framework is designed to improve the performance of LLMs with knowledge extraction algorithms. The framework extracts and utilizes contextual knowledge from multiple modalities, such as video, audio, and text.
Demo: https://cakeframework.vercel.app/
CAKE (Context-Aware Knowledge Extraction) is a research project focused on improving the performance of Large Language Models (LLMs) by extracting and utilizing contextual knowledge from multiple modalities, such as video, audio, and text. The project aims to bridge the gap between structured knowledge management and AI-driven decision support.
Traditional knowledge management systems rely on structured databases and predefined taxonomies, but they struggle to capture nuanced and context-dependent knowledge. This project proposes a flexible framework to extract, structure, and store crucial knowledge for effective retrieval and application in real-world scenarios.
Can Large Language Models extract and utilize crucial knowledge from different modalities to improve the accuracy and quality of generated responses?
- What methodologies exist for knowledge extraction, and how can they be adapted for LLMs?
- What are the challenges in extracting knowledge from multi-modal data?
- What role does knowledge have in enhancing the performance of Large Language Models over time?
- What are the computational and ethical considerations in deploying such a framework?
This research introduces a Context-Aware Knowledge Extraction pipeline that automatically processes multi-modal (video, audio, or text) data to extract, structure, and retrieve valuable knowledge. The framework includes:
**Semantic Context Transcription Pipeline**
- Audio data processing using Whisper and llama.cpp.
- Text chunking using Semantic Double-Pass Merging for improved context preservation.
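The double-pass merging idea can be illustrated without the actual pipeline code: a first pass merges adjacent sentences whose similarity exceeds a threshold, and a second pass merges the resulting chunks once more to catch context that spans a gap. The sentence splitter, the bag-of-words similarity, and the thresholds below are simplifying assumptions for illustration, not the project's real implementation (which uses proper sentence embeddings):

```python
import math
import re
from collections import Counter

def bow(text: str) -> Counter:
    # Toy bag-of-words vector; a stand-in for a real sentence embedding.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def merge_pass(chunks: list[str], threshold: float) -> list[str]:
    # Merge each chunk into the previous one when they are similar enough.
    merged = [chunks[0]]
    for chunk in chunks[1:]:
        if cosine(bow(merged[-1]), bow(chunk)) >= threshold:
            merged[-1] = merged[-1] + " " + chunk
        else:
            merged.append(chunk)
    return merged

def double_pass_chunk(text: str, first: float = 0.2, second: float = 0.1) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return merge_pass(merge_pass(sentences, first), second)

transcript = ("Whisper transcribes the audio. The audio transcript is split. "
              "Cats are unrelated here. Cats sleep a lot.")
print(double_pass_chunk(transcript))
```

Running this groups the two transcription sentences into one chunk and the two cat sentences into another, since only related neighbours clear the similarity threshold.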
**Semantic Knowledge Extraction Pipeline**
- Local AI processing for efficient knowledge extraction.
- Contextual embeddings for improved knowledge representation.
**Integration of Pipelines**
- Combining extracted knowledge with Large Language Models to improve the accuracy of generated responses.
- Implementing FAISS for fast similarity search for extracted knowledge in the vector database.
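The lookup FAISS accelerates is nearest-neighbour search over the stored knowledge embeddings. A brute-force version of the same operation can be sketched in plain Python; the vectors and query below are made-up placeholders for real embedding-model output:

```python
def l2(a: list[float], b: list[float]) -> float:
    # Squared Euclidean distance, the metric behind a flat L2 index.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search(index: list[list[float]], query: list[float], k: int):
    # Return (position, distance) for the k stored vectors closest to the query.
    ranked = sorted(range(len(index)), key=lambda i: l2(index[i], query))
    return [(i, l2(index[i], query)) for i in ranked[:k]]

# Toy "knowledge" embeddings; in the pipeline these come from an embedding model.
knowledge_vectors = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]]
print(search(knowledge_vectors, [1.0, 0.0], k=2))
```

With FAISS itself, the equivalent flow is building a `faiss.IndexFlatL2(d)`, calling `index.add(...)` with the knowledge vectors, and `index.search(...)` with the query batch; the library's indexes make this fast at scale.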
- LLMs: Llama3.2-11B, Qwen2.5-7B
- Speech-to-Text: Whisper
- Chunking & Segmentation: Chonkie, SAM-2 (Segment Anything Model)
- Knowledge Retrieval & Vector Database: FAISS (Facebook AI Similarity Search)
- Frameworks & Libraries: TensorFlow, PyTorch, Hugging Face, React, Flask, LLama_cpp
- Development of a structured knowledge extraction framework.
- Integration of multi-modal data for improved LLM context-awareness.
- Implementation of a FAISS-based vector database retrieval mechanism.
- Exploration of knowledge graphs and their role for Large Language Models.
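As a sketch of the knowledge-graph direction, extracted facts can be stored as (subject, relation, object) triples with a simple neighbour lookup; the triples below are invented examples, not output of the pipeline:

```python
from collections import defaultdict

# Invented example triples; in practice these come from the extraction pipeline.
triples = [
    ("Whisper", "transcribes", "audio"),
    ("CAKE", "uses", "Whisper"),
    ("CAKE", "uses", "FAISS"),
    ("FAISS", "indexes", "embeddings"),
]

# Index triples by subject for quick traversal.
graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((rel, obj))

def neighbours(entity: str) -> list[tuple[str, str]]:
    # All (relation, object) pairs directly linked to an entity.
    return graph[entity]

print(neighbours("CAKE"))  # [('uses', 'Whisper'), ('uses', 'FAISS')]
```

A structure like this is what a graph visualizer or an LLM prompt builder would walk to surface related concepts.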
git clone https://github.com/AntoniovanDijck/CAKE.git
cd CAKE
The CAKE pipeline provides the flexibility to execute different components as needed. You can run the knowledge extraction pipeline, evaluate the pipeline, and test extracted knowledge via a chatbot using command-line arguments.
Before running the pipeline, install the required dependencies:
pip install -r requirements.txt
Use the following command-line arguments to run specific components:
- `-run` → Run the knowledge extraction pipeline.
- `-eval` → Run the evaluation process.
- `-chat` → Test the extracted knowledge with a chatbot interface.
- `-run_all` → Run all components (pipeline, evaluation, and chat).
# Run the full pipeline
python main.py -run
# Evaluate the pipeline
python main.py -eval
# Test extracted knowledge via chatbot
python main.py -chat
# Run all components (pipeline, evaluation, and chatbot)
python main.py -run_all
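The flag handling in `main.py` might look roughly like the following sketch; the handler functions are placeholders, not the project's actual code:

```python
def run_pipeline():
    print("running knowledge extraction pipeline")

def run_evaluation():
    print("running evaluation")

def run_chat():
    print("starting chatbot interface")

# Map each command-line flag to its handlers, mirroring -run / -eval / -chat / -run_all.
HANDLERS = {
    "-run": [run_pipeline],
    "-eval": [run_evaluation],
    "-chat": [run_chat],
    "-run_all": [run_pipeline, run_evaluation, run_chat],
}

def main(argv: list[str]) -> int:
    if len(argv) != 1 or argv[0] not in HANDLERS:
        print(f"usage: main.py [{' | '.join(HANDLERS)}]")
        return 1
    for handler in HANDLERS[argv[0]]:
        handler()
    return 0

main(["-run"])
```

Keeping `-run_all` as the concatenation of the other three handlers guarantees it stays in sync with the individual flags.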
# Change video_url in main.py to the desired video link
video_url = "https://www.youtube.com/watch?v=example"
To evaluate additional LLM models, place their GGUF files inside the `models/` directory. The evaluation script will automatically detect and include them in the evaluation.
# Place GGUF model files in the 'models/' directory
./pipelines/Knowledge_Extraction_Pipeline/data/models/
# Run the evaluation script
python main.py -eval
The script loads all available models in the `models/` folder and runs the evaluation process.
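Model discovery of this kind can be sketched with `pathlib`; the scanning logic below is an assumption about how the script works, and the temporary directory stands in for the real models folder:

```python
from pathlib import Path
import tempfile

def find_gguf_models(models_dir: str) -> list[Path]:
    # Collect every .gguf file in the models directory, sorted for a stable order.
    return sorted(Path(models_dir).glob("*.gguf"))

# Example with a temporary directory standing in for the real models/ folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Llama3.2-11B.gguf").touch()
    (Path(tmp) / "Qwen2.5-7B.gguf").touch()
    (Path(tmp) / "notes.txt").touch()  # non-model file, should be ignored
    models = find_gguf_models(tmp)
    print([m.name for m in models])  # ['Llama3.2-11B.gguf', 'Qwen2.5-7B.gguf']
```

Filtering on the `.gguf` extension means stray files in the directory are skipped rather than passed to the model loader.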
The CAKE demo web application provides an interactive interface to test the knowledge extraction pipeline and chatbot functionality. It also includes a visualizer for the extracted knowledge base. The web application is built using React and Flask.
cd CAKE_webapp
pip install -r requirements.txt
npm install
# Run the web application
# Don't forget to put an OpenAI API key in the backend/app.py file!
npm run dev & python backend/app.py
The notebooks directory contains Jupyter notebooks for the different components of the CAKE pipeline. These notebooks give a detailed explanation of the code and the underlying concepts that were developed during the research.
The framework successfully extracts and structures critical knowledge, leading to improved accuracy in LLM-generated responses. The FAISS-based retrieval system enhances real-time knowledge access for technical support applications, reducing dependency on static documentation or Retrieval Augmented Generation (RAG).
The evaluation results show a significant improvement in the accuracy of LLM-generated responses when using the extracted knowledge. The framework's ability to retrieve contextually relevant knowledge contributes to more accurate and informative responses.
The framework's integration with LLMs results in a substantial improvement in the quality of generated responses. The extracted knowledge enhances the context-awareness of LLMs, leading to more relevant and accurate answers.
The knowledge graph visualization provides an intuitive representation of the extracted knowledge, enabling users to explore the relationships between different concepts and entities. The graph visualization enhances the understanding of complex knowledge structures and facilitates knowledge discovery.
- Further optimization of retrieval mechanisms.
- Further optimization of knowledge graph construction algorithms and ontology.
- Exploring the impact of real-time knowledge updates.
- Exploring methods of knowledge management.
- Investigating the impact of the top-k parameter of the retrieved knowledge on the quality of generated responses.
- Implementing SAM-2 and a Visual Question Answering Model (VQA) for multi-modal knowledge extraction.
For a comprehensive list of related research and citations, please refer to the Bibliography section in the thesis document.
I would like to express my gratitude to my supervisors, Dr. Ir. J.R. Helmus and Dr. S. van Splunter, for their invaluable guidance and support throughout this research. Also a special thanks to Jesse Jan van Schouten for his insights and collaboration.
Student Number: 12717673
Bachelor Thesis
Bachelor Information Sciences
University of Amsterdam
Faculty of Science
Dr. Ir. J.R. Helmus
Dr. S. van Splunter
Informatics Institute
Faculty of Science
University of Amsterdam
Antonio Adrian Cornelis van Dijck: Context-Aware Knowledge Extraction Framework
Jesse Jan van Schouten: Semantic Context Transcription Pipeline