A framework for detecting data leakage and bias in LLMs (e.g., Llama-2, Mistral) using n-gram metrics and one-shot prompting. BLEURT and ROUGE-L are used to score the similarity between reference text and model outputs under guided and general prompts. The framework analyzes model behavior on the MMLU and TruthfulQA benchmarks to identify training-data memorization and gender-stereotyping patterns.
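The sketch below illustrates the guided-versus-general comparison described above, assuming the `rouge_score` package; the helper names (`guided_prompt`, `general_prompt`, `contamination_signal`, and the `generate` callable standing in for the model) are hypothetical and not taken from this repository.

```python
# Sketch: compare a model's completion of a benchmark instance against the
# reference continuation using ROUGE-L, once with a "guided" prompt that
# reveals the dataset/split and once with a "general" prompt that does not.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def guided_prompt(example: dict) -> str:
    # Guided instruction: name the benchmark and split, ask the model to
    # reproduce the missing part of the instance verbatim.
    return (
        "You are given the first part of an instance from the MMLU test split. "
        "Complete it exactly as it appears in the dataset.\n\n"
        f"First part: {example['first_piece']}"
    )

def general_prompt(example: dict) -> str:
    # General instruction: same completion task, no dataset metadata.
    return f"Complete the following text:\n\n{example['first_piece']}"

def rouge_l(reference: str, candidate: str) -> float:
    # F-measure of the longest-common-subsequence overlap.
    return scorer.score(reference, candidate)["rougeL"].fmeasure

def contamination_signal(example: dict, generate) -> float:
    # `generate` is a stand-in for the LLM call (e.g. Llama-2 or Mistral).
    guided = generate(guided_prompt(example))
    general = generate(general_prompt(example))
    reference = example["second_piece"]
    # A markedly higher guided score suggests the instance was memorized.
    return rouge_l(reference, guided) - rouge_l(reference, general)
```

BLEURT similarity can be computed analogously (for example via the Hugging Face `evaluate` library) and the per-example deltas averaged over a benchmark to flag likely contamination.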
Saketh1702/Data-Leakage-Detection-in-LLMs
About
A research repository exploring potential data leakage vulnerabilities in Large Language Models (LLMs). This work analyzes existing literature, methodologies, and privacy implications in modern LLM architectures, providing comprehensive summaries and insights from various research papers.