Adversarial-AI-Red-teaming

A hands-on AI red teaming project exploring prompt injection, encoding bypass, jailbreaks, and embedded attacks across modern LLMs like DeepSeek AI, Grok AI and CoPilot

🛡️ AI Security: Adversarial Attacks, Risks & Mitigations

This repository presents a hands-on overview of adversarial testing techniques used against large language models (LLMs) and enterprise AI tools. It covers real-world vulnerabilities including prompt injection, encoding bypass, jailbreaks, and embedded attacks—along with actionable mitigations.

📌 Use Cases

Explore how attacks were crafted and mitigated:

Test Type	Description
🧠 Prompt Injection	Override system instructions and manipulate AI behavior.
🔐 Encoding Bypass	Evade filters using Base64 and obfuscation.
🧩 Crescendo Jailbreak	Bypass safety by extracting info step-by-step.
📄 Embedded Prompt Injection	Hide malicious instructions in documents or emails.

🧪 LLMS Tested

Copilot for Microsoft 365
DeepSeek AI
Grok AI

🛠️ Recommendations for AI Security

Red Team AI systems regularly to uncover emerging risks.
Apply behavioral and context-aware filtering beyond simple prompt matching.
Sanitize user input and file content before LLM processing.
Establish security governance frameworks for internal AI tools.

🔒 AI Security Pillars

Fairness
Reliability & Safety
Privacy & Security
Inclusiveness
Transparency
Accountability

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
case-studies		case-studies
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Adversarial-AI-Red-teaming

🛡️ AI Security: Adversarial Attacks, Risks & Mitigations

📌 Use Cases

🧪 LLMS Tested

🛠️ Recommendations for AI Security

🔒 AI Security Pillars

Connect with me on [linkedin] (https://www.linkedin.com/in/reshmimehta/)

About

Uh oh!

Releases

Packages

ReshmiMehta14/Adversarial-AI-Red-teaming

Folders and files

Latest commit

History

Repository files navigation

Adversarial-AI-Red-teaming

🛡️ AI Security: Adversarial Attacks, Risks & Mitigations

📌 Use Cases

🧪 LLMS Tested

🛠️ Recommendations for AI Security

🔒 AI Security Pillars

Connect with me on [linkedin] (https://www.linkedin.com/in/reshmimehta/)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages