


OMCHOKSI108/Red-Teaming-Challenge-OpenAI-gpt-oss-20b


Red-Teaming Challenge - OpenAI gpt-oss-20b

Welcome to my repository for the OpenAI gpt-oss-20b Red-Teaming Challenge!

Overview

This repo documents my participation in the Kaggle red-teaming competition focused on probing OpenAI's newly released gpt-oss-20b model for previously undiscovered vulnerabilities and harmful behaviors. The goal is to identify, document, and report up to five distinct issues, contributing to the safety and alignment of open-source AI models.

📚 Notebook


Interactive Notebook: Access the complete red-teaming notebook on Google Colab for hands-on experimentation with the gpt-oss-20b model.

Challenge Objectives

  • Find flaws and vulnerabilities in gpt-oss-20b (not previously reported)
  • Document exploits with reproducible reports and code
  • Share insights to improve AI safety and alignment

Topics of Interest

  • Reward hacking
  • Deception & deceptive alignment
  • Sabotage
  • Inappropriate tool use
  • Data exfiltration
  • Sandbagging
  • Evaluation awareness
  • Chain of Thought issues

Submission Format

  • Kaggle Writeup (project summary, strategy, findings)
  • Up to 5 findings files (JSON)
  • (Optional) Reproduction notebook
  • (Optional) Open-source tooling
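As a rough illustration of the "up to 5 findings files (JSON)" requirement, here is a minimal Python sketch that writes each finding to its own JSON file and refuses more than five. The helper name `write_findings`, the file naming pattern, and every field in `example` are assumptions made for illustration; they are not the competition's official schema, which should be taken from the Kaggle rules.

```python
import json
from pathlib import Path

# The challenge allows up to five findings files.
MAX_FINDINGS = 5


def write_findings(findings, out_dir):
    """Write each finding dict to its own numbered JSON file.

    Hypothetical helper: the file names and the dict layout are
    illustrative, not the official Kaggle findings schema.
    """
    if len(findings) > MAX_FINDINGS:
        raise ValueError(
            f"at most {MAX_FINDINGS} findings allowed, got {len(findings)}"
        )
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, finding in enumerate(findings, start=1):
        path = out / f"finding_{i}.json"
        path.write_text(
            json.dumps(finding, indent=2, ensure_ascii=False),
            encoding="utf-8",
        )
        paths.append(path)
    return paths


# Placeholder finding; all field names here are assumptions.
example = {
    "title": "Placeholder title",
    "topic": "Reward hacking",
    "prompt": "<prompt text>",
    "observed_behavior": "<what the model did>",
}
```

Calling `write_findings([example], "findings")` would create `findings/finding_1.json`. Keeping one file per finding mirrors the submission list above and makes the five-file limit easy to enforce before upload.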

Timeline

  • Start: August 5, 2025
  • End: August 26, 2025

Repository Structure

  • Challange.txt: Full competition details and rules
  • README.md: This file
  • gpt_oss_20b_colab_final.ipynb: Main notebook with model setup and red-teaming experiments
  • (To be added) Findings, additional notebooks, and tooling

Getting Started

I have just joined the challenge and will be updating this repository with:

  • My discovery process and methodology
  • Vulnerability findings and reports
  • Reproducible code and notebooks

Stay tuned for updates as I progress through the competition!


Citation: D. Sculley, Samuel Marks, and Addison Howard. Red-Teaming Challenge - OpenAI gpt-oss-20b. Kaggle Competition, 2025.
