Evaluating Argumentative Strategies in AI Debates

This repository contains the development of my Data Science Bachelor's Thesis (UBA) on evaluating argumentative strategies in debates between AI systems. The research explores the dynamics of using debate as an alignment technique, focusing on scenarios with agents of different capabilities and cognitively limited judges.

📖 Overview

The goal of this thesis is to study how unrestricted AI agents can exploit tactics such as deception, misleading arguments, or false consensus appeals to win debates against honest and harmless agents. Additionally, the research will analyze the impact of having a lower-capacity judge evaluate arguments from more advanced agents and how this affects the debate’s convergence towards the truth.

The work will include:

Simulating debates between agents with asymmetric capabilities.
Evaluating argumentative strategies and their effectiveness.
Analyzing the impact of judges with limited knowledge.
Comparing different metrics to assess the truthfulness and coherence of debates.

📌 Key Topics

AI Safety via Debate: Using debate as a mechanism to improve AI safety.
Game Theory: Modeling strategic interactions between agents.
Natural Language Processing (NLP): Implementing language models for argumentation.
Evaluation Models: Designing metrics to measure the quality of arguments.

📚 References

This research is based on AI safety and alignment techniques, including:

📬 Contact

Joaquín Salvador Machulsky
Email: jmachulsky@dc.uba.ar

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
debate_mnist		debate_mnist
papers		papers
pdf		pdf
templates		templates
.gitignore		.gitignore
README.md		README.md
prop_0039.pdf		prop_0039.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Evaluating Argumentative Strategies in AI Debates

📖 Overview

📌 Key Topics

📚 References

📬 Contact

About

Uh oh!

Releases

Packages

Uh oh!

Languages

machulsky61/tesis

Folders and files

Latest commit

History

Repository files navigation

Evaluating Argumentative Strategies in AI Debates

📖 Overview

📌 Key Topics

📚 References

📬 Contact

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages