Applied-Machine-Learning-Lab/SEED-Attack

[ACL'25 Main] Stepwise Reasoning Disruption Attack of LLMs

Code implementation of the ACL 2025 paper "Stepwise Reasoning Disruption Attack of LLMs".

Usage

1. Question Modification (QuestionModification.py)

This component modifies original questions while preserving their semantic meaning.

python QuestionModification.py \
    --llm_name <model_name> \
    --dataset <dataset_name> \
    --few_shot <True/False>

2. Solution Generation (GetSolutionofQuestionModified.py)

Generates CoT solutions for modified questions.

python GetSolutionofQuestionModified.py \
    --llm_name <model_name> \
    --dataset <dataset_name> \
    --few_shot <True/False>

3. SEED-P Attack (SEEDpAttack.py)

Performs the SEED-P attack by introducing prior reasoning steps of the modified question.

python SEEDpAttack.py \
    --llm_name <model_name> \
    --dataset <dataset_name> \
    --ratio <float> \
    --few_shot <True/False>
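
Steps 1 to 3 are meant to run in order with the same model, dataset, and few-shot setting. The following is a minimal sketch of a driver that assembles the three command lines from the flags shown above. The helper names `build_command` and `build_pipeline` are hypothetical and not part of this repository, and the values passed to them are placeholders you would replace with your own.

```python
import sys
from typing import List, Optional


def build_command(script: str, llm_name: str, dataset: str,
                  few_shot: bool, ratio: Optional[float] = None) -> List[str]:
    """Assemble one command line using only the flags documented above.

    Hypothetical helper: it does not inspect the scripts, it only mirrors
    the usage strings from this README.
    """
    cmd = [sys.executable, script, "--llm_name", llm_name, "--dataset", dataset]
    if ratio is not None:
        # --ratio is only documented for SEEDpAttack.py and Evaluation.py.
        cmd += ["--ratio", str(ratio)]
    cmd += ["--few_shot", str(few_shot)]  # rendered as "True" or "False"
    return cmd


def build_pipeline(llm_name: str, dataset: str,
                   few_shot: bool, ratio: float) -> List[List[str]]:
    """Return the commands for steps 1 to 3 in execution order."""
    return [
        build_command("QuestionModification.py", llm_name, dataset, few_shot),
        build_command("GetSolutionofQuestionModified.py", llm_name, dataset, few_shot),
        build_command("SEEDpAttack.py", llm_name, dataset, few_shot, ratio=ratio),
    ]
```

To execute the steps, each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failure in an earlier step stops the later ones.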

4. Evaluation (Evaluation.py)

Evaluates accuracy and attack success rate (ASR).

Step 1: Run the baseline (no attack) for comparison:

python SEEDpAttack.py \
    --llm_name <model_name> \
    --dataset <dataset_name> \
    --ratio 0.0 \
    --few_shot <True/False>

Step 2: Compute ASR and Accuracy:

python Evaluation.py \
    --llm_name <model_name> \
    --dataset <dataset_name> \
    --ratio <float> \
    --few_shot <True/False>
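
The README does not spell out how the two metrics are defined, so the sketch below is an illustration under stated assumptions rather than a description of what `Evaluation.py` does. It assumes accuracy is the fraction of correctly answered questions, and ASR is the fraction of questions answered correctly in the baseline run (ratio 0.0) that are answered incorrectly under attack. The function names are hypothetical.

```python
from typing import Sequence


def accuracy(correct: Sequence[bool]) -> float:
    """Fraction of questions answered correctly."""
    if not correct:
        raise ValueError("no results to evaluate")
    return sum(correct) / len(correct)


def attack_success_rate(baseline_correct: Sequence[bool],
                        attacked_correct: Sequence[bool]) -> float:
    """ASR under an ASSUMED definition (not confirmed by the repository):
    among questions the model answers correctly without attack, the share
    it answers incorrectly once the attack is applied.
    """
    if len(baseline_correct) != len(attacked_correct):
        raise ValueError("baseline and attacked runs must cover the same questions")
    # Keep only the attacked outcomes for questions that were correct at baseline.
    attackable = [a for b, a in zip(baseline_correct, attacked_correct) if b]
    if not attackable:
        return 0.0  # nothing to flip, so no attack can succeed
    return sum(1 for a in attackable if not a) / len(attackable)


# Toy per-question outcomes, made up purely for illustration.
baseline = [True, True, False, True]
attacked = [True, False, False, False]
baseline_acc = accuracy(baseline)                   # 3 of 4 correct
attacked_acc = accuracy(attacked)                   # 1 of 4 correct
asr = attack_success_rate(baseline, attacked)       # 2 of 3 flipped
```

Under this definition, questions the model already gets wrong at baseline are excluded from the ASR denominator, which is why the baseline run in Step 1 is needed before Step 2.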
