This project analyzes Python code refactoring using ChatGPT, focusing on Pythonic idioms such as list comprehensions. It addresses three research questions (RQ1–RQ3) related to refactoring consistency, feature differences, and reasoning references.
- Clone this repository.
- Install dependencies:
pip install -r requirements.txt
- Create a `.env` file in the project root with your OpenAI API key:
OPENAI_API_KEY=[your api key]
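As a rough illustration of how a script could pick up the key from `.env`, here is a minimal standard-library sketch. The helper name `load_env_file` is hypothetical and is not taken from this repository; the project scripts may load the key differently (for example through a dotenv package).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE pairs from a .env file into os.environ (hypothetical helper)."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        # Variables already set in the shell take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded
```

After calling `load_env_file()`, the key would be available as `os.environ["OPENAI_API_KEY"]`.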
The dataset is located at:
csv_files/code_review_total_code_900.csv
Some files are already stored in:
downloaded_files/
To fetch additional files:
- Open `fetch_files.py`
- Adjust the configuration (currently set to fetch only list comprehension idiom files).
Run:
python fetch_files.py
To generate ChatGPT refactorings:
python inference.py
You can modify:
- Number of iterations
- Selection criteria (e.g., files with < 10k characters)
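A selection criterion such as the character limit could look roughly like the sketch below. The function name `select_small_files` and the constant `MAX_CHARS` are assumptions made for illustration; the actual logic in `inference.py` may differ.

```python
from pathlib import Path

# Assumed threshold, mirroring the "< 10k characters" example above.
MAX_CHARS = 10_000


def select_small_files(directory, max_chars=MAX_CHARS):
    """Return sorted paths of .py files under `directory` shorter than max_chars."""
    selected = []
    for path in sorted(Path(directory).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if len(text) < max_chars:
            selected.append(path)
    return selected
```

For example, `select_small_files("downloaded_files")` would return the candidate files under that folder that fall below the limit.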
After running `inference.py`, extract all code feature metrics by running:
python analyze_python_files_ast.py
This will create:
csv_files/base_code_summary.csv
csv_files/selected_base_code_summary.csv
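To give an idea of what AST-based feature extraction involves, the sketch below counts a few node types with Python's built-in `ast` module. The set of features and the name `count_features` are assumptions; the metrics actually computed by `analyze_python_files_ast.py` are not listed here and may be broader.

```python
import ast
from collections import Counter

# Node types counted in this sketch (an assumed, illustrative selection).
FEATURE_NODES = {
    ast.ListComp: "list_comprehensions",
    ast.DictComp: "dict_comprehensions",
    ast.SetComp: "set_comprehensions",
    ast.GeneratorExp: "generator_expressions",
    ast.For: "for_loops",
}


def count_features(source):
    """Parse Python source and count occurrences of each tracked node type."""
    counts = Counter({name: 0 for name in FEATURE_NODES.values()})
    for node in ast.walk(ast.parse(source)):
        name = FEATURE_NODES.get(type(node))
        if name is not None:
            counts[name] += 1
    return dict(counts)
```

Note that the `for` clause inside a comprehension is an `ast.comprehension` node, not `ast.For`, so it is not counted as a loop.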
For RQ1 (refactoring consistency), run:
python compare_iterations.py
Output:
csv_files/compare_iterations_list_comps.csv
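One simple way to think about consistency across iterations is to check whether each file yields the same list-comprehension count in every run. The sketch below is hypothetical: the function name, the input shape, and the output columns are assumptions and do not describe the actual contents of the CSV.

```python
def summarize_consistency(counts_by_file):
    """Summarize per-file agreement across iterations.

    `counts_by_file` maps a file name to a non-empty list of
    list-comprehension counts, one entry per iteration.
    """
    rows = []
    for file_name, counts in sorted(counts_by_file.items()):
        distinct = len(set(counts))
        rows.append({
            "file": file_name,
            "min": min(counts),
            "max": max(counts),
            "distinct_values": distinct,
            # A file is consistent when every iteration gives the same count.
            "consistent": distinct == 1,
        })
    return rows
```

A file whose counts are identical in all iterations is marked consistent; any spread between `min` and `max` signals that the refactorings diverged.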
For RQ2 (feature differences), run:
python compare_features.py
Outputs:
csv_files/selected_comparison.csv
csv_files/selected_comparison_p_values.csv
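This README does not state which statistical test produces the p-values. Purely as an illustration of how a p-value for a paired before/after feature comparison can be obtained, the sketch below uses an exact sign-flip permutation test, a technique swapped in here as an assumption. It enumerates all sign patterns, so it is only practical for small samples.

```python
from itertools import product


def paired_sign_flip_p_value(before, after):
    """Exact two-sided sign-flip permutation test on paired differences."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    total = 0
    extreme = 0
    # Under the null hypothesis, each difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total
```

Here `before` and `after` would hold one feature's value per file for the original and refactored code, respectively.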
- For RQ3 (reasoning references), the raw reasoning file is:
result/[file_name]/reasoning.txt
- For better formatting:
python convert_readme.py
This converts `reasoning.txt` into a Markdown-formatted file.
`add_original.py` adds the original source file alongside its refactored counterpart in the `result` folder for side-by-side comparison.
csv_files/ # CSV datasets and results
downloaded_files/ # Source files fetched from repositories
junk/ # Temporary files
plots/ # Visualizations
result/ # AI-generated outputs and reasoning
requirements.txt # Dependencies
fetch_files.py # Fetches source files
inference.py # Runs refactoring inference
analyze_python_files_ast.py # Extracts code metrics
compare_iterations.py # RQ1 analysis
compare_features.py # RQ2 analysis
convert_readme.py # Formats reasoning output
add_original.py # Adds original code to results
"# Idiomatic-Refactoring"