Python Code Refactoring Analysis with ChatGPT

This project analyzes how ChatGPT refactors Python code, focusing on Pythonic idioms such as list comprehensions. It addresses three research questions (RQ1–RQ3): how consistently refactorings involve list comprehensions across iterations, how code features differ between original and refactored code, and whether the model's reasoning references specific code elements.

📦 Installation

Clone this repository.
Install dependencies:

pip install -r requirements.txt

Create a .env file in the project root with your OpenAI API key:

OPENAI_API_KEY=[your api key]
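As a rough illustration of how a script could pick up this key, here is a minimal standard-library sketch. The repository's scripts may use a different loader (for example a dotenv package); `load_env_file` is a hypothetical helper, not a function from this project.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Hypothetical helper for illustration; returns the parsed pairs.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        loaded[key] = value
        # Keep any value that is already set in the environment.
        os.environ.setdefault(key, value)
    return loaded
```

After calling `load_env_file()`, the key would be available as `os.environ["OPENAI_API_KEY"]`.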

📂 Dataset

The dataset is located at:

csv_files/code_review_total_code_900.csv

📥 Fetching Files

Some files are already stored in:

downloaded_files/

To fetch additional files:

Open fetch_files.py
Adjust the configuration (currently set to fetch only list comprehension idiom files).

Run:

python fetch_files.py

🤖 Running Inference with ChatGPT

To generate ChatGPT refactorings:

python inference.py

You can modify:

Number of iterations
Selection criteria (e.g., only files with fewer than 10,000 characters)
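A selection rule like the character limit above could be sketched as follows. This is an assumption about how such a filter might look, not the code in inference.py; `select_small_files` and its parameters are hypothetical names.

```python
from pathlib import Path


def select_small_files(folder, max_chars=10_000):
    """Return sorted paths of .py files in `folder` with fewer than `max_chars` characters."""
    selected = []
    for path in sorted(Path(folder).glob("*.py")):
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not valid UTF-8
        if len(text) < max_chars:
            selected.append(path)
    return selected
```

For example, `select_small_files("downloaded_files")` would list the candidate files under the default limit.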

After running inference.py, extract all code feature metrics by running:

python analyze_python_files_ast.py

This will create:

csv_files/base_code_summary.csv
csv_files/selected_base_code_summary.csv
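To give a sense of what AST-based metric extraction can look like, here is a minimal sketch using Python's `ast` module. The feature set and the `count_features` name are assumptions chosen for illustration; the actual metrics computed by analyze_python_files_ast.py are not listed in this README.

```python
import ast

# Hypothetical feature set: metric name -> AST node type to count.
NODE_FEATURES = {
    "list_comps": ast.ListComp,
    "for_loops": ast.For,
    "functions": ast.FunctionDef,
    "lambdas": ast.Lambda,
}


def count_features(source):
    """Count occurrences of each node type in NODE_FEATURES within `source`."""
    tree = ast.parse(source)
    counts = {name: 0 for name in NODE_FEATURES}
    for node in ast.walk(tree):
        for name, node_type in NODE_FEATURES.items():
            if isinstance(node, node_type):
                counts[name] += 1
    return counts
```

Each file's counts could then be written as one row of a summary CSV.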

📊 Research Questions

RQ1: How many refactorings involve list comprehensions?

Run:

python compare_iterations.py

Output:

csv_files/compare_iterations_list_comps.csv
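The kind of comparison behind RQ1 might be sketched like this: count list comprehensions in the original file and in each refactoring iteration, then check how often the count rises and whether iterations agree. The function names and the returned fields are hypothetical and do not describe the actual columns of the output CSV.

```python
import ast


def count_list_comps(source):
    """Number of list comprehensions in a piece of Python source."""
    return sum(isinstance(n, ast.ListComp) for n in ast.walk(ast.parse(source)))


def compare_iterations(original, iterations):
    """Compare list-comprehension counts between the original and each iteration."""
    base = count_list_comps(original)
    counts = [count_list_comps(src) for src in iterations]
    return {
        "original": base,
        "per_iteration": counts,
        # Iterations whose refactoring introduced additional list comprehensions.
        "iterations_adding_list_comps": sum(c > base for c in counts),
        # True when every iteration produced the same count.
        "consistent": len(set(counts)) <= 1,
    }
```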

RQ2: What are the feature differences between original and AI-refactored code?

Run:

python compare_features.py

Outputs:

csv_files/selected_comparison.csv
csv_files/selected_comparison_p_values.csv
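This README does not say which statistical test produces the p-values. As one possible illustration only, the sketch below runs an exact paired sign-flip permutation test on a single feature measured before and after refactoring. `paired_permutation_p_value` is a hypothetical helper, and because it enumerates all 2^n sign patterns it is only practical for small samples.

```python
from itertools import product


def paired_permutation_p_value(before, after):
    """Two-sided exact sign-flip test on the sum of paired differences."""
    if len(before) != len(after) or not before:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis each difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total
```

The inputs would be per-file values of one feature, for example list-comprehension counts in the original and refactored versions.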

RQ3: Does AI-generated reasoning reference specific code elements?

The raw reasoning file is:

result/[file_name]/reasoning.txt

For better formatting:

python convert_readme.py

This converts reasoning.txt into a Markdown-formatted file.
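One simple way to probe RQ3 automatically is to check which identifiers from the source code also appear as words in the reasoning text. The sketch below is an assumption about how such a check could work, not the project's method; `referenced_identifiers` is a hypothetical name, and very short identifiers can collide with ordinary English words.

```python
import ast
import re


def referenced_identifiers(source, reasoning):
    """Return the sorted identifiers from `source` that occur as whole words in `reasoning`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    # Tokenize the reasoning into identifier-like words for whole-word matching.
    words = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", reasoning))
    return sorted(names & words)
```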

📜 Additional Utilities

add_original.py
Adds the original source file alongside its refactored counterpart in the result folder for side-by-side comparison.
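The copy step described above could look roughly like the following sketch. The target layout follows the result/[file_name]/ pattern mentioned earlier, but whether [file_name] includes the extension, and the name given to the copy (`original.py` here), are assumptions; `add_original` as a function is hypothetical.

```python
import shutil
from pathlib import Path


def add_original(source_file, result_root="result", copy_name="original.py"):
    """Copy `source_file` into result_root/<file stem>/ and return the new path."""
    source_file = Path(source_file)
    target_dir = Path(result_root) / source_file.stem
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / copy_name
    shutil.copy2(source_file, target)  # copy2 also preserves file metadata
    return target
```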

📁 Project Structure

csv_files/                 # CSV datasets and results
downloaded_files/          # Source files fetched from repositories
junk/                      # Temporary files
plots/                     # Visualizations
result/                    # AI-generated outputs and reasoning
requirements.txt           # Dependencies
fetch_files.py             # Fetches source files
inference.py               # Runs refactoring inference
analyze_python_files_ast.py  # Extracts code metrics
compare_iterations.py      # RQ1 analysis
compare_features.py        # RQ2 analysis
convert_readme.py          # Formats reasoning output
add_original.py            # Adds original code to results

EarnGH/Idiomatic-Refactoring