Code for "Complexity-aware fine-tuning" paper

General-purpose Large Language Models (LLMs) are frequently fine-tuned to improve performance in niche domains. Although fine-tuning is standard practice, we still lack a deep understanding of how to aggregate data for better results. In this work, we show that entropy-based output estimation provides a meaningful guideline for fine-tuning data preparation. Specifically, across two small open models (~3B parameters), we find that the entropy of a single-token answer achieves a ROC AUC score of ~0.73 and allows us to split the training data into three complexity categories. Moreover, we discover that these categories require different tuning mechanisms. Leveraging these insights, we propose a novel blueprint for efficient fine-tuning that outperforms the standard approach (TODO vs TODO accuracy). We also provide an in-depth analysis of alternative complexity estimation techniques based on expert assessment via model-as-judge (MASJ), entropy aggregation, and reasoning metadata, with ROC AUC scores of 0.57, TODO and TODO, respectively. Our findings enable immediate improvements in fine-tuning performance. In addition, we pave the way for further investigation and deeper exploration of numerical complexity analysis.
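To make the idea of single-token answer entropy concrete, here is a minimal sketch, not the code used in the paper. It assumes the model's probabilities over the answer-option tokens (for example A to D in MMLU) are already available, and that the ROC AUC measures how well entropy separates incorrectly answered questions from correctly answered ones. The function names, the category names ("easy", "medium", "hard"), and the thresholds `low` and `high` are all hypothetical and chosen only for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a distribution over answer tokens."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def complexity_category(entropy, low=0.5, high=1.0):
    """Map an entropy value to one of three categories.
    The thresholds are hypothetical, not values from the paper."""
    if entropy < low:
        return "easy"
    if entropy < high:
        return "medium"
    return "hard"


if __name__ == "__main__":
    # Hypothetical answer-token distributions for three questions.
    examples = {
        "confident": [0.97, 0.01, 0.01, 0.01],
        "hesitant": [0.70, 0.20, 0.05, 0.05],
        "uniform": [0.25, 0.25, 0.25, 0.25],
    }
    for name, probs in examples.items():
        h = token_entropy(probs)
        print(f"{name}: entropy={h:.3f} -> {complexity_category(h)}")

    # label 1 = model answered incorrectly, 0 = answered correctly
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

In practice the thresholds would be fitted to the entropy distribution of a specific model and dataset rather than fixed in advance.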

Prerequisites

uv (Python package and project manager)

Data

Download CoT entropy data for MMLU to data/out/cot_entropy
Download reasoning data for MMLU to data/out/reasoning_entropy

Running experiments

uv run src/experiments/REPLACE_ME.py

Cite

TODO

LabARSS/complexity-aware-fine-tuning
