
kappaTune

A PyTorch-based optimizer wrapper for continual learning via selective fine-tuning, guided by the condition number ($\kappa$) of model tensors. KappaTune identifies and updates only the least anisotropic parameters to preserve pre-trained knowledge and mitigate catastrophic forgetting.

Please cite the following paper if you use this code or ideas derived from it in your publications: https://arxiv.org/html/2506.16289v1


Table of Contents

  • Introduction
  • Features
  • kappaTune vs. LoRA
  • Installation
  • Usage

Introduction

kappaTune is designed to address catastrophic forgetting in continual learning scenarios. By analyzing the condition numbers of a neural network's weight matrices, it selects a subset of parameters to fine-tune. Only the tensors with the smallest condition numbers are updated, for two complementary reasons: their numerical stability makes them less susceptible to training noise, and their less specialized (more isotropic) structure allows robust adaptation without overwriting critical, highly specific pre-training knowledge. As shown in the paper, this effectively mitigates catastrophic forgetting of foundational capabilities.
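
To make the selection criterion concrete, the sketch below computes the condition number κ = σ_max / σ_min of each 2-D weight matrix in a model and ranks the tensors from least to most anisotropic. This is a minimal illustration of the idea, not kappaTune's actual implementation; the helper name rank_by_condition_number and the use of torch.linalg.svdvals are assumptions made for this example.

    import torch

    def rank_by_condition_number(model: torch.nn.Module):
        """Rank 2-D weight tensors by condition number kappa = sigma_max / sigma_min.

        Illustrative sketch only; kappaTune's own selection logic may differ.
        """
        scores = []
        for name, param in model.named_parameters():
            if param.ndim != 2:  # only matrices are considered in this sketch
                continue
            # Singular values in descending order, computed without tracking gradients.
            with torch.no_grad():
                sigma = torch.linalg.svdvals(param.float())
            kappa = (sigma[0] / sigma[-1].clamp_min(1e-12)).item()
            scores.append((name, kappa))
        # Smallest kappa first: the least anisotropic ("roundest") tensors rank highest.
        return sorted(scores, key=lambda item: item[1])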

Features

  • Condition Number Guided Selection: Ranks model parameters based on their condition numbers, prioritizing those that are less anisotropic (more "round" in their singular value distribution).
  • Selective Fine-Tuning: Integrates with any standard PyTorch optimizer, ensuring only the selected parameters are updated (see the sketch after this list).
  • Efficient Analysis: Caches condition numbers to avoid redundant computations across multiple runs or experiments.
  • Flexible Filtering: Allows skipping parameters based on their number of dimensions or maximum dimension size, providing fine-grained control over which tensors are considered for analysis.
  • Catastrophic Forgetting Mitigation: By selectively updating parameters, kappaTune helps preserve pre-trained knowledge, making it suitable for continual learning and domain adaptation tasks.
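
As a rough illustration of how selective fine-tuning can plug into a standard PyTorch optimizer, the sketch below freezes every parameter except the k tensors with the smallest condition numbers and hands only those to AdamW. The function name build_selective_optimizer, the top_k parameter, and the size filter are illustrative assumptions, not kappaTune's actual API.

    import torch

    def build_selective_optimizer(model, ranked, top_k=8, max_dim=None, lr=1e-5):
        # 'ranked' is the (name, kappa) list from rank_by_condition_number, smallest kappa first.
        params = dict(model.named_parameters())
        selected = []
        for name, _kappa in ranked:
            if max_dim is not None and max(params[name].shape) > max_dim:
                continue  # optional size filter, mirroring the "Flexible Filtering" feature
            selected.append(name)
            if len(selected) == top_k:
                break

        # Freeze everything, then re-enable gradients only for the selected tensors.
        selected_set = set(selected)
        for name, param in model.named_parameters():
            param.requires_grad = name in selected_set

        trainable = [p for p in model.parameters() if p.requires_grad]
        # Any standard PyTorch optimizer works; AdamW is used here only as an example.
        return torch.optim.AdamW(trainable, lr=lr), selected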

kappaTune vs. LoRA

While LoRA is highly effective at reducing training cost through parameter-efficient fine-tuning, it does not inherently include a strategy to prevent catastrophic forgetting. kappaTune, in contrast, is purpose-built for continual learning: it retains prior knowledge better and, as a side effect, also reduces computational effort by updating only a small subset of model tensors.

Installation

Prerequisites

  • Python 3.8+
  • pip package manager

Dependencies

You can install the required libraries using pip:

pip install torch transformers datasets numpy

Usage

complete_example_use_selective_fine_tuning.py demonstrates how to use kappaTune to fine-tune a TinyLlama-1.1B model on a text classification dataset (ag_news), selectively updating parameters based on their condition numbers. Note that although ag_news is a classification dataset, the example performs a language-modeling (next-token prediction) task purely to illustrate LLM adaptation.
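
For orientation only, the snippet below sketches how the two illustrative helpers defined earlier could drive a single selective language-modeling step on TinyLlama. It is not a substitute for the bundled example script; the checkpoint identifier is assumed, and the helper names remain illustrative rather than part of kappaTune's API.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from datasets import load_dataset

    model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    ranked = rank_by_condition_number(model)              # sketch from the Introduction
    optimizer, selected = build_selective_optimizer(model, ranked, top_k=8)

    text = load_dataset("ag_news", split="train[:1]")[0]["text"]
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

    # Language-modeling objective: labels are the input ids (next-token prediction).
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()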
