This repository contains the official implementation of DiReC (Disentangled Contrastive Representation), a two-stage framework for tutor identity classification. This work was submitted to the BEA 2025 Shared Task 5: Tutor Identity Classification, where it achieved 3rd place with a macro-F1 score of 0.9172.
The goal of the task is to classify responses from nine different tutors: seven Large Language Models (LLMs) and two human tutors (novice and expert). Our approach, DiReC, leverages disentangled representation learning to separate the semantic content of a response from its stylistic features, which is crucial for identifying the authoring tutor.
DiReC uses a `microsoft/deberta-v3-large` encoder as its backbone. The `[CLS]` token embedding from the encoder is passed through two separate projection heads to create disentangled content and style embeddings. These two embeddings are then concatenated and fed into a linear classifier to predict the tutor's identity.

The core idea is that the content embedding captures what is being said (semantics, facts), while the style embedding captures how it is being said (tone, verbosity, lexical choice), which is a strong signal for tutor identity.
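For concreteness, here is a minimal PyTorch sketch of this architecture. The projection dimension, module names, and head structure are illustrative assumptions, not the exact implementation in `DiReC.py`.

```python
# Illustrative sketch of the DiReC architecture (dimensions and module names are assumptions).
import torch
import torch.nn as nn
from transformers import AutoModel

class DiReCModel(nn.Module):
    def __init__(self, model_name="microsoft/deberta-v3-large", proj_dim=256, num_tutors=9):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size  # 1024 for deberta-v3-large
        # Two separate projection heads on top of the [CLS] embedding.
        self.content_head = nn.Sequential(nn.Linear(hidden, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
        self.style_head = nn.Sequential(nn.Linear(hidden, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
        # The classifier consumes the concatenated content + style embeddings.
        self.classifier = nn.Linear(2 * proj_dim, num_tutors)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]      # [CLS] token embedding
        content = self.content_head(cls)       # "what is said"
        style = self.style_head(cls)           # "how it is said"
        logits = self.classifier(torch.cat([content, style], dim=-1))
        return logits, content, style
```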
To effectively disentangle the representations, we employ a two-stage training strategy:
Stage 1: Content-Focused Pre-training
- The style projection head is frozen.
- The model (encoder, content head, classifier) is trained using only Cross-Entropy (CE) Loss.
- This stage forces the model to learn robust, discriminative features based purely on the semantic content of the responses.
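A minimal sketch of how the Stage 1 freeze could look, assuming the hypothetical `DiReCModel` class from the sketch above:

```python
# Stage 1 sketch: freeze the style head; train encoder + content head + classifier with CE only.
import torch
import torch.nn as nn

model = DiReCModel()  # hypothetical class from the architecture sketch above
for p in model.style_head.parameters():
    p.requires_grad = False

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5  # learning rate is a placeholder
)
# In the training loop: logits, _, _ = model(input_ids, attention_mask); loss = criterion(logits, labels)
```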
Stage 2: Joint Disentangled Training
- The style projection head is unfrozen.
- The model is trained jointly with a combined loss function:
  - Cross-Entropy Loss ($\mathcal{L}_{CE}$): Continues to guide the primary classification task.
  - Supervised Contrastive Loss ($\mathcal{L}_{SupCon}$): Applied to the style embeddings. This encourages responses from the same tutor to have similar style representations, pulling them closer in the embedding space.
  - Disentanglement Loss ($\mathcal{L}_{dis}$): A cosine-based loss that penalizes similarity between the content and style embeddings, enforcing their orthogonality.
The final loss function in Stage 2 is:

$$\mathcal{L} = \lambda_{CE}\,\mathcal{L}_{CE} + \lambda_{sty}\,\mathcal{L}_{SupCon} + \lambda_{dis}\,\mathcal{L}_{dis}$$
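A hedged sketch of how the Stage 2 objective could be computed. The SupCon formulation, the exact form of the cosine penalty, and the loss weights below are assumptions, not the paper's exact settings.

```python
# Stage 2 loss sketch (weights and the SupCon variant are illustrative assumptions).
import torch
import torch.nn.functional as F

def supcon_loss(style, labels, temperature=0.07):
    """Supervised contrastive loss over L2-normalized style embeddings (Khosla et al., 2020)."""
    z = F.normalize(style, dim=-1)
    sim = z @ z.T / temperature                               # pairwise similarities
    mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    mask.fill_diagonal_(0)                                    # positives exclude the anchor itself
    logits = sim - torch.eye(len(z), device=z.device) * 1e9   # remove self-similarity from the denominator
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_per_anchor = mask.sum(1).clamp(min=1)
    return -((mask * log_prob).sum(1) / pos_per_anchor).mean()

def disentanglement_loss(content, style):
    """Cosine-based penalty pushing content and style embeddings toward orthogonality."""
    return F.cosine_similarity(content, style, dim=-1).abs().mean()

def stage2_loss(logits, labels, content, style, w_ce=1.0, w_sty=0.5, w_dis=0.1):
    ce = F.cross_entropy(logits, labels)
    return w_ce * ce + w_sty * supcon_loss(style, labels) + w_dis * disentanglement_loss(content, style)
```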
Our two-stage DiReC model achieved a macro-F1 score of 0.9042 on the validation set. By replacing the final linear classifier with a CatBoost classifier trained on the learned content and style embeddings, performance improved to 0.9101.
For the final submission, we applied the Hungarian algorithm as a post-processing step to ensure unique tutor assignments within each conversation, yielding a final leaderboard macro-F1 of 0.9172.
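For reference, a minimal sketch of such a Hungarian-algorithm step using `scipy.optimize.linear_sum_assignment`; the `probs` matrix and variable names are illustrative:

```python
# Hungarian post-processing sketch: enforce unique tutor assignments within one conversation.
# `probs` is assumed to be an (n_responses x n_tutors) matrix of predicted probabilities.
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_unique_tutors(probs: np.ndarray) -> np.ndarray:
    # Maximizing total probability == minimizing the negated probabilities (cost matrix).
    row_idx, col_idx = linear_sum_assignment(-probs)
    preds = np.empty(len(probs), dtype=int)
    preds[row_idx] = col_idx
    return preds

# Toy example: 3 responses, 3 candidate tutors
probs = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.1, 0.8]])
print(assign_unique_tutors(probs))  # -> [0 1 2]
```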



- Clone the repository:

  ```bash
  git clone https://github.com/your-username/DiReC.git
  cd DiReC
  ```

- Create a Python virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables: create a `.env` file in the root directory and add your API keys:

  ```
  WANDDB_API_KEY="your_weights_and_biases_api_key"
  WANDDB_EXPERIMENT_NAME="DiReC-Tutor-Classification"
  HF_TOKEN="your_hugging_face_api_token"
  ```

- Download the dataset: place the `cleaned_mrbench_devset.csv` and `cleaned_mrbench_testset.csv` files into a `dataset/` directory in the project root.
Follow this two-step process to train and evaluate the full model.
Run the first script to train the base transformer model. This will save the model weights in the `models/` directory.

```bash
python DiReC.py
```
The script will:
- Load and preprocess the data.
- Perform the two-stage training procedure.
- Log training progress, metrics, and visualizations to Weights & Biases.
- Evaluate the final model on the validation set.
- Save the best-performing model's state dictionary to the `models/` directory.
After the DiReC encoder is trained, run the second script:

```bash
python DiReC_Catboost.py
```

This script will:
- Load the trained DiReC model from `models/direc_model_final.pth`.
- Extract content and style embeddings for the train and validation sets.
- Train a CatBoost classifier on the extracted embeddings.
- Evaluate the final classifier and save it to `models/catboost_on_direc_embeddings.cbm`.
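A rough sketch of what this stage amounts to; the embedding file paths and CatBoost parameters are illustrative assumptions (only the saved model path matches the layout described above):

```python
# Sketch of the CatBoost stage on precomputed DiReC embeddings (file paths are hypothetical).
import numpy as np
from catboost import CatBoostClassifier

# X_* are concatenated [content, style] embeddings extracted with the trained DiReC encoder.
X_train, y_train = np.load("embeddings/train_X.npy"), np.load("embeddings/train_y.npy")
X_val, y_val = np.load("embeddings/val_X.npy"), np.load("embeddings/val_y.npy")

clf = CatBoostClassifier(loss_function="MultiClass", iterations=1000, verbose=100)
clf.fit(X_train, y_train, eval_set=(X_val, y_val))
clf.save_model("models/catboost_on_direc_embeddings.cbm")
```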
You can adjust hyperparameters such as `BATCH_SIZE`, `LR`, `num_epochs`, and the loss weights at the top of the `DiReC.py` script.
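For illustration, such a configuration block might look like the following; the values shown are placeholders, not the settings used for the reported results.

```python
# Hypothetical configuration block at the top of DiReC.py (values are placeholders).
BATCH_SIZE = 16
LR = 1e-5
num_epochs = 5       # per training stage
LAMBDA_CE = 1.0      # weight on the cross-entropy loss
LAMBDA_STY = 0.5     # weight on the supervised contrastive (style) loss
LAMBDA_DIS = 0.1     # weight on the disentanglement loss
```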
If you find this work useful, please consider citing the original paper:
```bibtex
@inproceedings{tjitrahardja-hanif-2025-two,
    title = "Two Outliers at {BEA} 2025 Shared Task: Tutor Identity Classification using {DiReC}, a Two-Stage Disentangled Contrastive Representation",
    author = "Tjitrahardja, Eduardus and Hanif, Ikhlasul Akmal",
    booktitle = "Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications",
    month = jun,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
}
```