This GitHub repo is the 1st place solution for UCSD-CSE291H-Kaggle-Restaurant-Type-Prediction (FA24)
Author: Zhecheng Li && Professor: Jingbo Shang
```bash
pip install -r requirements.txt
```

```bash
export KAGGLE_USERNAME="your_kaggle_username"
export KAGGLE_KEY="your_api_key"
export HF_TOKEN="your_hf_token"
```

```bash
# download the competition data (run from the repo root)
sudo apt install unzip
cd dataset
kaggle datasets download -d lizhecheng/cse-291h-kaggle-competition-data
unzip cse-291h-kaggle-competition-data.zip
```

```bash
# download the pre-trained LoRA adapters (run from the repo root)
cd lora/adapters
kaggle datasets download -d lizhecheng/cse291h-competition-lora-adapters
unzip cse291h-competition-lora-adapters.zip
```

```bash
# download the LoRA adapters fine-tuned on both name and review (run from the repo root)
cd lora/adapters
kaggle datasets download -d lizhecheng/cse291h-competition-lora-adapters-name-and-review
unzip cse291h-competition-lora-adapters-name-and-review.zip
```
If you want to use LLMs directly for inference, the code is under the `llm` directory. I provide code for Few-Shot Chain-of-Thought (CoT) inference using OpenAI models. (You can also use API models from other platforms such as Groq or LlamaAPI.)
- First, set your API key in the `config.yaml` file.
- Second, change the prompt or the number of few-shot examples in `main.py` if needed.
- Run `python main.py`.
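
Below is a minimal sketch of what a few-shot CoT classification call can look like. It is not the repo's exact `main.py`: the real prompt, shot count, and API key are driven by `config.yaml`, and the example review, reasoning, and label here are made up for illustration.

```python
# Hedged sketch of few-shot CoT inference with the OpenAI client; not the repo's exact code.
from openai import OpenAI

client = OpenAI(api_key="your_api_key")  # in the repo this comes from config.yaml

# Hypothetical few-shot examples: (review, reasoning, label) triples.
FEW_SHOT_EXAMPLES = [
    ("Great carnitas tacos and fresh salsa.",
     "Tacos and salsa point to Mexican cuisine.",
     "Mexican"),
]

def build_prompt(review: str) -> str:
    shots = "\n\n".join(
        f"Review: {r}\nReasoning: {c}\nType: {t}" for r, c, t in FEW_SHOT_EXAMPLES
    )
    return (
        "Classify the restaurant type from the review. "
        "Think step by step, then give the type on the last line.\n\n"
        f"{shots}\n\nReview: {review}\nReasoning:"
    )

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": build_prompt("Amazing hand-pulled noodles and dumplings.")}],
    temperature=0.0,
)
print(response.choices[0].message.content)
```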
If you want to fine-tune LLMs for the classification task, the code is under the `lora` directory. I provide code for QLoRA fine-tuning to make LLMs work for classification.
- First, change the parameters in the `run.sh` file if needed.
- Second, run `chmod +x ./run.sh` on the GPU server if the script does not have execute permission.
- Run `./run.sh` to execute the code.
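
For orientation, here is a rough sketch of the QLoRA setup (4-bit quantized base model plus LoRA adapters) using `transformers` and `peft`. The real hyperparameters and training loop are configured through `run.sh` and the code under `lora`; the `target_modules` and `NUM_CLASSES` below are assumptions.

```python
# Sketch of QLoRA for sequence classification; assumptions are marked in the comments.
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

NUM_CLASSES = 10  # assumption: set to the number of restaurant types in the dataset

# Load the base model in 4-bit (NF4) so the 3B model fits comfortably on a 40GB A100.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    num_labels=NUM_CLASSES,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters with the r/alpha/dropout values listed later in this README.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="SEQ_CLS",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```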
I use the `meta-llama/Llama-3.2-3B` model with 10-fold cross-validation to train 10 models, then use majority voting to ensemble the 10 submission files. Each model was trained on a single 40GB A100 GPU for about 5 hours. During fine-tuning, you can use only the `review` column or both the `review` and `name` columns. If you want to use both columns, keep line 69 in `src.py`; otherwise, comment it out.
Here is the macro_f1 curve for the 10 models on the validation set (only the `review` column used for fine-tuning):
Here is the macro_f1 curve for the 10 models on the validation set (both the `name` and `review` columns used for fine-tuning):
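
A minimal sketch of the majority-vote ensembling over the 10 fold submissions is shown below; the file and column names (`submission_fold{i}.csv`, `id`, `label`) are assumptions rather than the competition's exact schema.

```python
# Majority-vote ensemble over 10 per-fold submission files (names/columns are assumptions).
import pandas as pd

submissions = [pd.read_csv(f"submission_fold{i}.csv") for i in range(10)]

# Put the 10 prediction columns side by side, then take the most frequent label per row.
preds = pd.concat([s["label"] for s in submissions], axis=1)
majority = preds.apply(lambda row: row.value_counts().idxmax(), axis=1)

pd.DataFrame({"id": submissions[0]["id"], "label": majority}).to_csv(
    "submission_ensemble.csv", index=False
)
```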
Here are the parameters I used for training these ten models:
- `max_length=1536` (inference `max_length` can be `2048`)
- `lora_r=16`
- `lora_alpha=16`
- `lora_dropout=0.1`
- `warmup_ratio=0.1`
- `learning_rate=2.25e-4`
- `batch_size=1`
- `accumulation_steps=16`
- `weight_decay=0.001`
- `epochs=2`
- `lr_scheduler="cosine"`
If you want to fine-tune encoder-based models such as BERT or DeBERTa, the code is under the `encoder-models` directory. I provide code for fine-tuning any encoder-based model with the AWP (Adversarial Weight Perturbation) technique, which is used to enhance the robustness of the fine-tuned models.
- First, change the parameters in `config.py` if needed. (I provide implementations of various pooling layers to connect to the final classification layer.)
- If you want to fine-tune the model, run `python train.py`.
- If you want to evaluate the model on the test dataset, set your model path in either `normal_inference.py` or `trainer_inference.py`, then run `python normal_inference.py` or `python trainer_inference.py`. (Keep the parameters in `config.py` the same as during training.)
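
As a rough illustration of AWP, the sketch below perturbs the targeted weights in the direction of their gradients before an extra adversarial forward/backward pass and then restores them; the class name, parameter names, and default strengths are illustrative, not the exact settings in `config.py`.

```python
# Minimal AWP (Adversarial Weight Perturbation) sketch; values and names are illustrative.
import torch

class AWP:
    def __init__(self, model, adv_lr=1e-3, adv_eps=1e-2, target="weight"):
        self.model = model
        self.adv_lr = adv_lr
        self.adv_eps = adv_eps
        self.target = target
        self.backup = {}

    def attack(self):
        # Perturb each targeted parameter in the direction of its gradient.
        for name, param in self.model.named_parameters():
            if param.requires_grad and param.grad is not None and self.target in name:
                self.backup[name] = param.data.clone()
                grad_norm = torch.norm(param.grad)
                if grad_norm != 0:
                    r_adv = self.adv_lr * param.grad / grad_norm * torch.norm(param.data)
                    param.data.add_(r_adv.clamp(-self.adv_eps, self.adv_eps))

    def restore(self):
        # Put the original weights back after the adversarial backward pass.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Usage inside one training step (after the normal loss.backward()):
#   awp = AWP(model)
#   awp.attack()
#   adv_loss = model(**batch).loss
#   adv_loss.backward()
#   awp.restore()
#   optimizer.step(); optimizer.zero_grad()
```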
- `gpt-4o-mini` 10-shot CoT (only review) -> LB 0.778
- `Llama3.2-3B` (only review) -> LB 0.845 (`max_length=2048`, a single fold can reach 0.848; majority-vote ensemble -> 0.851)
- `Llama3.2-3B` (both name and review) -> LB 0.856 (`max_length=2048`, a single fold can reach 0.860; majority-vote ensemble -> 0.864)
- Try `gpt-4o` or even `o1-preview` with n-shot CoT using both the restaurant name and the review.
- Fine-tune LLMs with a longer sequence length, such as 4096. (May require 80GB A100 GPUs.)
- Use full fine-tuning instead of QLoRA. (May require 80GB A100 GPUs.)