
This GitHub repo is the 1st-place solution for UCSD-CSE291H-Kaggle-Restaurant-Type-Prediction (FA24)

Author: Zhecheng Li & Professor: Jingbo Shang

Python Environment

1. Install Packages

pip install -r requirements.txt

Prepare Data

1. Set Kaggle API and Hugging Face Credentials

export KAGGLE_USERNAME="your_kaggle_username"
export KAGGLE_KEY="your_api_key"
export HF_TOKEN="your_hf_token"

2. Install unzip

sudo apt install unzip

3. Download Datasets

cd dataset
kaggle datasets download -d lizhecheng/cse-291h-kaggle-competition-data
unzip cse-291h-kaggle-competition-data.zip

4. Download LoRA Adapters (trained with review only)

cd lora/adapters
kaggle datasets download -d lizhecheng/cse291h-competition-lora-adapters
unzip cse291h-competition-lora-adapters.zip

5. Download LoRA Adapters (trained with both name and review)

cd lora/adapters
kaggle datasets download -d lizhecheng/cse291h-competition-lora-adapters-name-and-review
unzip cse291h-competition-lora-adapters-name-and-review.zip
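
Below is a minimal sketch of loading one of the downloaded LoRA adapters for inference, assuming the Hugging Face transformers + peft stack. The adapter directory name, the number of labels, and the assumption that the classification head is saved together with the adapter are placeholders, not the repository's exact layout.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.2-3B"
ADAPTER_DIR = "lora/adapters/fold_0"   # hypothetical path inside the unzipped archive
NUM_CLASSES = 10                       # hypothetical; set to the number of restaurant types

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForSequenceClassification.from_pretrained(
    BASE, num_labels=NUM_CLASSES, torch_dtype=torch.bfloat16
)
# Attach the LoRA weights; this assumes the classification head was saved with the adapter.
model = PeftModel.from_pretrained(base_model, ADAPTER_DIR)
model.eval()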

Run Codes

1. Direct LLM Inference

If you want to use LLMs directly for inference, the code is under the llm directory. I provide code for Few-Shot Chain-of-Thought (CoT) inference using OpenAI models. (You can also use API models from other platforms such as Groq or LlamaAPI.) A minimal sketch of this setup follows the list below.

  • First, set your API key in the config.yaml file.
  • Second, you can change the prompt or the number of few-shot examples in main.py.
  • Run python main.py.
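
The sketch below shows one way to run few-shot CoT inference with the OpenAI Python SDK; the few-shot examples, label format, and prompt wording are illustrative placeholders, not the exact contents of llm/main.py.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical few-shot examples; the real ones would be sampled from the training data.
FEW_SHOT = [
    ("The carne asada burrito was huge and the salsa bar was great.",
     "The review mentions burritos and a salsa bar, which are typical of Mexican food. -> Mexican"),
    ("Fresh nigiri and a long omakase menu, the chef was amazing.",
     "Nigiri and omakase point to Japanese cuisine. -> Japanese"),
]

def predict_type(review: str, model: str = "gpt-4o-mini") -> str:
    messages = [{"role": "system",
                 "content": "You classify restaurants by type from a customer review. "
                            "Think step by step, then give the final label after '->'."}]
    for text, answer in FEW_SHOT:
        messages.append({"role": "user", "content": f"Review: {text}"})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": f"Review: {review}"})
    resp = client.chat.completions.create(model=model, messages=messages, temperature=0)
    # The predicted label is whatever follows the final '->' in the model's reasoning.
    return resp.choices[0].message.content.split("->")[-1].strip()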

2. LLM Classification Fine-tuning

If you want to fine-tune LLMs for the classification task, the code is under the lora directory. I provide code for QLoRA fine-tuning so that LLMs can be used as classifiers. A minimal setup sketch follows the list below.

  • First, you can change the parameters in the run.sh file.
  • Second, run chmod +x ./run.sh on the GPU server if the script does not have execute permission.
  • Run ./run.sh to execute the code.
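
For reference, a minimal QLoRA classification setup with the transformers + peft + bitsandbytes stack is sketched below; the target modules, number of labels, and other details are assumptions and may differ from the repository's src.py.

import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

NUM_CLASSES = 10  # hypothetical; set to the number of restaurant types

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    num_labels=NUM_CLASSES,
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)
# Note: Llama has no pad token by default; set tokenizer.pad_token and
# model.config.pad_token_id before training a sequence classifier.

lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=16, lora_alpha=16, lora_dropout=0.1,  # matches the parameters listed in 2.1
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
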
2.1 Llama3.2-3B Fine-tuning

I use the meta-llama/Llama-3.2-3B model with 10-fold cross-validation to train 10 models, then ensemble the 10 submission files by majority vote (see the ensembling sketch below). Each model was trained on a single 40GB A100 GPU for about 5 hours. During fine-tuning, you can use only the 'review' column or both the 'review' and 'name' columns. If you want to use both columns, keep line 69 in src.py; otherwise, comment it out.
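
A minimal sketch of the majority-vote ensemble over the 10 fold submissions; the column names ("id", "label") and file paths are assumptions, not necessarily the competition's exact submission format.

import glob
import pandas as pd

subs = [pd.read_csv(path) for path in sorted(glob.glob("submissions/fold_*.csv"))]
labels = pd.concat(
    [s["label"].rename(f"fold_{i}") for i, s in enumerate(subs)], axis=1
)

ensemble = subs[0][["id"]].copy()
# mode(axis=1) picks the most frequent prediction across the 10 folds for each row;
# ties fall back to the first value in sorted order.
ensemble["label"] = labels.mode(axis=1)[0]
ensemble.to_csv("submission_majority_vote.csv", index=False)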

Here is the macro_f1 curve for the 10 models on the validation set (fine-tuned with only the review column):

[Figure: llama3.2-3B-curve]

Here is the macro_f1 curve for the 10 models on the validation set (fine-tuned with both the name and review columns):

[Figure: llama3.2-3B-curve-both]

Here are the parameters I used to train these ten models (a sketch of how they map onto Hugging Face TrainingArguments follows the list):

  • max_length = 1536 (inference max_length can be 2048)
  • lora_r = 16
  • lora_alpha = 16
  • lora_dropout = 0.1
  • warmup_ratio = 0.1
  • learning_rate = 2.25e-4
  • batch_size = 1
  • accumulation_steps = 16
  • weight_decay = 0.001
  • epochs = 2
  • lr_scheduler = "cosine"
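
One plausible mapping of these values onto Hugging Face TrainingArguments is sketched below; in the repository they are passed as command-line flags through run.sh, and the output path and precision flag here are assumptions. Note that max_length is applied on the tokenizer side rather than in TrainingArguments.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs/llama3.2-3b-fold0",  # hypothetical path
    num_train_epochs=2,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,          # effective batch size = 1 * 16
    learning_rate=2.25e-4,
    warmup_ratio=0.1,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    bf16=True,                               # assumption: bf16 training on A100
)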

3. Fine-tune Encoder Language Models

If you want to fine-tune encoder-based models such as BERT or DeBERTa, the code is under the encoder-models directory. I provide code for fine-tuning any encoder-based model with the AWP (Adversarial Weight Perturbation) technique, which enhances the robustness of the fine-tuned models. A small AWP sketch follows the list below.

  • First, you can change the parameters in config.py. (I provide several pooling-layer implementations to connect the encoder outputs with the final classification layer.)
  • If you want to fine-tune the model, run python train.py.
  • If you want to evaluate the model on the test dataset, set your model path in either normal_inference.py or trainer_inference.py, then run python normal_inference.py or python trainer_inference.py. (Keep the parameters in config.py the same as during training.)
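
A minimal AWP sketch in the style commonly used for Kaggle NLP competitions; the exact implementation in the encoder-models directory may differ in its hyperparameters and restore logic.

import torch

class AWP:
    def __init__(self, model, adv_lr=1e-4, adv_eps=1e-2, adv_param="weight"):
        self.model = model
        self.adv_lr = adv_lr
        self.adv_eps = adv_eps
        self.adv_param = adv_param
        self.backup = {}

    def attack(self):
        # Back up the current weights, then nudge them along the gradient direction
        # (scaled relative to the weight norm) to create an adversarial weight state.
        for name, param in self.model.named_parameters():
            if param.requires_grad and param.grad is not None and self.adv_param in name:
                self.backup[name] = param.data.clone()
                grad_norm = torch.norm(param.grad)
                data_norm = torch.norm(param.data)
                if grad_norm != 0 and not torch.isnan(grad_norm):
                    r_at = self.adv_lr * param.grad / (grad_norm + 1e-12) * (data_norm + 1e-12)
                    param.data.add_(r_at)
                    param.data.clamp_(self.backup[name] - self.adv_eps,
                                      self.backup[name] + self.adv_eps)

    def restore(self):
        # Put the original weights back after the adversarial backward pass.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Typical usage inside the training loop, after the normal backward pass:
#   loss.backward()
#   awp.attack()
#   adv_logits = model(**batch).logits
#   criterion(adv_logits, labels).backward()
#   awp.restore()
#   optimizer.step(); optimizer.zero_grad()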

Score Records

  • gpt-4o-mini 10-shot CoT (use only review) -> LB 0.778
  • Llama3.2-3B (use only review) -> LB 0.845 (max_length=2048, single fold can reach 0.848; majority vote ensemble -> 0.851)
  • Llama3.2-3B (use both name and review) -> LB 0.856 (max_length=2048, single fold can reach 0.860; majority vote ensemble -> 0.864)

Tips

  • Try gpt-4o or even o1-preview with n-shot CoT using both restaurant name and review.
  • Fine-tune LLMs with longer sequence length, such as 4096. (May require 80GB A100 GPUs)
  • Use full fine-tuning instead of QLoRA. (May require 80GB A100 GPUs)

Leaderboard

[Figure: lb]

[Figure: pb]

About

[2024Fall CSE291-DSC253 Advanced Data-Driven Text Mining] You are asked to design models to predict the restaurant type using the observed variables.
