This GitHub repo is the 1st place solution for UCSD-CSE291H-Kaggle-Restaurant-Type-Prediction (FA24)
Author: Zhecheng Li && Professor: Jingbo Shang
```bash
pip install -r requirements.txt
```

```bash
export KAGGLE_USERNAME="your_kaggle_username"
export KAGGLE_KEY="your_api_key"
export HF_TOKEN="your_hf_token"
```

```bash
# download the competition data (run from the repo root)
sudo apt install unzip
cd dataset
kaggle datasets download -d lizhecheng/cse-291h-kaggle-competition-data
unzip cse-291h-kaggle-competition-data.zip
```

```bash
# download the pre-trained LoRA adapters (run from the repo root)
cd lora/adapters
kaggle datasets download -d lizhecheng/cse291h-competition-lora-adapters
unzip cse291h-competition-lora-adapters.zip
```

```bash
# download the LoRA adapters fine-tuned on both name and review (run from the repo root)
cd lora/adapters
kaggle datasets download -d lizhecheng/cse291h-competition-lora-adapters-name-and-review
unzip cse291h-competition-lora-adapters-name-and-review.zip
```
If you want to use LLMs directly for inference, the code is under the `llm` directory. I provide code for Few-Shot Chain-of-Thought (CoT) inference using OpenAI models. (You can also use API models from other platforms such as Groq or LlamaAPI.)
- First, set your API key in the `config.yaml` file.
- Second, change the prompt or the number of few-shot examples in `main.py` if needed.
- Run `python main.py`.
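
Below is a minimal sketch of what a few-shot CoT classification call can look like. It is not the repo's exact `main.py`: the real prompt, shot count, and API key are driven by `config.yaml`, and the example review, reasoning, and label here are made up for illustration.

```python
# Hedged sketch of few-shot CoT inference with the OpenAI client; not the repo's exact code.
from openai import OpenAI

client = OpenAI(api_key="your_api_key")  # in the repo this comes from config.yaml

# Hypothetical few-shot examples: (review, reasoning, label) triples.
FEW_SHOT_EXAMPLES = [
    ("Great carnitas tacos and fresh salsa.",
     "Tacos and salsa point to Mexican cuisine.",
     "Mexican"),
]

def build_prompt(review: str) -> str:
    shots = "\n\n".join(
        f"Review: {r}\nReasoning: {c}\nType: {t}" for r, c, t in FEW_SHOT_EXAMPLES
    )
    return (
        "Classify the restaurant type from the review. "
        "Think step by step, then give the type on the last line.\n\n"
        f"{shots}\n\nReview: {review}\nReasoning:"
    )

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": build_prompt("Amazing hand-pulled noodles and dumplings.")}],
    temperature=0.0,
)
print(response.choices[0].message.content)
```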
If you want to fine-tune LLMs for the classification task, the code is under the `lora` directory. I provide code for QLoRA fine-tuning to make LLMs work for classification.
- First, change the parameters in the `run.sh` file if needed.
- Second, run `chmod +x ./run.sh` on the GPU server if the script does not have execute permission.
- Run `./run.sh` to execute the code.
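
For orientation, here is a rough sketch of the QLoRA setup (4-bit quantized base model plus LoRA adapters) using `transformers` and `peft`. The real hyperparameters and training loop are configured through `run.sh` and the code under `lora`; the `target_modules` and `NUM_CLASSES` below are assumptions.

```python
# Sketch of QLoRA for sequence classification; assumptions are marked in the comments.
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

NUM_CLASSES = 10  # assumption: set to the number of restaurant types in the dataset

# Load the base model in 4-bit (NF4) so the 3B model fits comfortably on a 40GB A100.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    num_labels=NUM_CLASSES,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters with the r/alpha/dropout values listed later in this README.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="SEQ_CLS",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```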
I use the `meta-llama/Llama-3.2-3B` model with 10-fold cross-validation to train 10 models, then use majority voting to ensemble the 10 submission files. Each model was trained on a single 40GB A100 GPU for about 5 hours. During fine-tuning, you can use only the `review` column or both the `review` and `name` columns. If you want to use both columns, keep line 69 in `src.py`; otherwise, comment it out.
Here is the macro_f1 curve for the 10 models on the validation set (only the `review` column used for fine-tuning):
Here is the macro_f1 curve for the 10 models on the validation set (both the `name` and `review` columns used for fine-tuning):
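
A minimal sketch of the majority-vote ensembling over the 10 fold submissions is shown below; the file and column names (`submission_fold{i}.csv`, `id`, `label`) are assumptions rather than the competition's exact schema.

```python
# Majority-vote ensemble over 10 per-fold submission files (names/columns are assumptions).
import pandas as pd

submissions = [pd.read_csv(f"submission_fold{i}.csv") for i in range(10)]

# Put the 10 prediction columns side by side, then take the most frequent label per row.
preds = pd.concat([s["label"] for s in submissions], axis=1)
majority = preds.apply(lambda row: row.value_counts().idxmax(), axis=1)

pd.DataFrame({"id": submissions[0]["id"], "label": majority}).to_csv(
    "submission_ensemble.csv", index=False
)
```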
Here are the parameters I used for training these ten models:
- `max_length=1536` (inference `max_length` can be `2048`)
- `lora_r=16`
- `lora_alpha=16`
- `lora_dropout=0.1`
- `warmup_ratio=0.1`
- `learning_rate=2.25e-4`
- `batch_size=1`
- `accumulation_steps=16`
- `weight_decay=0.001`
- `epochs=2`
- `lr_scheduler="cosine"`
If you want to fine-tune encoder-based models such as BERT or DeBERTa, the code is under the `encoder-models` directory. I provide code for fine-tuning any encoder-based model with the AWP (Adversarial Weight Perturbation) technique, which is used to enhance the robustness of the fine-tuned models.
- First, change the parameters in `config.py` if needed. (I provide implementations of various pooling layers to connect to the final classification layer.)
- If you want to fine-tune the model, run `python train.py`.
- If you want to evaluate the model on the test dataset, set your model path in either `normal_inference.py` or `trainer_inference.py`, then run `python normal_inference.py` or `python trainer_inference.py`. (Keep the parameters in `config.py` the same as during training.)
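
As a rough illustration of AWP, the sketch below perturbs the targeted weights in the direction of their gradients before an extra adversarial forward/backward pass and then restores them; the class name, parameter names, and default strengths are illustrative, not the exact settings in `config.py`.

```python
# Minimal AWP (Adversarial Weight Perturbation) sketch; values and names are illustrative.
import torch

class AWP:
    def __init__(self, model, adv_lr=1e-3, adv_eps=1e-2, target="weight"):
        self.model = model
        self.adv_lr = adv_lr
        self.adv_eps = adv_eps
        self.target = target
        self.backup = {}

    def attack(self):
        # Perturb each targeted parameter in the direction of its gradient.
        for name, param in self.model.named_parameters():
            if param.requires_grad and param.grad is not None and self.target in name:
                self.backup[name] = param.data.clone()
                grad_norm = torch.norm(param.grad)
                if grad_norm != 0:
                    r_adv = self.adv_lr * param.grad / grad_norm * torch.norm(param.data)
                    param.data.add_(r_adv.clamp(-self.adv_eps, self.adv_eps))

    def restore(self):
        # Put the original weights back after the adversarial backward pass.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Usage inside one training step (after the normal loss.backward()):
#   awp = AWP(model)
#   awp.attack()
#   adv_loss = model(**batch).loss
#   adv_loss.backward()
#   awp.restore()
#   optimizer.step(); optimizer.zero_grad()
```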
- `gpt-4o-mini` 10-shot CoT (only review) -> LB 0.778
- `Llama3.2-3B` (only review) -> LB 0.845 (`max_length=2048`, a single fold can reach 0.848; majority-vote ensemble -> 0.851)
- `Llama3.2-3B` (both name and review) -> LB 0.856 (`max_length=2048`, a single fold can reach 0.860; majority-vote ensemble -> 0.864)
- Try `gpt-4o` or even `o1-preview` with n-shot CoT using both the restaurant name and the review.
- Fine-tune LLMs with a longer sequence length, such as 4096. (May require 80GB A100 GPUs.)
- Use full fine-tuning instead of QLoRA. (May require 80GB A100 GPUs.)