Eunyoungkim0/StudentReflectionAnalysis

Utilizing transformers for short-sentence topic modeling

This repository contains scripts and tools for text analysis research, focusing on reflection datasets annotated by multiple individuals. The workflow includes dataset preparation, preprocessing, fine-tuning transformer-based models, and clustering using embeddings.

Dependencies

pandas openpyxl numpy torch nltk transformers scikit-learn seaborn matplotlib datasets

Code

  1. Dataset Preparation Scripts
  • prepare_dataset.py : Prepares the dataset from raw reflection data annotated by multiple people.
  2. Utility Scripts
  • data_preprocessing.py : Contains preprocessing routines for text and data preparation.
  • functions.py : Utility functions shared by the fine-tuning and clustering workflows.
  3. Main Scripts
  • finetune_model_bert.py : Fine-tunes a transformer-based model (BERT) on the reflection dataset to generate embeddings for downstream tasks.
  • clustering.py : Clusters the reflections using the embeddings generated by the transformer model.
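
The embedding-then-clustering step described above can be sketched roughly as follows. This is a minimal illustration and not the code in clustering.py: the function names `mean_pool` and `cluster_embeddings` are hypothetical, the choice of mean pooling and k-means is an assumption, and the token embeddings are synthetic stand-ins for the output of a fine-tuned model.

```python
import numpy as np
from sklearn.cluster import KMeans


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padded positions.

    Hypothetical helper: mean pooling is one common way to turn token-level
    transformer outputs into a single sentence embedding.
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts


def cluster_embeddings(embeddings, n_clusters, seed=0):
    """Assign each sentence embedding to one of n_clusters groups (k-means assumed)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(embeddings)


# Synthetic stand-in for model output: 6 sentences, 4 tokens each, 8-dim vectors.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(6, 4, 8))
token_embeddings[3:] += 10.0  # shift the last three sentences so two groups exist
attention_mask = np.array([[1, 1, 1, 0]] * 6)  # the last token of each sentence is padding

sentence_embeddings = mean_pool(token_embeddings, attention_mask)
labels = cluster_embeddings(sentence_embeddings, n_clusters=2)
print(sentence_embeddings.shape, labels)
```

In a real run, `token_embeddings` and `attention_mask` would come from the tokenizer and the fine-tuned model rather than from a random generator, and the number of clusters would be chosen to suit the reflection data.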
