
We introduce a binarized approach to Lexical Complexity Prediction (Binary LCP) and systematically compare two generations of encoder-only Transformer models: BERT and ModernBERT. Work completed as part of Natural Language Processing, DATASCI 266.


JH-UCB/266-fp


Directory Structure


.
├── data
│   ├── processed
│   │   └── 266-comp-lex-master
│   │       ├── fe-test-labels
│   │       ├── fe-train
│   │       └── fe-trial-val
│   └── raw
│       ├── cwi18-complex-word-identification-master
│       │   ├── testset
│       │   │   ├── english
│       │   │   ├── french
│       │   │   ├── german
│       │   │   └── spanish
│       │   └── traindevset
│       │       ├── english
│       │       ├── german
│       │       └── spanish
│       └── se21-t1-comp-lex-master
│           ├── test
│           ├── test-labels
│           ├── train
│           └── trial
├── literature
│   ├── 2016-2018 CWI
│   │   └── 2018 results
│   │       ├── 1-cwirankingclass
│   │       └── CWI-results-regression-teams-SS-new
│   ├── 2021-task-1-lexical-complexity-prediction
│   └── model optimization
├── models
├── notebooks
│   ├── 1-2 Data Engineering
│   ├── 3_0 Baselines
│   ├── 3_0-3_5 Pipeline Development and Hyperparameter Refinement
│   ├── 3_6-3_8 Ablation Studies
│   ├── 4_0 Training Results Log Parser
│   ├── 5_0 Visualizations
│   ├── 6_0 Error Analysis
│   └── pdf_converter
├── paper_and_slides
└── results

41 directories
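
Example: Binarizing Complexity Scores

The project description frames Lexical Complexity Prediction as a binary task. The snippet below is only a minimal sketch of what such a binarization could look like, assuming continuous complexity scores in the range [0, 1]. The `binarize` helper and the 0.5 cut-off are hypothetical placeholders; this README does not state the rule used in the notebooks.

```python
from typing import Iterable, List

# Hypothetical cut-off: the README does not say how scores are binarized.
THRESHOLD = 0.5


def binarize(scores: Iterable[float], threshold: float = THRESHOLD) -> List[int]:
    """Map continuous complexity scores to 0 (simple) or 1 (complex)."""
    return [1 if score >= threshold else 0 for score in scores]


# Made-up scores for illustration only, not taken from the dataset.
example_scores = [0.10, 0.50, 0.72, 0.33]
print(binarize(example_scores))
```

A different threshold, or a data-driven split such as the median, could be substituted through the `threshold` argument; which choice matches the paper is not recoverable from this page.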


