"Turning coffee into neural networks since 2025" ☕➡️🤖
Welcome to my AI project portfolio! This repository contains implementations and analyses of fundamental NLP techniques and modern transformer architectures. Below you'll find Batman-style tech briefings for each project!
## CBOW vs Skip-Gram vs GloVe
*NLP Fundamentals with Reuters Financial News*
- Implemented 3 classic embedding models from scratch
- Developed a multi-metric evaluation framework:
  - KNN Semantic Clustering 👯
  - SimLex-999 Benchmark 📈
  - Vector Arithmetic for Analogies ➕➖
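The analogy test above boils down to nearest-neighbor search on `b - a + c`. Here's a minimal NumPy sketch — the toy 2-d vectors are purely illustrative (the trained models used 64 dims), and the function name is hypothetical:

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def solve_analogy(emb, a, b, c):
    """Return the word d maximizing cos(d, b - a + c), excluding the inputs.

    The classic "a is to b as c is to d" vector-arithmetic test.
    """
    target = emb[b] - emb[a] + emb[c]
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

# Toy 2-d embeddings (hypothetical values, not the trained Reuters vectors):
emb = {
    "stock":  np.array([1.0, 0.1]),
    "stocks": np.array([1.0, 0.9]),
    "bond":   np.array([0.2, 0.1]),
    "bonds":  np.array([0.2, 0.9]),
}
print(solve_analogy(emb, "stock", "stocks", "bond"))  # bonds
```

With real embeddings the candidate pool is the whole vocabulary, which is exactly why low-quality vectors score 0% — the true answer rarely wins the argmax.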
Model | KNN Clustering | SimLex-999 ρ | Analogy Accuracy |
---|---|---|---|
CBOW | 🌕🌕🌗🌑🌑 | 0.0954 | 0% |
SkipGram | 🌕🌕🌑🌑🌑 | 0.0504 | 0% |
GloVe | 🌕🌑🌑🌑🌑 | 0.0659 | 0% |
💡 Epiphany Moment: Even financial jargon needs bigger embeddings! (64-dim wasn't cutting it)
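For reference, the SimLex-999 ρ column is Spearman rank correlation between model cosine similarities and human ratings. A self-contained sketch (rank-then-Pearson, no tie handling; the five pairs below are made-up, not actual SimLex data):

```python
import numpy as np

def spearman_rho(x, y):
    # Spearman's rho = Pearson correlation of the ranks (ties not handled).
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / (np.linalg.norm(rx) * np.linalg.norm(ry)))

# Hypothetical model similarities vs. human ratings for five word pairs:
model_sims   = [0.81, 0.12, 0.55, 0.40, 0.95]
human_scores = [7.5,  1.2,  3.1,  6.0,  9.0]
print(round(spearman_rho(model_sims, human_scores), 4))  # 0.9
```

Scores like 0.05–0.10 on the real benchmark mean the model ranking is barely better than random — consistent with the 64-dim bottleneck noted above.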
## Chinese News Classification
*Battling 10 News Categories*
- Scaled encoder layers: 2 → 8 🏗️
- Enhanced classification head with GAP 🎯
- Cosine decay + warmup scheduling 🔥
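The warmup-plus-cosine-decay schedule from the last bullet can be sketched as a single function — parameter names here are hypothetical, and the real training loop presumably wired this into the optimizer via a framework scheduler:

```python
import math

def lr_at_step(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Warmup: scale linearly from ~0 up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Decay: cosine curve from peak_lr at warmup end to min_lr at the last step.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(lr_at_step(0, 1000, 100, 1e-3))     # tiny, start of warmup
print(lr_at_step(99, 1000, 100, 1e-3))    # peak (end of warmup)
print(lr_at_step(1000, 1000, 100, 1e-3))  # fully decayed
```

Warmup stabilizes the early, high-variance gradient steps; the cosine tail lets the model settle into a minimum instead of bouncing around it.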
Version | Accuracy | F1-Score | Key Improvement |
---|---|---|---|
Baseline | 81.19% | 82.55% | Initial Transformer |
+Preprocess | 83.53% | 83.34% | Punctuation Ninjutsu ✂️ |
Final Model | 84.07% | 84.19% | Deep Encoder Magic 🧙 |
Hot Take 🔥: Commas matter! But sometimes they don't... 🤷
## Dual-Task Dominance
*Sentiment Analysis + Paraphrase Detection*
```python
{'tasks': ['SST2', 'MRPC'],
 'model': 'bert-mini',
 'secret_sauce': 'Custom LossCallback() 🕵️',
 'hardware': 'Enough CUDA cores to fry an egg 🍳'}
```
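The actual `LossCallback` hooks into the training framework; as a framework-agnostic sketch of the idea (hypothetical method names, toy loss values), it just records per-task losses at each step so the two tasks can be monitored separately:

```python
class LossCallback:
    """Record per-step training losses, keyed by task name.

    A minimal sketch of the custom-callback idea; the real version would
    plug into the trainer's hook API rather than be called by hand.
    """

    def __init__(self):
        self.history = {}

    def on_step_end(self, task, loss):
        # Append this step's loss under its task name.
        self.history.setdefault(task, []).append(loss)

    def best(self, task):
        # Lowest loss seen so far for the given task.
        return min(self.history[task])

cb = LossCallback()
for task, loss in [("SST2", 0.71), ("MRPC", 0.66), ("SST2", 0.43)]:
    cb.on_step_end(task, loss)
print(cb.best("SST2"))  # 0.43
```

Keeping separate loss curves matters in multi-task fine-tuning: a single blended loss can hide one task regressing while the other improves.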
Task | Accuracy | F1-Score | Prediction Prowess |
---|---|---|---|
SST2 | 82.80% | 83.11% | 4/5 Test Samples Correct 🎬 |
MRPC | 75.25% | 82.43% | 5/5 Real-world Correct 🌍 |
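MRPC is class-imbalanced, which is why its F1 (82.43%) sits well above its accuracy (75.25%). For reference, binary F1 is the harmonic mean of precision and recall — here computed from scratch on made-up labels:

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 = harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical MRPC-style labels (1 = paraphrase, 0 = not):
y_true = [1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0]
print(round(f1_score(y_true, y_pred), 4))  # 0.6667
```

Because F1 ignores true negatives, a model that leans toward the majority "paraphrase" class can post a strong F1 even when plain accuracy is modest.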
Golden Insight 💡: Small BERTs can play big! (But they still hate irony)
## 📜 Project Reports
Please refer to the corresponding report for each project:
Project | Report Link |
---|---|
Word Embeddings Analysis | project1-2_report.pdf |
News Classification | project3_report.pdf |
BERT Classification | project4_report.pdf |
## 🔭 Future Work
- Subword embeddings for rare financial terms 💼
- Hybrid positional encoding strategies 🧬
- Attention visualization toolkit 👀
- Domain-adaptive pretraining 🌐
Made with ❤️ (and probably too much caffeine) by Zijin Cai
"If debugging is removing bugs, then programming must be putting them in." - Edsger Dijkstra