Smart Summarizer: Turning Long Articles into Clear Insights (with BERT & T5)

Hello and welcome!

This project tackles a problem we all face: too much content, too little time. I built a smart summarization pipeline using BERT and T5 to turn long articles into concise, meaningful summaries.

It was developed during my Master’s program at UC Irvine and reflects my passion for using NLP to solve real-world problems in a thoughtful and scalable way.

What this project does

Automatically summarizes long-form articles into clear, focused insights
Combines extractive and abstractive approaches using BERT and T5
Evaluates results using ROUGE metrics and human readability checks
Surfaces key entities and topics for better decision-making
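
To make the extractive half of that combination concrete, here is a minimal sketch of sentence selection. It is not the project's code: it swaps BERT embeddings for plain word-count vectors so it runs with the standard library alone, and the function names (split_sentences, bag_of_words, extract_top_sentences) are hypothetical. The idea is the same either way: score each sentence by its similarity to the whole document and keep the top few.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for a BERT sentence embedding."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_top_sentences(text, k=2):
    """Return the k sentences most similar to the full document, in original order."""
    sentences = split_sentences(text)
    doc_vec = bag_of_words(text)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(bag_of_words(sentences[i]), doc_vec),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]
```

With real BERT embeddings, only bag_of_words and cosine would change; the ranking step stays the same.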

Why I built it

Summarization is about more than model performance; it’s about making information useful. I wanted to explore how advanced NLP techniques can help people process content faster without losing context or nuance.

This project blends:

Practical problem-solving
Cutting-edge NLP models
Business-focused evaluation

Tools & Technologies

Models: T5 (abstractive summarization), BERT (for contextual embeddings and hybrid modeling)
Libraries: Hugging Face Transformers, Scikit-learn, Pandas, NumPy
Evaluation: ROUGE-1, ROUGE-2, ROUGE-L, and human alignment reviews
Extras: Topic clustering, NER-based insights, ensemble summarization logic
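
For readers new to the ROUGE metrics listed above, the sketch below computes ROUGE-1, ROUGE-2, and ROUGE-L F1 scores from scratch. It is a simplified illustration, not the project's evaluation code: it tokenizes on whitespace and skips the stemming and sentence-level handling that dedicated ROUGE packages apply, so its numbers will not match those packages exactly.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or punctuation handling)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/candidate) and recall (overlap/reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    overlap = sum((cand & ref).values())  # Counter & takes the minimum count per n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, using row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand, ref = tokenize(candidate), tokenize(reference)
    return f1(lcs_length(cand, ref), len(cand), len(ref))
```

ROUGE-1 and ROUGE-2 reward shared words and word pairs, while ROUGE-L rewards shared word order, which is why the three are usually reported together.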

How it works

Data Preprocessing: Cleaned and prepared article + summary pairs
Modeling: Fine-tuned T5 for summarization; used BERT to enhance extractive logic
Ensembling: Combined outputs for higher relevance and clarity
Evaluation: Assessed using ROUGE metrics and manual reviews
Insights: Extracted key entities and clustered summaries by theme
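
The modeling and ensembling steps above can be sketched roughly as follows. Several things here are assumptions rather than details from the project: the public "t5-small" checkpoint stands in for the fine-tuned model, the helper names are hypothetical, and "ensembling" is shown as simply picking the best-scoring candidate, which may differ from the logic in the notebook. The Hugging Face import is placed inside the function so the lightweight helpers can be used without downloading a model.

```python
def build_t5_input(article, max_words=400):
    """Prepend T5's summarization task prefix; the word cap is an illustrative guard."""
    words = article.split()
    return "summarize: " + " ".join(words[:max_words])


def summarize_with_t5(article, model_name="t5-small", max_new_tokens=80):
    """Generate an abstractive summary with a T5 checkpoint (assumed: t5-small)."""
    # Imported lazily: requires the transformers and torch packages and a model download.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(
        build_t5_input(article),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def keyword_coverage(article):
    """Build a scorer: the share of a candidate's distinct words that appear in the article."""
    vocab = set(article.lower().split())

    def score(candidate):
        words = set(candidate.lower().split())
        return len(words & vocab) / len(words) if words else 0.0

    return score


def pick_best(candidates, score_fn):
    """Ensemble by selection: keep the candidate summary with the highest score."""
    return max(candidates, key=score_fn)
```

In use, the candidates would be the extractive and abstractive outputs for one article, for example pick_best([extractive_summary, t5_summary], keyword_coverage(article)). A scorer based on ROUGE against a reference summary could be passed in the same way.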

Results

Improved ROUGE scores by 12–18% through ensemble modeling
Reduced reading time by over 60%
Summaries consistently retained context, clarity, and tone
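
As a guide to reading these figures, the sketch below shows one way such percentages can be computed. It rests on two assumptions that the results above do not spell out: that the ROUGE gain is relative to a baseline score, and that reading time scales with word count. The numbers in the usage note are made up for illustration and are not the project's measurements.

```python
def relative_improvement(baseline, improved):
    """Relative gain over the baseline, as a percentage."""
    return (improved - baseline) / baseline * 100


def reading_time_reduction(article_words, summary_words):
    """Percentage of reading time saved, assuming time is proportional to word count."""
    return (1 - summary_words / article_words) * 100
```

For example, a hypothetical ROUGE-1 score moving from 0.40 to 0.46 gives relative_improvement(0.40, 0.46) of about 15, and a 1,000-word article condensed to 350 words gives reading_time_reduction(1000, 350) of about 65.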

Project Structure

summarization-nlp/
    notebooks/     # Main notebooks for experimentation
    data/          # Training data, summaries, insights
    evaluation/    # ROUGE scores and comparison files
    README.md      # This file

Why it matters

For recruiters and data science leads, this project shows:

I can build and fine-tune modern NLP models
I prioritize clarity, usability, and evaluation, not just accuracy
I understand how to deliver real business value through data science
I take pride in clean documentation and end-to-end ownership

A quick example

Original article: A long editorial on post-pandemic market shifts, economic policy, and consumer behavior.
Generated summary: Consumer confidence rose following stimulus changes, with notable growth in housing and travel. Short-term inflation risks remain.

Let’s connect!

Thanks for taking the time to explore this project! If you’d like to discuss NLP, machine learning, or potential collaborations, feel free to reach out.

Prisha Chawla

Gmail: prishachawla10@gmail.com

LinkedIn: https://www.linkedin.com/in/prisha-chawla

prisha03/bert-t5-article-summarizer
