Optimizing CSAT Through Sentiment Analysis and Predictive Modeling Techniques

Summary

Customer Satisfaction (CSAT) is a key performance metric in any service-driven organisation. However, CSAT scores often fail to capture the full customer experience, especially when sentiment in textual feedback contradicts the given score.

To address this, we developed an AI-based analytics pipeline that combines Natural Language Processing (NLP) and Machine Learning (ML) to:

Automatically extract sentiment from customer comments.
Align this sentiment with CSAT scores to identify misclassifications or hidden dissatisfaction.
Predict satisfaction levels from structured support ticket data.

The result is a smarter, scalable system for understanding customer satisfaction, enabling better resource planning and faster issue resolution.

🎯 Project Aim: To enhance customer satisfaction analysis by aligning sentiment from customer feedback with CSAT scores, and to develop accurate predictive models that can forecast CSAT outcomes using Machine Learning techniques.

Solution Overview

🧠 Phase 1: Sentiment Detection from Customer Comments

We started with AFINN, a rule-based sentiment lexicon that scores text word by word.
It produced inaccurate classifications and failed to capture the actual sentiment of customer feedback.
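
To illustrate why a word-level lexicon can misread feedback, here is a minimal sketch of lexicon-based scoring. It is not the project's R script: the tiny `TOY_LEXICON` below is a made-up stand-in for the real AFINN word list, and the function names and sample comments are hypothetical.

```python
# Toy word-valence lexicon in the style of AFINN (one integer score per word).
# These entries are illustrative only; they are NOT the real AFINN list.
TOY_LEXICON = {
    "good": 3, "great": 3, "helpful": 2, "resolved": 2,
    "bad": -3, "slow": -2, "problem": -2, "useless": -2,
}


def lexicon_score(text: str) -> int:
    """Sum the valence of every known word; unknown words count as 0."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(TOY_LEXICON.get(word, 0) for word in words)


def lexicon_label(text: str) -> str:
    """Map the summed score to a coarse sentiment label."""
    score = lexicon_score(text)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


# Hypothetical comments. The last two are mislabelled because the lexicon
# looks at words in isolation and ignores negation ("not", "no").
comments = [
    "Great support, problem resolved.",
    "The agent was not helpful.",
    "No problem at all, thanks.",
]
for comment in comments:
    print(lexicon_label(comment), "|", comment)
```

In this toy setup "not helpful" still scores as Positive and "no problem" as Negative, which is the kind of context error a word-by-word lexicon cannot avoid.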

Because of AFINN's limitations, we upgraded to a fine-tuned BERT model, which significantly improved sentiment accuracy by understanding context, negation, and domain-specific phrasing.

Result: A multilingual BERT model fine-tuned on historical feedback data achieved high classification accuracy and contextual understanding.

🔁 Phase 2: Aligning Sentiment with CSAT Scores

The model re-classified 5.75% of user-labeled Negative feedback as actually Positive, improving sentiment alignment.
It also re-classified 0.65% of user-labeled Positive feedback as actually Negative, uncovering hidden dissatisfaction.
Business Impact: This alignment improves the credibility of CSAT reporting and helps surface hidden service issues.
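
The alignment check can be sketched as a per-label mismatch rate. This is a rough illustration, not the project's notebook code: the `records` list is invented, and it assumes each percentage is measured against the number of rows carrying that user label.

```python
from collections import Counter

# Hypothetical records: (label implied by the CSAT score, label predicted
# by the sentiment model). Invented for illustration only.
records = [
    ("Negative", "Negative"),
    ("Negative", "Positive"),   # scored low, but the comment reads positive
    ("Positive", "Positive"),
    ("Positive", "Positive"),
    ("Positive", "Negative"),   # scored high, but the comment reads negative
    ("Negative", "Negative"),
    ("Positive", "Positive"),
    ("Negative", "Negative"),
]


def mismatch_rates(pairs):
    """For each user label, return the share of rows the model labels differently."""
    totals = Counter(user for user, _ in pairs)
    flipped = Counter(user for user, model in pairs if user != model)
    return {label: flipped[label] / totals[label] for label in totals}


rates = mismatch_rates(records)
for label, rate in rates.items():
    print(f"{label}: {rate:.2%} re-labelled by the model")
```

Rows where the two labels disagree are the ones worth reviewing manually, since they may point to a mis-entered score or to dissatisfaction that the score alone hides.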

🧩 Phase 3: Predicting CSAT with Machine Learning

We used structured ticket metadata (country, region, sentiment polarity, etc.) to train multiple classification models:

Logistic Regression
Random Forest
Support Vector Machine (SVM)
Gradient Boosting Machine (GBM)

All models achieved high accuracy, but Logistic Regression performed best on ROC-AUC, highlighting its strength in distinguishing satisfaction levels.
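
A comparison of this kind might look like the following sketch, assuming scikit-learn is available. The data is synthetic (`make_classification` stands in for the encoded ticket metadata), and the hyperparameters are library defaults rather than the settings used in the project notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the encoded ticket features (assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.1, 0.9], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": SVC(probability=True, random_state=0),  # probability=True enables predict_proba
    "GBM": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    positive_proba = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, model.predict(X_test)),
        "roc_auc": roc_auc_score(y_test, positive_proba),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.4f}, "
          f"roc_auc={results[name]['roc_auc']:.4f}")
```

Scoring ROC-AUC on predicted probabilities rather than hard labels is what lets models with similar accuracy be ranked against each other.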

Evaluation Metrics:
Accuracy – Overall correctness of predictions.
ROC-AUC Score – Ability to distinguish sentiment polarity.
Precision, Recall, F1-score – Balance between false positives & false negatives.
Confusion Matrix – Insights into correct vs. misclassified instances.
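
To make these metrics concrete, the sketch below computes them from scratch in plain Python on invented labels and scores. The two hypothetical models (`scores_a`, `scores_b`) and the 0.5 decision threshold are assumptions for illustration. They are chosen so that both models have the same accuracy but different ROC-AUC.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true),
            "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented data: 8 satisfied (1) and 2 dissatisfied (0) customers.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
scores_a = [0.9, 0.8, 0.9, 0.7, 0.95, 0.85, 0.6, 0.75, 0.4, 0.55]
scores_b = [0.9, 0.8, 0.9, 0.7, 0.95, 0.85, 0.6, 0.75, 0.4, 0.65]

for name, scores in (("model A", scores_a), ("model B", scores_b)):
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    metrics = classification_metrics(y_true, y_pred)
    print(name, metrics, "roc_auc =", roc_auc(y_true, scores))
```

Both toy models make the same single mistake at the 0.5 threshold, so their accuracy is identical, but model A ranks every positive above every negative while model B does not, which only ROC-AUC reveals.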

Key Finding:

Accuracy was high across all models (97.62%), but Logistic Regression achieved the highest ROC-AUC (0.9512), demonstrating its superior ability to distinguish sentiment polarity.

Summary of Outcomes

✅ BERT Model outperforms traditional methods in interpreting customer sentiment.

✅ Sentiment-CSAT misalignment highlights previously unseen issues.

✅ Predictive ML models enable accurate classification of customer satisfaction.

✅ Logistic Regression offers strong, interpretable performance for operational use.

Future Enhancements

Expand to Multilingual Feedback – Incorporate customer reviews in different languages to improve global applicability.
Feature Expansion – Add Ticket Priority, User Type, and additional metadata for better prediction accuracy.
Explore Advanced Transformers – Investigate more sophisticated NLP models for improved sentiment classification.

