An accurate and efficient sentiment classification system for Indonesian text, powered by IndoBERT. This project demonstrates how a fine-tuned transformer model can effectively classify sentiment in real-world Indonesian documents.
## Model Highlights
- Based on `indobenchmark/indobert-base-p1`, a pre-trained BERT model for the Indonesian language
- Fine-tuned on a document-level sentiment dataset
- Supports 3 sentiment classes:
  - Negative
  - Neutral
  - Positive
- Achieves strong precision and recall on the held-out test data
- Inference-ready and optimized for deployment
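During training, the three classes above are encoded as integer ids. A minimal sketch of the assumed mapping, following the convention used by IndoNLU's `DocumentSentimentDataset` (verify against the actual training code before relying on it):

```python
# Assumed index-to-label mapping (IndoNLU DocumentSentimentDataset convention;
# verify against the training setup for this repo).
INDEX2LABEL = {0: "positive", 1: "neutral", 2: "negative"}
LABEL2INDEX = {label: idx for idx, label in INDEX2LABEL.items()}
```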
## Use Cases
This model is designed for Indonesian-language applications such as:
- Social media monitoring
- Customer feedback analysis
- Product review classification
- Public opinion mining
## Example
Input:
"Merasa kagum dengan toko ini tapi berubah menjadi kecewa setelah transaksi"
("Was impressed with this store, but it turned into disappointment after the transaction")

Output:
Prediction: negative (93.2%)
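A sketch of how such a prediction could be produced with Hugging Face Transformers. The checkpoint path, `num_labels`, and label order are assumptions here, not confirmed details of this repo, so verify them against the training code:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed label order (IndoNLU convention); verify against the training code.
LABELS = ["positive", "neutral", "negative"]


def predict(text, model, tokenizer):
    """Return (label, confidence) for a single piece of Indonesian text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    idx = int(probs.argmax())
    return LABELS[idx], float(probs[idx])


def load_model(checkpoint="model/best_model.pt"):
    """Load the base IndoBERT encoder and apply the fine-tuned weights
    (checkpoint path assumed from this README)."""
    tokenizer = AutoTokenizer.from_pretrained("indobenchmark/indobert-base-p1")
    model = AutoModelForSequenceClassification.from_pretrained(
        "indobenchmark/indobert-base-p1", num_labels=3
    )
    model.load_state_dict(torch.load(checkpoint, map_location="cpu"))
    model.eval()
    return model, tokenizer
```

Calling `load_model()` and then `predict(...)` on the sentence above should yield a prediction of the form shown in the example output.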
## Tech Stack
- PyTorch: deep learning framework
- Hugging Face Transformers: for loading and managing the IndoBERT model
- Git LFS: to store large model weights (>500 MB)
- Streamlit: for quick demo deployment (optional)
## Try the Web App
Want to see the model in action?
Access the live UI here: https://sentiment-analysis-indonlu.streamlit.app/
## Model File
Make sure the trained model is stored in the following path:
- model/best_model.pt
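Because the weights are stored in Git LFS, a fresh clone may contain only a small LFS pointer file at that path. A small sketch (the helper name is hypothetical) that fails loudly when the real weights are missing:

```python
from pathlib import Path

import torch

MODEL_PATH = Path("model/best_model.pt")


def load_checkpoint(path: Path = MODEL_PATH):
    """Load the fine-tuned state dict, with a clear error when the
    Git LFS weights have not been fetched."""
    if not path.exists() or path.stat().st_size < 1024:
        # A bare LFS pointer file is only a few hundred bytes.
        raise FileNotFoundError(
            f"{path} is missing or is an LFS pointer; run `git lfs pull` first"
        )
    return torch.load(path, map_location="cpu")
```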
## Training
The model was trained using custom implementations of the DocumentSentimentDataset and DocumentSentimentDataLoader classes from IndoNLU. Training used:
- Adam optimizer
- Custom metrics calculation
- GPU acceleration (CUDA)
- Validation-based evaluation per epoch
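The per-epoch train/validate cycle above can be sketched generically. This is a simplified stand-in, not the repo's actual training script: a real IndoNLU loader yields token ids and attention masks rather than a single input tensor, and the metrics here are reduced to mean loss:

```python
import torch
from torch.optim import Adam


def run_epoch(model, loader, optimizer=None, device="cpu"):
    """One pass over the data: trains when an optimizer is given,
    otherwise evaluates. Returns the mean loss over all examples."""
    training = optimizer is not None
    model.train(training)
    total_loss, seen = 0.0, 0
    with torch.set_grad_enabled(training):
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            logits = model(inputs)
            loss = torch.nn.functional.cross_entropy(logits, labels)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * labels.size(0)
            seen += labels.size(0)
    return total_loss / seen
```

With CUDA available, `device = "cuda" if torch.cuda.is_available() else "cpu"` moves both model and batches to the GPU; each epoch then alternates `run_epoch(model, train_loader, optimizer, device)` with a validation pass `run_epoch(model, valid_loader, device=device)`.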
## Acknowledgements
- IndoNLU: for Indonesian NLP datasets and benchmarks
- Hugging Face: for model architectures and tokenizer support
- Adityo Pangestu: for training, optimizing, and deploying the model
## Contact
Created by Adityo Pangestu · adityopangestu01@gmail.com
Feel free to contribute or extend this project for other NLP tasks such as topic modeling, emotion detection, or intent classification.