This diagram illustrates the sentiment-analysis workflow using the IndoBERT model.
- Data Crawling - Data is collected and stored as a dataset.
- Pre-processing & Labeling - Data is processed with techniques such as text cleaning, case folding, text normalisation, tokenisation, and stopword removal, and the InSet Lexicon is used to provide sentiment labels.
- Model Train & Test - The IndoBERT model is fine-tuned to improve the accuracy of sentiment analysis.
- Model Evaluation - Model results are evaluated using a confusion matrix to measure classification performance.
- Implementation & Results - Analysis results are visualised to better understand model performance.
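The fine-tuning step in the workflow above can be sketched with the HuggingFace Transformers API. The model checkpoint `indobenchmark/indobert-base-p1` and all hyperparameters here are illustrative assumptions, not values reported in this work, and the datasets are assumed to be HuggingFace `datasets` objects with a `text` column:

```python
# Sketch of fine-tuning IndoBERT for 3-class sentiment classification.
# Checkpoint name and hyperparameters are assumptions, not the paper's settings.

LABEL2ID = {"negative": 0, "neutral": 1, "positive": 2}
ID2LABEL = {v: k for k, v in LABEL2ID.items()}

def build_trainer(train_ds, eval_ds, model_name="indobenchmark/indobert-base-p1"):
    # Imports kept local so the sketch reads without transformers installed.
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=3, id2label=ID2LABEL, label2id=LABEL2ID)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=128)

    train_ds = train_ds.map(tokenize, batched=True)
    eval_ds = eval_ds.map(tokenize, batched=True)

    args = TrainingArguments(
        output_dir="indobert-sentiment",
        num_train_epochs=5,                 # matches the five epochs shown later
        per_device_train_batch_size=16,
        learning_rate=2e-5,
    )
    return Trainer(model=model, args=args,
                   train_dataset=train_ds, eval_dataset=eval_ds)
```

Calling `build_trainer(...).train()` would run the fine-tuning; the label mapping is fixed up front so that predicted class indices can be mapped back to sentiment names consistently across training and evaluation.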
Data that was crawled previously is preprocessed so that it is ready for the algorithm used; six stages of preprocessing are applied in this research.
The following is the content of the tokopedia.csv dataset.
Text cleaning removes unnecessary characters or symbols, such as punctuation marks, numbers, and special characters, to improve the quality of the text data.
Case folding converts all letters to lowercase so that there is no distinction between uppercase and lowercase letters (for example, ‘Product’ and ‘product’ are treated as the same word).
Text normalisation is used to convert nonstandard or slang words into standard word forms according to the Indonesian formal language dictionary.
Tokenisation is used to split a sentence into individual tokens.
Stopwords removal is used to remove common words (stopwords) that do not provide significant information value.
The InSet Lexicon is a collection of Indonesian words with sentiment polarity weights, used to assign each review a label such as positive, negative, or neutral.
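The six preprocessing stages above can be sketched as a minimal pipeline. The normalisation map, stopword list, and lexicon entries here are tiny illustrative stand-ins; the actual research uses a full Indonesian normalisation dictionary, stopword list, and the complete InSet Lexicon:

```python
import re

# Illustrative stand-ins for the real resources used in the research.
NORMALISATION = {"bgs": "bagus", "gk": "tidak", "cepet": "cepat"}  # slang -> standard
STOPWORDS = {"yang", "dan", "di", "ke", "ini"}
INSET = {"bagus": 4, "cepat": 3, "rusak": -4, "kecewa": -3}  # word -> polarity weight

def preprocess(text):
    text = re.sub(r"[^a-zA-Z\s]", " ", text)            # text cleaning
    text = text.lower()                                  # case folding
    tokens = text.split()                                # tokenising (whitespace-based)
    tokens = [NORMALISATION.get(t, t) for t in tokens]   # normalisation, per token here
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    return tokens

def label_sentiment(tokens):
    # InSet-style labelling: sum the polarity weights, then threshold at zero.
    score = sum(INSET.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

tokens = preprocess("Barang ini bgs dan cepet sampai!!!")
print(tokens, label_sentiment(tokens))  # ['barang', 'bagus', 'cepat', 'sampai'] positive
```

Note that in this sketch normalisation is applied per token after tokenising, which is the common implementation for slang dictionaries even though the stages are usually listed with normalisation first.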
For the classification model with three classes (positive, neutral, and negative), the model performed well with an overall accuracy of 0.89, meaning that 89% of all model predictions were correct. In terms of precision, the model showed the highest value in the negative class (0.94), followed by neutral (0.83) and positive (0.77), indicating that the model is most accurate when predicting the negative class. For recall, the model did well in detecting the positive and negative classes, with values of 0.86 and 0.94 respectively, but slightly less well in detecting the neutral class (0.71). The F1-score, which combines precision and recall, shows good results for the negative (0.94) and positive (0.81) classes, with a slight decrease in the neutral class (0.76). The macro-averaged and weighted-averaged F1-scores were 0.84 and 0.89 respectively, indicating that despite the imbalance in class distribution, the model still performed well on the more dominant data. Overall, although there is room for improvement in detecting the neutral class, the model delivered excellent results, especially in predicting the negative class.
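The per-class metrics discussed above can be derived directly from a confusion matrix. The 3x3 counts below are made-up numbers for illustration, not the actual matrix from this evaluation:

```python
# How precision, recall, F1, and the macro/weighted averages are computed
# from a confusion matrix. Counts are illustrative, not the paper's results.
CLASSES = ["negative", "neutral", "positive"]
# rows = true class, columns = predicted class
CM = [
    [90, 5, 5],    # true negative
    [10, 70, 20],  # true neutral
    [5, 10, 85],   # true positive
]

def per_class_metrics(cm, i):
    tp = cm[i][i]
    fp = sum(cm[r][i] for r in range(len(cm))) - tp  # predicted class i, but wrong
    fn = sum(cm[i]) - tp                             # true class i, but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

total = sum(sum(row) for row in CM)
accuracy = sum(CM[i][i] for i in range(3)) / total
f1s = [per_class_metrics(CM, i)[2] for i in range(3)]
macro_f1 = sum(f1s) / 3                                        # unweighted mean
weighted_f1 = sum(f1s[i] * sum(CM[i]) / total for i in range(3))  # weighted by class size
```

The macro average treats all three classes equally, while the weighted average scales each class's F1 by its support, which is why the weighted value stays high even when a minority class (here, neutral) scores lower.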
The loss graph depicts the development of training loss, validation loss, and test loss over five training epochs of the fine-tuned IndoBERT model. In this graph, the training loss (blue line) decreases consistently from epoch 1 to epoch 5, indicating that the model learns the training data better over time. The validation loss (green line) also decreases, albeit more slowly than the training loss, indicating that the model is improving at generalising to patterns in data it has not seen before. Meanwhile, the test loss (red line) follows a similar pattern to the validation loss, albeit with slight fluctuations, indicating that the model makes increasingly good predictions on the test data, with slight instability compared to the training data. The consistent decrease in all three losses indicates that the fine-tuned IndoBERT model makes good progress in learning patterns from the data and can generalise to data not seen during training.
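The reading of the loss curves above amounts to a simple check: both curves falling together suggests healthy learning, while validation loss rising as training loss falls would signal overfitting. The loss values below are illustrative, not read from the actual graph:

```python
# Toy check of the loss-curve pattern described above.
# Values are illustrative, not taken from the actual training run.
train_loss = [0.62, 0.41, 0.30, 0.22, 0.17]
val_loss = [0.55, 0.46, 0.40, 0.36, 0.34]

def is_decreasing(losses):
    return all(b < a for a, b in zip(losses, losses[1:]))

def overfitting_signal(train, val):
    # Overfitting shows as validation loss rising while training loss falls.
    return is_decreasing(train) and not is_decreasing(val)

print(is_decreasing(train_loss), is_decreasing(val_loss))  # True True
print(overfitting_signal(train_loss, val_loss))            # False
```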
The confusion matrix for the three-class model (positive, neutral, and negative) shows that the model performed best in classifying the negative class, with 638 negative samples correctly predicted as negative. However, there were some misclassifications in the positive and neutral classes: 21 positive samples were incorrectly predicted as negative, and 6 positive samples were predicted as neutral. The model also misclassified some neutral data, with 19 neutral samples incorrectly predicted as negative and 21 neutral samples predicted as positive. While the model performs well in identifying the negative class, these errors show that it still needs improvement, especially in distinguishing between the positive and neutral classes.
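A confusion matrix like the one analysed above is built by counting (true label, predicted label) pairs; every off-diagonal cell is one kind of misclassification. The labels below are toy data, not the actual test set:

```python
from collections import Counter

# Building a confusion matrix from true and predicted labels (toy data).
CLASSES = ["negative", "neutral", "positive"]

def confusion_matrix(y_true, y_pred):
    counts = Counter(zip(y_true, y_pred))
    # rows = true class, columns = predicted class
    return [[counts[(t, p)] for p in CLASSES] for t in CLASSES]

y_true = ["negative", "negative", "neutral", "neutral", "positive", "positive"]
y_pred = ["negative", "negative", "negative", "neutral", "positive", "neutral"]
cm = confusion_matrix(y_true, y_pred)
# cm == [[2, 0, 0], [1, 1, 0], [0, 1, 1]]
# e.g. cm[1][0] counts neutral reviews predicted as negative,
# the same kind of error the matrix above reveals for this model.
```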
‘Showing 5 words that present Positive sentiment’ shows a word cloud depicting the words that appear most frequently in reviews with positive sentiment. Words such as ‘good’ and ‘fast’ appear at a larger size, indicating that customers who leave positive reviews use those words more often to describe the products or services they praise.
‘Showing 5 words that present Negative sentiment’ shows words such as ‘goods’, ‘defective’, and ‘disappointing’ as the words that appear most often. This shows that dissatisfied customers tend to use these words to express their disappointment with the product or service.
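The top-5 words behind each word cloud come from simple frequency counts over the reviews of one sentiment. The reviews below are toy English stand-ins for the actual dataset:

```python
from collections import Counter

# Sketch of extracting the most frequent words per sentiment (toy data).
reviews = [
    ("good fast delivery good product", "positive"),
    ("fast shipping good seller", "positive"),
    ("goods defective disappointing", "negative"),
    ("defective goods broken", "negative"),
]

def top_words(reviews, sentiment, n=5):
    counts = Counter()
    for text, label in reviews:
        if label == sentiment:
            counts.update(text.split())       # assumes already-preprocessed text
    return [word for word, _ in counts.most_common(n)]

print(top_words(reviews, "positive"))  # 'good' and 'fast' lead, as in the word cloud
print(top_words(reviews, "negative"))
```

A word-cloud library would then size each word by its count; the frequency ranking itself is all the analysis requires.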
‘Displaying marketing strategies’ shows a table of review data with columns such as merge_text (review content), sentiment (review sentiment), and marketing_strategy (suggested marketing strategy based on sentiment and keyword analysis). The table includes reviews with negative, positive, and neutral sentiments, and suggests strategies such as apologising and offering a return for negative sentiment, giving discounts or coupons for positive sentiment, and thanking the customer for neutral reviews. This table helps formulate marketing strategies that match the sentiments present in customer reviews. Overall, these figures provide useful insight into how sentiment in reviews can be used to design more effective marketing strategies.
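The sentiment-to-strategy assignment in the table above reduces to a fixed mapping applied per row. The strategy texts mirror the table's suggestions, and the sample rows are illustrative:

```python
# Sketch of the sentiment -> marketing-strategy mapping (illustrative rows).
STRATEGY = {
    "negative": "apologise and offer a return",
    "positive": "offer a discount or coupon",
    "neutral": "thank the customer",
}

rows = [
    {"merge_text": "barang rusak, kecewa", "sentiment": "negative"},
    {"merge_text": "pengiriman cepat, bagus", "sentiment": "positive"},
    {"merge_text": "barang sesuai deskripsi", "sentiment": "neutral"},
]
for row in rows:
    row["marketing_strategy"] = STRATEGY[row["sentiment"]]
```

A keyword-aware version could refine the strategy further (e.g. ‘defective’ triggering a replacement offer), but the table's baseline behaviour is this per-sentiment lookup.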
The model showed a significant improvement in performance, with the accuracy value reaching 0.89, reflecting the model's improved ability to understand and analyse product review data. Regarding marketing strategies based on the analysis results: the findings from the word clouds and the review table show that customers with positive sentiment tend to rate products or services favourably, using words such as ‘fast’, ‘satisfied’, and ‘steady’, indicating high satisfaction with the quality and service provided. Companies can therefore respond to these positive reviews by offering discounts or coupons as a form of appreciation or an incentive for future purchases. In contrast, customers with negative sentiment tend to use words such as ‘defective’, ‘disappointing’, and ‘broken’, which indicate dissatisfaction with the quality of the product or service. For this reason, the suggested marketing strategy is to apologise and offer a return as a step towards repairing customer relations and maintaining the company's reputation.