Sentiment Analysis Social Media for Disaster using Naïve Bayes and IndoBERT

Authors

  • Sri Mulyani Anugerah, Telkom University
  • Rifki Wijaya
  • Moch Arif Bijaksana

DOI:

https://doi.org/10.31963/intek.v11i1.4771

Keywords:

Sentiment Analysis, IndoBERT, Naïve Bayes, Social Media, Disaster Management, Natural Disasters, X

Abstract

The rapid advancement of information and communication technology has produced a surge in data, especially text data from social media platforms. This paper presents a sentiment analysis approach that uses the IndoBERT and Naïve Bayes algorithms to classify sentiment about natural disasters in a dataset of tweets collected from the social media platform X. The research categorizes tweets about earthquakes, floods, and the eruption of Mount Merapi as positive or negative in order to provide insights for improving disaster response and management. The goal is to help the government allocate aid more efficiently and understand public sentiment during disasters. The methodology comprises data collection, data preparation, labeling, categorization, word weighting using TF-IDF, data splitting, and classification using the Naïve Bayes and IndoBERT algorithms. IndoBERT achieved 91% accuracy, while Naïve Bayes achieved 74%. The study highlights the potential of sentiment analysis to improve disaster preparedness and support more effective response strategies.
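The TF-IDF weighting and Naïve Bayes classification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four training tweets are hypothetical, the smoothed IDF formula and the multinomial variant of Naïve Bayes are assumptions, and the function names (`fit_idf`, `train_nb`, `predict`) are invented for illustration. The IndoBERT branch of the study is not covered here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy tweets (NOT from the paper's dataset), already cleaned.
TRAIN = [
    ("bantuan gempa sudah tiba terima kasih relawan", "positive"),
    ("warga selamat dari banjir berkat evakuasi cepat", "positive"),
    ("banjir parah rumah rusak bantuan belum datang", "negative"),
    ("gempa merusak rumah warga panik dan takut", "negative"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one tokenized document; unseen terms are ignored."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def train_nb(samples, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    docs = [tokenize(text) for text, _ in samples]
    idf = fit_idf(docs)
    weight = defaultdict(lambda: defaultdict(float))
    class_count = Counter()
    for doc, (_, label) in zip(docs, samples):
        class_count[label] += 1
        for term, w in tfidf(doc, idf).items():
            weight[label][term] += w
    vocab = list(idf)
    model = {"idf": idf, "log_prior": {}, "log_lik": {}}
    for label, count in class_count.items():
        model["log_prior"][label] = math.log(count / len(samples))
        total = sum(weight[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            t: math.log((weight[label][t] + alpha) / total) for t in vocab
        }
    return model


def predict(model, text):
    """Return the label with the highest posterior log-score."""
    feats = tfidf(tokenize(text), model["idf"])
    scores = {
        label: prior
        + sum(w * model["log_lik"][label][t] for t, w in feats.items())
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


model = train_nb(TRAIN)
print(predict(model, "bantuan tiba terima kasih"))
print(predict(model, "rumah rusak warga panik"))
```

In practice the data splitting step would hold out part of the labeled tweets for evaluation, and accuracy would be measured on that held-out portion rather than on the training examples.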

Published

2024-04-01

Section

ARTICLES