Sentiment Analysis Classification on PLN Mobile Application Reviews using Random Forest Method and TF-IDF Feature Extraction
DOI: https://doi.org/10.31963/intek.v11i1.4774
Keywords: sentiment analysis, user reviews, PLN Mobile, random forest, TF-IDF
Abstract
PT PLN (Persero) developed the PLN Mobile application to provide electricity services. Its large user base has produced many reviews describing the application's strengths, weaknesses, and issues. To evaluate the application's quality, this study performs sentiment analysis on those user reviews. Review data is obtained through the Google-Play-Scraper API and cleaned through text preprocessing. The study uses TF-IDF for feature extraction and Random Forest for classification. TF-IDF assigns a weight to each word in a text, turning words into numerical representations that reflect both how often a word occurs and how relevant it is within the document's context. Random Forest is a supervised machine learning algorithm based on ensemble learning; here it categorizes reviews as positive or negative. The best model was obtained with stemmed data and unigram TF-IDF, combined with the following hyperparameters: n-estimator set to 100, max_feature set to log2, max_depth unlimited (none), and the entropy criterion. This configuration achieved the highest F1-score, 93.14%.
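The TF-IDF weighting described in the abstract can be illustrated with a minimal sketch. It assumes the textbook form, weight = tf × log(N / df), over whitespace-separated unigrams; the abstract does not state which TF-IDF variant the study used, and library implementations often apply smoothed or normalized variants. The function name `tfidf_unigrams` and the sample reviews are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def tfidf_unigrams(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    # Unigram tokens: lowercase, split on whitespace (assumes preprocessed text).
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Term frequency is relative to document length.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already preprocessed reviews (illustration only).
reviews = [
    "aplikasi bagus cepat",
    "aplikasi error lambat",
    "aplikasi bagus mudah",
]
weights = tfidf_unigrams(reviews)

for review, doc_weights in zip(reviews, weights):
    print(review, {term: round(w, 3) for term, w in doc_weights.items()})
```

In this sketch a word that appears in every review (here "aplikasi") receives a weight of zero, while a word unique to one review receives the largest weight, which is the frequency-versus-relevance trade-off the abstract describes. For the classifier, the reported settings would correspond in scikit-learn, for instance, to `RandomForestClassifier(n_estimators=100, max_features="log2", max_depth=None, criterion="entropy")`; the abstract does not say which library the authors used.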