Vol. 10 No. 8 (2024): August
Open Access
Peer Reviewed

A Novel Hybrid Classification on Urban Opinion Using ROS-RF: A Machine Learning Approach

Authors

Usman Ependi, Nahdatul Akma Ahmad

DOI:

10.29303/jppipa.v10i8.8042

Published:

2024-08-31

Abstract

Urban opinion data gathered through crowdsourcing is often imbalanced because the issues raised span a wide range of social, economic, and environmental topics. This study presents a hybrid approach, ROS-RF, that combines Random Over-Sampling (ROS) with a Random Forest (RF) classifier to classify such imbalanced data. In experiments on crowdsourced urban opinion data from Jakarta, ROS-RF outperformed the other approaches compared, reaching 98% on F1-score, recall, precision, and accuracy. These findings indicate that ROS-RF is effective for classifying urban opinions on social, economic, and environmental issues. The hybrid approach offers a robust way to handle imbalanced datasets and yields more accurate and reliable classification outcomes, underscoring its potential to support urban data analysis and decision-making.
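To illustrate the over-sampling half of the approach described above, the following is a minimal sketch of random over-sampling using only the Python standard library. It is not the authors' implementation: the function name `random_oversample`, the toy texts, and the 6:2:1 class ratio are all hypothetical and chosen only to show how minority classes are duplicated until every class matches the largest one.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # The majority class size is the target size for every class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: three topic classes with a 6:2:1 imbalance.
texts = ["s1", "s2", "s3", "s4", "s5", "s6", "e1", "e2", "v1"]
labels = ["social"] * 6 + ["economic"] * 2 + ["environmental"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
counts = Counter(balanced_labels)
print(counts)
```

In a full pipeline, the balanced samples would typically be converted to numeric features and passed to a random forest, for example scikit-learn's `RandomForestClassifier`; the imbalanced-learn package provides an equivalent resampler, `RandomOverSampler`. Over-sampling is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data. Whether the study followed that exact procedure is not stated in this abstract.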

Keywords:

Covid-19, Machine learning, Lexicon, Sentiment analysis

Author Biographies

Usman Ependi, Universitas Bina Darma

Nahdatul Akma Ahmad, Malaysia

How to Cite

Ependi, U., & Ahmad, N. A. (2024). A Novel Hybrid Classification on Urban Opinion Using ROS-RF: A Machine Learning Approach. Jurnal Penelitian Pendidikan IPA, 10(8), 5816–5824. https://doi.org/10.29303/jppipa.v10i8.8042