Machine Learning Predicts the Level of Disease Spread


Dhio Saputra, Irzal Arief Wisky, Sarjon Defit






Vol. 10 No. 4 (2024): April


Decision tree, Machine learning, Naive bayes, Spread of disease

Research Articles


How to Cite

Saputra, D., Wisky, I. A., & Defit, S. (2024). Machine Learning Predicts the Level of Disease Spread. Jurnal Penelitian Pendidikan IPA, 10(4), 1714–1722.




This research aims to predict the level of disease spread. The analysis uses two variables: the population level of a region and the total number of disease events detected in the community. These variables indicate the accuracy and certainty of the resulting status. A Machine Learning approach is proposed to extend previous analysis models, using K-Means clustering, Naïve Bayes, and Decision Tree (DT). The analysis process has two stages: pre-processing and classification. K-Means provides a pattern for the classification analysis, and the patterns obtained are passed on to the classification process using Naïve Bayes and DT. Naïve Bayes achieves an accuracy rate of 83.33%. DT achieves an accuracy rate of 91.76% and also presents the resulting information and knowledge in the form of a decision tree, so its results can be taken into consideration in decision making. The resulting information and knowledge can serve as a guide in making policies for handling health in the community.
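The two-stage process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the choice of three clusters, the 70/30 split, the Gaussian form of Naïve Bayes, the Gini-based tree with a depth limit of 3, and every function name are assumptions made for illustration, and the accuracies it prints have no relation to the figures reported above.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: returns one cluster index per point."""
    centers = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center (Euclidean distance).
        new = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new == labels:
            break  # assignments stopped changing
        labels = new
        for c in range(k):  # move each center to the mean of its members
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def fit_gaussian_nb(X, y):
    """Per class: prior, feature means, feature variances."""
    model = {}
    for c in set(y):
        cols = list(zip(*[x for x, l in zip(X, y) if l == c]))
        n = len(cols[0])
        means = [sum(col) / n for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build_tree(X, y, depth=0, max_depth=3):
    """Greedy tree: leaves are class labels, inner nodes are tuples."""
    majority = max(set(y), key=y.count)
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None  # (weighted Gini impurity, feature index, threshold)
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [l for x, l in zip(X, y) if x[f] <= t]
            right = [l for x, l in zip(X, y) if x[f] > t]
            if left and right:
                score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
                if best is None or score < best[0]:
                    best = (score, f, t)
    if best is None:
        return majority
    _, f, t = best
    lo = [(x, l) for x, l in zip(X, y) if x[f] <= t]
    hi = [(x, l) for x, l in zip(X, y) if x[f] > t]
    return (f, t,
            build_tree([x for x, _ in lo], [l for _, l in lo], depth + 1, max_depth),
            build_tree([x for x, _ in hi], [l for _, l in hi], depth + 1, max_depth))

def predict_tree(node, x):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

# Hypothetical data: (population in thousands, detected cases) for 150 regions.
rng = random.Random(42)
blobs = [(20, 5), (50, 30), (90, 80)]
points = [(rng.gauss(px, 5), rng.gauss(cx, 5)) for px, cx in blobs for _ in range(50)]
labels = kmeans(points, k=3)  # cluster indices stand in for spread levels

order = list(range(len(points)))
rng.shuffle(order)
cut = int(0.7 * len(order))
train, test = order[:cut], order[cut:]
X_train = [points[i] for i in train]
y_train = [labels[i] for i in train]
nb = fit_gaussian_nb(X_train, y_train)
tree = build_tree(X_train, y_train)

def accuracy(predict):
    return sum(predict(points[i]) == labels[i] for i in test) / len(test)

nb_acc = accuracy(lambda x: predict_gaussian_nb(nb, x))
dt_acc = accuracy(lambda x: predict_tree(tree, x))
print(f"Naive Bayes accuracy:   {nb_acc:.2%}")
print(f"Decision tree accuracy: {dt_acc:.2%}")
```

In this sketch the cluster indices play the role of the class labels that the classifiers learn, which is one plausible reading of "the patterns obtained will be passed on to the classification process". The nested tuple returned by `build_tree` can be printed directly to inspect the split rules, which mirrors the abstract's point that a decision tree exposes its reasoning as readable knowledge.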



Author Biographies

Dhio Saputra, Universitas Putra Indonesia YPTK Padang

Irzal Arief Wisky, Universitas Putra Indonesia YPTK Padang

Sarjon Defit, Universitas Putra Indonesia YPTK Padang


Copyright (c) 2024 Dhio Saputra, Irzal Arief Wisky, Sarjon Defit


This work is licensed under a Creative Commons Attribution 4.0 International License.
