The Node Selection Method for Split Attribute in C4.5 Algorithm Using the Coefficient of Determination Values for Multivariate Data Set
DOI: 10.29303/jppipa.v9i7.4031
Published: 2023-07-25
Issue: Vol. 9 No. 7 (2023): July
Keywords: Algorithm, Attribute, Multivariate
Research Articles
Abstract
The choice of split attribute in a decision tree algorithm, particularly C4.5, strongly influences the predictive performance of the resulting tree. This study selects the split attribute in the C4.5 algorithm using the coefficient of determination (R2, or R Square), with the aim of improving the performance of the model that C4.5 produces. The data used are public and private datasets. The study combines the C4.5 algorithm developed by Quinlan with the R2 value. The results indicate that using the R2 value in C4.5 performs well in terms of accuracy and recall: on three of the four datasets it scores higher than C4.5 without R2. In terms of precision the performance is fairly good: only two of the datasets score higher than the algorithm without R2.
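To make the two quantities in the abstract concrete, the following Python sketch computes the standard C4.5 gain ratio and an R2 value for each candidate attribute. It is an illustration under assumptions, not the authors' implementation: the abstract does not say how R2 is computed for an attribute or how it is combined with the C4.5 criterion, so the sketch assumes numerically coded attributes and class labels, takes R2 from a simple linear regression of the class on one attribute, and only reports the two scores side by side. The function names (`entropy`, `gain_ratio`, `r_squared`, `score_attributes`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of a categorical attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split
    return (entropy(labels) - remainder) / split_info


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x.

    Assumption: attribute and class are numerically coded. For one
    predictor, R^2 equals the squared Pearson correlation.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def score_attributes(columns, labels):
    """Return {attribute name: (gain ratio, R^2)} for each candidate."""
    return {name: (gain_ratio(col, labels), r_squared(col, labels))
            for name, col in columns.items()}


# Toy data (hypothetical): attribute "a" determines the class, "b" does not.
labels = [0, 0, 1, 1]
columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
scores = score_attributes(columns, labels)
for name, (gr, r2) in scores.items():
    print(f"{name}: gain ratio = {gr:.3f}, R^2 = {r2:.3f}")
```

On this toy data both scores rank attribute "a" above "b". How the two scores should be weighed when they disagree is the question the study addresses and is not settled by this sketch.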
Author Biographies
Muhsi, Universitas Islam Madura
Department of Information System
Suprapto, Universitas Negeri Yogyakarta
Department of Electronics and Informatics Education
Rofiuddin, Universitas Islam Madura
Department of Informatics Engineering
License
Copyright (c) 2023 Muhsi, Suprapto, Rofiuddin
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with Jurnal Penelitian Pendidikan IPA agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License (CC-BY License). This license allows authors to use all articles, data sets, graphics, and appendices in data mining applications, search engines, web sites, blogs, and other platforms by providing an appropriate reference. The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in Jurnal Penelitian Pendidikan IPA.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).