Multimethodology Analysis of Determinants of Breast Cancer Diagnosis Machine Learning
DOI: 10.29303/jppipa.v12i1.12497
Published: 2026-01-31
Abstract
Breast cancer remains one of the most prevalent and life-threatening diseases worldwide, highlighting the urgent need for accurate and interpretable diagnostic models. While machine learning has shown promise in classification tasks, many existing models lack transparency and overlook the individual contribution of cellular features essential for clinical decision-making.
This study proposes an integrative and explainable framework to identify the most influential cellular-level features in distinguishing between benign and malignant breast tumors. Using a publicly available dataset comprising 569 observations and 32 numerical features, we conducted a multi-step analysis. Feature relevance was first evaluated using Pearson correlation. Random Forest and Recursive Feature Elimination (RFE) were employed to rank and refine the feature subset, followed by Principal Component Analysis (PCA) for dimensionality reduction and pattern visualization. SHapley Additive exPlanations (SHAP) were utilized to interpret individual predictions. Complementary statistical tests, including t-tests and chi-square analyses, assessed associations between tumor characteristics and diagnosis. A logistic regression model was developed to evaluate predictive performance.
Key cellular features, such as mean radius, texture, and concavity, were consistently identified as highly predictive of diagnosis. RFE demonstrated that fewer than 10 features were sufficient for optimal classification. The logistic regression model achieved high accuracy, offering a simpler yet effective alternative for prediction.
By combining statistical methods with interpretable machine learning, this study presents a transparent and clinically relevant approach to breast cancer diagnosis. The integration of SHAP values bridges the gap between model performance and interpretability, supporting more informed and personalized clinical decisions.
Future work should consider external validation, image-based features, and patient demographic variables to enhance generalizability.
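The multi-step analysis described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the authors' code. It assumes the copy of the Wisconsin Diagnostic Breast Cancer data bundled with scikit-learn (569 rows, exposed there as 30 features) stands in for the study's dataset. The train/test split, the choice of 8 retained features, and all hyperparameters are hypothetical. The SHAP step and the t-test and chi-square analyses are left out to keep the sketch short.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled dataset stands in for the study's data.
data = load_breast_cancer()
X, y = data.data, data.target  # in this copy: 0 = malignant, 1 = benign
feature_names = np.array(data.feature_names)

# Hypothetical 75/25 stratified split; the abstract does not state one.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: Pearson correlation between each feature and the diagnosis label.
corr = np.array(
    [np.corrcoef(X_train[:, j], y_train)[0, 1] for j in range(X_train.shape[1])]
)
top_by_corr = feature_names[np.argsort(-np.abs(corr))[:5]]

# Step 2: Recursive Feature Elimination wrapped around a random forest.
# Keeping 8 features is an illustrative choice consistent with "fewer than 10".
rfe = RFE(
    RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=8,
)
rfe.fit(X_train, y_train)
selected = feature_names[rfe.support_]

# Step 3: PCA on standardized features for a 2-D view of the data.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))
explained = pca.explained_variance_ratio_

# Step 4: logistic regression on the RFE-selected subset.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train[:, rfe.support_], y_train)
accuracy = clf.score(X_test[:, rfe.support_], y_test)

print("Top features by |Pearson r|:", list(top_by_corr))
print("RFE-selected features:", list(selected))
print("Variance explained by 2 PCs:", explained.round(3))
print("Held-out accuracy:", round(accuracy, 3))
```

Standardizing before PCA and logistic regression matters here because the features differ in scale by orders of magnitude (area versus smoothness, for example). The exact features selected and the accuracy obtained depend on the split and random seed, so the printed values should not be read as the study's results.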
Keywords:
Breast cancer, Feature selection, Interpretable machine learning, SHAP
License
Copyright (c) 2026 Dita Anggriani Lubis, Yuli Irnawati, Ayu Trisni Pamilih, Ria Fazelita Br Gultom

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with Jurnal Penelitian Pendidikan IPA agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License (CC-BY License). This license allows authors to use all articles, data sets, graphics, and appendices in data mining applications, search engines, web sites, blogs, and other platforms by providing an appropriate reference. The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in Jurnal Penelitian Pendidikan IPA.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).






