Student Flowchart Automated Evaluation for Scalable Assessment in Introductory Programming
DOI: 10.29303/jppipa.v11i12.13594
Published: 2025-12-31
Abstract
This study evaluates the Automated Flowchart Assessment Tool (AFAT) to overcome limitations in semantic sensitivity and layout robustness prevalent in existing tools. Through a quantitative analysis of 312 student submissions, AFAT demonstrated superior diagnostic performance, with a Micro-F1 score of 0.92 and substantial inter-rater agreement (Fleiss' Kappa = 0.88), supporting the hypothesis of expert-level accuracy. Key findings reveal that AFAT significantly enhances operational efficiency, reducing evaluation time by 61.2% (averaging 1.87 minutes per flowchart) while decreasing inter-rater variability by 28%. Generalized Linear Model (GLM) analysis confirmed significant time savings, particularly in high-complexity sessions (Wald χ² = 87.44, p < 0.001). Beyond technical efficiency, this research contributes to applied science education by providing a scalable framework for computational science literacy, enabling the rigorous assessment of algorithmic thinking within integrated STEM curricula. These results substantiate AFAT's potential for large-scale deployment as a robust automated scoring tool in formal educational settings.
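For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how Micro-F1 and Fleiss' Kappa are computed. The function names and toy labels below are illustrative only and do not come from the study's data:

```python
from collections import Counter

def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP/FP/FN across all classes before computing F1."""
    tp = fp = fn = 0
    for c in labels:
        tp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if p != c and t == c)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def fleiss_kappa(ratings, categories):
    """ratings: one list per subject, each holding the category each rater chose."""
    n = len(ratings[0])   # raters per subject (assumed constant)
    N = len(ratings)      # number of subjects
    counts = [Counter(r) for r in ratings]
    # Mean observed per-subject agreement P_bar
    P_bar = sum((sum(c[j] ** 2 for j in categories) - n) / (n * (n - 1))
                for c in counts) / N
    # Chance agreement P_e from marginal category proportions
    p = [sum(c[j] for c in counts) / (N * n) for j in categories]
    P_e = sum(pj ** 2 for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# Toy illustration (hypothetical labels, not the study's 312 submissions):
# micro_f1(["ok", "ok", "err", "err"], ["ok", "err", "err", "err"], ["ok", "err"]) -> 0.75
```

Micro-averaging pools counts across error categories before computing F1, so frequent categories dominate the score; Fleiss' Kappa discounts the agreement expected by chance, which is why a value of 0.88 is conventionally read as "almost perfect" agreement.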
Keywords:
Diagnostic Accuracy; Evaluation Efficiency; Scoring Reliability; Flowchart Assessment; Semantic Robustness

References
Ariyanta, N. D., Prasetya, D. D., Ari, I., Zaeni, E., Wicaksono, R., & Hirashima, T. (2025). Assessing the Semantic Alignment in Multilingual Student-Teacher Concept Maps Using mBERT. 25(1), 113–126. https://doi.org/10.30812/matrik.v25i1.5046
Calderon, K., Serrano, N., Blanco, C., & Gutierrez, I. (2023). Automated and continuous assessment implementation in a programming course. Computer Applications in Engineering Education, 32. https://doi.org/10.1002/cae.22681
Chen, Z., Villar, S., Chen, L., & Bruna, J. (2019). On the equivalence between graph isomorphism testing and function approximation with GNNs. In Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates Inc.
Chowdhury, T., Contractor, M. R., & Rivero, C. (2024). Flexible Control Flow Graph Alignment for Delivering Data-Driven Feedback to Novice Programming Learners. J. Syst. Softw., 210, 111960. https://doi.org/10.1016/j.jss.2024.111960
Cui, H., Xie, M., Su, T., Zhang, C., & Tan, S. H. (2024). An Empirical Study of False Negatives and Positives of Static Code Analyzers From the Perspective of Historical Issues. 1(1), 1–26. http://arxiv.org/abs/2408.13855
Dikici, S., & Bilgin, T. T. (2025). Advancements in automated program repair: a comprehensive review. Knowledge and Information Systems, 67(6), 4737–4783. https://doi.org/10.1007/s10115-025-02383-9
Florou, C., Stamoulis, G., Xenakis, A., & Plageras, A. (2024). The role of educators in facilitating students’ self-assessment in learning computer programming concepts: addressing students’ challenges and enhancing learning. Educ. Inf. Technol., 30, 8567–8590. https://doi.org/10.1007/s10639-024-13172-2
Gambo, I., Abegunde, F.-J., Gambo, O., Ogundokun, R., Babatunde, A., & Lee, C. (2024). GRAD-AI: An automated grading tool for code assessment and feedback in programming course. Educ. Inf. Technol., 30, 9859–9899. https://doi.org/10.1007/s10639-024-13218-5
Geetika, Kaur, N., & Kaur, A. (2025). A Semantic-driven approach to detect Type-4 code clones by using AST and PDG. International Journal of Information Technology. https://doi.org/10.1007/s41870-025-02670-2
Huang, A., Lin, C., Su, S., & Yang, S. (2025). The impact of GenAI‐enabled coding hints on students’ programming performance and cognitive load in an SRL‐based Python course. British Journal of Educational Technology. https://doi.org/10.1111/bjet.13589
Huang, C., Fu, L., Hung, S., & Yang, S. (2025). Effect of Visual Programming Instruction on Students’ Flow Experience, Programming Self‐Efficacy, and Sustained Willingness to Learn. Journal of Computer Assisted Learning. https://doi.org/10.1111/jcal.13110
Kinnear, G., Jones, I., & Davies, B. (2025). Comparative judgement as a research tool: A meta-analysis of application and reliability. Behavior Research Methods, 57. https://doi.org/10.3758/s13428-025-02744-w
Lee, H.-Y., Lin, C.-J., Wang, W.-S., Chang, W., & Huang, Y.-M. (2023). Precision education via timely intervention in K-12 computer programming course to enhance programming skill and affective-domain learning objectives. International Journal of STEM Education, 10, 1–19. https://doi.org/10.1186/s40594-023-00444-5
Messer, M., Brown, N. C. C., Kölling, M., & Shi, M. (2024). Automated Grading and Feedback Tools for Programming Education: A Systematic Review. ACM Trans. Comput. Educ., 24(1). https://doi.org/10.1145/3636515
Pedagogy, M. (n.d.). Ontology Design of a Modern Learning Environment and Modern Pedagogy Using Protégé Software. https://doi.org/10.30762/ijomer.v2i1.2755
Prasetya, D. D., Pinandito, A., Hayashi, Y., & Hirashima, T. (2022). Analysis of quality of knowledge structure and students’ perceptions in extension concept mapping. Research and Practice in Technology Enhanced Learning, 17(1). https://doi.org/10.1186/s41039-022-00189-9
Prasetya, D. D., Widiyaningtyas, T., & Hirashima, T. (2025). Interrelatedness patterns of knowledge representation in extension concept mapping. 1–18.
Pratama, W. S., Prasetya, D. D., Widyaningtyas, T., Wiryawan, M. Z., & Rady, L. G. (2025). Performance Evaluation of Artificial Intelligence Models for Classification in Concept Map Quality Assessment. 24(3), 407–422. https://doi.org/10.30812/matrik.v24i3.4729
Sakulin, S., Alfimtsev, A., & Kalgin, Y. (2025). Improvement of Computer Science Student’s Online Search by Metacognitive Instructions. Emerging Science Journal. https://doi.org/10.28991/esj-2025-sied1-03
Tong, Y., Schunn, C., & Wang, H. (2023). Why increasing the number of raters only helps sometimes: Reliability and validity of peer assessment across tasks of different complexity. Studies in Educational Evaluation. https://doi.org/10.1016/j.stueduc.2022.101233
Ulfa, S., Surahman, E., Wedi, A., Fatawi, I., & Bringula, R. (2025). An adaptive assessment: Online summary with automated feedback as a self-assessment tool in MOOCs environments. 17(1), 88–113.
Weegar, R., & Idestam-Almquist, P. (2023). Reducing Workload in Short Answer Grading Using Machine Learning. International Journal of Artificial Intelligence in Education, 34(2), 1–27. https://doi.org/10.1007/s40593-022-00322-1
Weingarden, M., & Heyd-Metzuyanim, E. (2023). Evaluating mathematics lessons for cognitive demand: Applying a discursive lens to the process of achieving inter-rater reliability. Journal of Mathematics Teacher Education, 1–26. https://doi.org/10.1007/s10857-023-09579-2
Xu, X., Cao, Y., Hu, H., Xiang, H., Qi, L., Xiong, J., & Dou, W. (2025). MGF-ESE: An Enhanced Semantic Extractor with Multi-Granularity Feature Fusion for Code Summarization. In WWW 2025 - Proceedings of the ACM Web Conference (Vol. 1, Issue 1). Association for Computing Machinery. https://doi.org/10.1145/3696410.3714544
Ye, H., Liang, B., Ng, O.-L., & Chai, C. (2023). Integration of computational thinking in K-12 mathematics education: a systematic review on CT-based mathematics instruction and student learning. International Journal of STEM Education, 10, 1–26. https://doi.org/10.1186/s40594-023-00396-w
Zimmerman, A., King, E., & Bose, D. (2023). Effectiveness and utility of flowcharts on learning in a classroom setting: A mixed methods study. American Journal of Pharmaceutical Education, 100591. https://doi.org/10.1016/j.ajpe.2023.100591
License
Copyright (c) 2025 Usman Nurhasan, Didik Dwi Prasetya

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with Jurnal Penelitian Pendidikan IPA agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License (CC-BY License). This license allows authors to use all articles, data sets, graphics, and appendices in data mining applications, search engines, web sites, blogs, and other platforms by providing an appropriate reference. The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in Jurnal Penelitian Pendidikan IPA.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).