Pertanika Journal

Using Machine Learning to Score Multidimensional Assessments of Students’ Skill Levels in Mathematics

Doungruethai Chitaree, Putcharee Junpeng, Suphachoke Sonsilphong and Keow Ngang Tang

Pertanika Journal of Social Science and Humanities, Volume 32, Issue 1, March 2024

DOI: https://doi.org/10.47836/pjssh.32.1.10

Keywords: Construct modeling approach, machine learning, mathematical skill measurement model, Rasch model analysis, seventh-grade students

Published on: 19 March 2024

Abstract

This research aims to establish a mathematical skill measurement model for examining seventh-grade students’ mathematical skills in two aspects: their understanding of mathematical processes and their understanding of concepts and structure. The researchers surveyed the mathematical skills of 521 seventh-grade students from northeastern Thailand. The test results were used to prototype a mathematical skill measurement model using machine learning. The study followed a design-based approach with four stages: a construct map, item design, a Wright Map, and an outcome space. The Multidimensional Random Coefficient Multinomial Logit Model was used to verify the quality of the resulting model. The first outcome was a construct map consisting of five levels. The researchers then set the cut-off points between levels as threshold values, based on the criterion zones of the Wright Map for each aspect. Lastly, examination of the measurement model provided adequate evidence of internal-structure validity and of reliability. In conclusion, students’ skill levels can be measured accurately using multidimensional assessments, even though the students’ mathematical capabilities ranged from low to moderate to high. The results therefore provide significant evidence that the mathematical skill measurement model can be used to diagnose seventh-grade students’ learning. The main implication for educational measurement and evaluation is that machine learning algorithms can score assessments more accurately and consistently than human graders. With accurate machine-learning-based assessment, teachers can gain deeper insights into individual students’ mathematical skills across multiple dimensions.
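To illustrate how cut-off points on a Wright Map can turn an ability estimate into one of five skill levels per aspect, here is a minimal Python sketch. It is not the authors' implementation: the threshold values, the dimension names `process` and `concept_structure`, the numbering of levels from 1 to 5, and the rule that an estimate exactly on a threshold falls into the higher level are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical cut-off points in logits (illustrative values only, not from the study).
# Four ascending thresholds per dimension split the scale into five levels.
CUTOFFS = {
    "process": [-1.5, -0.5, 0.5, 1.5],
    "concept_structure": [-1.2, -0.3, 0.6, 1.8],
}


def skill_level(theta, cutoffs):
    """Map an ability estimate (in logits) to a level from 1 to len(cutoffs) + 1.

    An estimate equal to a threshold is assigned to the higher level
    (an assumed convention).
    """
    return bisect_right(cutoffs, theta) + 1


def classify(estimates):
    """Return the skill level on each dimension for one student's estimates."""
    return {dim: skill_level(theta, CUTOFFS[dim]) for dim, theta in estimates.items()}


if __name__ == "__main__":
    # Hypothetical student with one ability estimate per dimension.
    print(classify({"process": 0.0, "concept_structure": 2.0}))
```

Keeping a separate threshold list for each dimension reflects the abstract's statement that cut-off points were determined for each aspect, so a student can sit at different levels on the two aspects.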
