TEACHING AND LEARNING TO CONSTRUCT DATA-BASED DECISION TREES USING DATA CARDS AS THE FIRST INTRODUCTION TO MACHINE LEARNING IN MIDDLE SCHOOL

DOI:

https://doi.org/10.52041/serj.v23i1.450

Keywords:

Statistics education research, Data science education, Decision trees, Machine learning, Artificial intelligence, Middle school

Abstract

This study investigates how 11- to 12-year-old students construct data-based decision trees using data cards for classification purposes. We examine the students' heuristics and reasoning during this process. The research is based on an eight-week teaching unit during which students labeled data, built decision trees, and assessed them using test data. They learned to manually construct decision trees to classify food items as recommendable or not. They utilized data cards with a heuristic that is a simplified form of a machine learning algorithm. We report on evidence that this topic is teachable to middle school students, along with insights for refining our teaching approach and broader implications for teaching machine learning at the school level.
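
The workflow named in the abstract (labeled data cards, a manually built tree, a check against test data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the students' heuristic, so the sketch chooses each split by counting misclassified cards, and the food items, nutrient values, labels, and function names are hypothetical rather than taken from the study.

```python
from collections import Counter

# Hypothetical data cards (invented values per 100 g), each labeled by hand.
TRAIN = [
    {"name": "apple", "sugar": 10, "fat": 0, "label": "recommendable"},
    {"name": "carrot", "sugar": 5, "fat": 0, "label": "recommendable"},
    {"name": "yogurt", "sugar": 5, "fat": 3, "label": "recommendable"},
    {"name": "bread", "sugar": 3, "fat": 1, "label": "recommendable"},
    {"name": "chocolate", "sugar": 50, "fat": 30, "label": "not recommendable"},
    {"name": "cookie", "sugar": 35, "fat": 20, "label": "not recommendable"},
    {"name": "chips", "sugar": 1, "fat": 35, "label": "not recommendable"},
]
# Hypothetical test cards, kept aside and not used while building the tree.
TEST = [
    {"name": "banana", "sugar": 12, "fat": 0, "label": "recommendable"},
    {"name": "donut", "sugar": 20, "fat": 25, "label": "not recommendable"},
    {"name": "avocado", "sugar": 1, "fat": 15, "label": "recommendable"},
]


def majority(cards):
    """Most frequent label in a pile of cards."""
    return Counter(c["label"] for c in cards).most_common(1)[0][0]


def errors(cards):
    """Cards that a majority vote on this pile would misclassify."""
    if not cards:
        return 0
    counts = Counter(c["label"] for c in cards)
    return len(cards) - max(counts.values())


def best_split(cards, features):
    """Try midpoints between sorted values; keep the split with fewest errors."""
    best = None
    for f in features:
        values = sorted({c[f] for c in cards})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [c for c in cards if c[f] <= t]
            right = [c for c in cards if c[f] > t]
            e = errors(left) + errors(right)
            if best is None or e < best[0]:
                best = (e, f, t)
    return best


def build(cards, features, depth=2):
    """Return a label (leaf) or a tuple (feature, threshold, left, right)."""
    if depth == 0 or errors(cards) == 0:
        return majority(cards)
    split = best_split(cards, features)
    if split is None:  # no feature has two distinct values left
        return majority(cards)
    _, f, t = split
    left = [c for c in cards if c[f] <= t]
    right = [c for c in cards if c[f] > t]
    return (f, t, build(left, features, depth - 1), build(right, features, depth - 1))


def classify(tree, card):
    """Follow the threshold questions down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if card[f] <= t else right
    return tree


def accuracy(tree, cards):
    """Share of cards whose predicted label matches the hand-assigned label."""
    return sum(classify(tree, c) == c["label"] for c in cards) / len(cards)


tree = build(TRAIN, ["sugar", "fat"])
print(tree)
print(accuracy(tree, TEST))
```

On these invented cards the sketch settles on a single question about fat content, which separates the training cards perfectly but misclassifies the hypothetical avocado card in the test set. That is the kind of discrepancy an assessment with test data is meant to surface.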

Published

2024-08-10

Section

Regular Articles