Branching out data science education: Developing task and computational environment design principles for teaching data science at the high school level through an international research collaboration
DOI:
https://doi.org/10.52041/iase25.138
Abstract
Data science is rapidly emerging as a desired component of education, yet generalisable design principles for tasks and computational environments remain underdeveloped. This paper reports on a design-based research collaboration between Germany and New Zealand, which aims to develop and refine such design principles for high school data science education. Drawing on literature and prior research, we propose that the design of tasks and computational environments should integrate humanistic, algorithmic, and programming approaches. The three key design “branches” of our research are: (1) emphasising a humanistic focus through data construction and exploration; (2) using decision trees to develop algorithmic modelling concepts; (3) using computer programming for gaining insights, supporting creativity, and model tinkering. These branches are illustrated through examples from our collaborative work. Our research contributes to the development of effective educational strategies for data science education and could inform the development of professional development for high school teachers.