Enhancing data science learning through the use of images
DOI: https://doi.org/10.52041/iase2023.112
Abstract
Image analysis is a crucial and dynamic field within data science and machine learning, enabling the automated interpretation and analysis of images through statistical methods. Beyond practical applications such as image recognition, images offer a valuable pedagogical tool for teaching multivariate analysis techniques, including cluster analysis, principal component analysis, and k-nearest neighbors. With straightforward R code, images can be transformed into tidy data formats that are ready for multivariate analysis. The results of the analysis can then be translated back into images, giving students the opportunity to see the effect of techniques such as cluster analysis, principal component analysis, or k-nearest neighbors. This approach bridges the gap between abstract multivariate analysis concepts and concrete understanding, as students can visually perceive how cluster centroids or principal components reduce the complexity of the original (image) data.
References
Boehm, F. J., & Hanlon, B. M. (2021). What Is Happening on Twitter? A Framework for Student Research Projects With Tweets, Journal of Statistics and Data Science Education, 29, S95-S102.
Dogucu, M., Johnson, A. A., & Ott, M. (2023). Framework for Accessible and Inclusive Teaching Materials for Statistics and Data Science Courses, Journal of Statistics and Data Science Education.
Hsu, J. L., Jones, A., Lin, J. H., & Chen, Y. R. (2022). Data visualization in introductory business statistics to strengthen students' practical skills, Teaching Statistics, 44, 21-28.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning (2nd ed.). New York: Springer.
Martin, M. A. (2003). “It's Like… You Know”: The Use of Analogies and Heuristics in Teaching Introductory Statistical Methods, Journal of Statistics Education, 11.
Mike, K., & Hazzan, O. (2022). Machine learning for non-majors: a white box approach. Statistics Education Research Journal, 21.
R Core Team (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Ridgway, J. (2016). Implications of the Data Revolution for Statistics Education. International Statistical Review, 84, 528-549.
Schwab-McCoy, A., Baker, C. M., & Gasper, R. E. (2021). Data science in 2020: Computing, curricula, and challenges for the next 10 years. Journal of Statistics and Data Science Education, 29, S40-S50.
Urbanek, S. (2022). jpeg: Read and write JPEG images. R package version 0.1-10.
Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59, 1-23.
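
Illustrative sketch
The workflow summarized in the abstract (image to tidy data, multivariate analysis, results back to an image) might look roughly like the following in base R. This is a minimal sketch under stated assumptions, not the authors' code: the image is a small synthetic array standing in for what `jpeg::readJPEG()` returns for a real file, the variable names (`img`, `tidy`, `img_k`) are hypothetical, and the choice of two clusters is arbitrary.

```r
# Synthetic 20 x 30 RGB image (assumption: stands in for the
# height x width x 3 array in [0, 1] returned by jpeg::readJPEG()).
set.seed(1)
h <- 20
w <- 30
img <- array(0, dim = c(h, w, 3))
img[, 1:15, 1]  <- 0.9                      # left half reddish
img[, 16:30, 3] <- 0.9                      # right half bluish
noisy <- img + rnorm(length(img), sd = 0.03)
img <- array(pmin(pmax(noisy, 0), 1), dim = c(h, w, 3))  # clamp to [0, 1]

# Tidy format: one row per pixel, one column per variable.
# as.vector() reads a matrix column by column, so the row index varies fastest.
tidy <- data.frame(
  row = rep(seq_len(h), times = w),
  col = rep(seq_len(w), each = h),
  r   = as.vector(img[, , 1]),
  g   = as.vector(img[, , 2]),
  b   = as.vector(img[, , 3])
)

# Cluster analysis on the colour variables (k = 2 chosen for illustration).
km <- kmeans(tidy[, c("r", "g", "b")], centers = 2, nstart = 5)

# Replace each pixel's colour with its cluster centroid and
# reshape the result back into an image array.
recolored <- km$centers[km$cluster, ]       # (h * w) x 3 matrix
img_k <- array(recolored, dim = c(h, w, 3))

# Show the original and the clustered image side by side.
if (interactive()) {
  op <- par(mfrow = c(1, 2), mar = c(1, 1, 2, 1))
  plot(as.raster(img));   title("Original")
  plot(as.raster(img_k)); title("k-means, k = 2")
  par(op)
}
```

For a real photograph, `img` would instead come from reading a JPEG file, and increasing `centers` would show students how more centroids recover more of the original colour detail.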