The Language Environment of Children’s Picture Books: A Lexical Study Using Data Science Methods.

Submitted by: Clarence Green
Abstract:
This presentation describes the development of a novel corpus of children’s picture books using innovative data science methods (Green et al., 2023). Children’s language and conceptual development are enhanced by language environments that include children’s picture books (Pendergast & Garvis, 2023). It is therefore important to characterize this input more precisely, and this study explores the vocabulary input from this initial print environment (Wasik et al., 2016). Previous research has been restricted by methodological limitations that precluded the development of large corpora. The study applies data science methods to build a larger corpus model than previously possible and investigates the lexical profile of over 2000 narrative and informational picture books. The corpus is built from digital sources of books being read aloud online, a method that gives researchers access to larger pools of data than before. The study examines informational and narrative picture books in terms of high-frequency vocabulary and the print environment’s lexical diversity, morphology, academic vocabulary, and semantic profile. Models are developed to estimate the additional word-type exposure in L1 and EAL language environments that include (or lack) English-language picture books. The results indicate that picture book exposure changes the language environment of children in ways important to reading development in multilingual classrooms, by providing exposure to varied and different semantic environments compared with models of child-directed speech. Additional findings include that picture books provide exposure to EL academic vocabulary (Hiebert, 2020). Computational models indicate that book reading once every day or every second day over a year might boost unique-word exposure by approximately 10% for some language environments. The corpus has been developed in the spirit of open science, to be shared with other researchers.
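
As a purely illustrative sketch of what "additional word-type exposure" could mean in practice, the Python snippet below compares the unique word types in a small sample of child-directed speech with those in a sample of picture-book text. It is not the model used in the study: the function names, the naive tokenizer, and the toy sentences are all assumptions made for illustration.

```python
import re


def word_types(text):
    """Return the set of lowercase word types in a text (naive tokenizer, an assumption)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def additional_type_exposure(speech_texts, book_texts):
    """Estimate the percentage gain in unique word types when book text
    is added to a baseline of child-directed speech.

    Returns (percentage_gain, set_of_new_types).
    """
    speech_types = set()
    for text in speech_texts:
        speech_types |= word_types(text)

    book_types = set()
    for text in book_texts:
        book_types |= word_types(text)

    # Types a child would meet only through the books
    new_types = book_types - speech_types
    gain = 100.0 * len(new_types) / len(speech_types)
    return gain, new_types


# Toy data (hypothetical, not drawn from the corpus)
speech = ["look at the dog", "the dog is big"]
books = ["the enormous dog slept"]

gain, new_types = additional_type_exposure(speech, books)
print(f"{gain:.1f}% more word types; new types: {sorted(new_types)}")
```

A fuller model would presumably need realistic estimates of daily speech input and reading frequency, which this sketch does not attempt.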

References
Green, C., Keogh, K., Sun, H., & O’Brien, B. (2023). The Children’s Picture Books Lexicon (CPB-LEX): A large-scale lexical database from children’s picture books. Behavior Research Methods, 1-18.
Hiebert, E. H. (2020). The core vocabulary: The foundation of proficient comprehension. The Reading Teacher, 1-12.
Pendergast, D., & Garvis, S. (Eds.). (2023). Teaching Early Years: Curriculum, Pedagogy, and Assessment. Taylor & Francis.
Wasik, B. A., Hindman, A. H., & Snell, E. K. (2016). Book reading and vocabulary development: A systematic review. Early Childhood Research Quarterly, 37, 39-57.