Comparing models of listening comprehension in Germany and Austria

Submitted by: Antonia Maria Bachinger
Abstract: L1 listening comprehension tests for the 3rd and 4th grade are part of the national educational assessments in Germany (“Vergleichsarbeiten”) and in Austria (“Standardüberprüfung”). The assessments in both countries pursue similar objectives and therefore share some common principles. They use a large number of listening comprehension tasks with several items each. To reduce individual student workload, each student works on only a subset of items, which results in a study design known as a multiple matrix sampling design. In pilot studies, each item is administered to at least 150 students. Hence, the size of the student sample for pilot studies depends on the total number of items. For national assessments that include multiple domains, such as listening comprehension, reading, and writing, total sample sizes range from 25,000 (Germany) to 80,000 (Austria). The difficulty levels of tasks and the competencies of students are described by means of specific competence models. These models posit hierarchically structured levels of listening comprehension.
However, the assessments in Germany and Austria differ substantially with respect to test conditions, item construction, and underlying theoretical competence models. The consequences of these differences will be discussed. To name one example, German tests use rather complex listening tasks characterized by mostly literary stimulus texts of up to 10 minutes. Several items refer to the same stimulus material. Along with simple recognition tasks, open constructed-response items allow for the measurement of integrative understanding of the entire text. On the other hand, items that share the same stimulus sometimes violate assumptions of the underlying measurement models. In Austria, most stimulus texts are shorter and the corresponding items show more desirable measurement qualities. However, short texts limit the scope for developing challenging tasks. This is especially true for items requiring advanced levels of analyzing and evaluating stimulus texts.
The comparison of listening comprehension assessment in Germany and Austria focuses on theoretical and methodological differences. Experiences and example tasks from both countries will be shared to illustrate advantages and disadvantages. Further, ways in which each country can benefit from the other's experience in developing its own tests will be discussed. From the perspective of comparability, equivalent testing conditions, as well as comparable theoretical and statistical models, are essential for moving towards a standardized assessment of listening comprehension in Germany and Austria.

Breit, S./Bruneforth, M./Schreiner, C. (2016): Bundesergebnisbericht. Standardüberprüfung 2015. Deutsch, 4. Schulstufe. Salzburg: BIFIE, pp. 84-90.
Behrens, U./Böhme, K./Krelle, M. (2009): Zuhören – Operationalisierung und fachdidaktische Implikationen. In: Bremerich-Vos, A./Granzer, D./Köller, O. (eds.), Bildungsstandards in Deutsch und Mathematik. Leistungsmessung in der Grundschule. Weinheim: Beltz, pp. 357-375.