Amaya Garcia
Director, PreK-12 Research and Practice
Debra Ackerman is an expert in early childhood assessment at Educational Testing Service (ETS). In a new report, she takes a closer look at the Kindergarten Entry Assessments (KEAs) states are using to assess English learners (ELs). KEAs are usually administered in the first few months of kindergarten and help provide teachers with information on what students know and can do. Specifically, she compared the KEAs used in California, Delaware, Florida, Illinois, Mississippi, Oregon, Pennsylvania, Utah, and Washington and examined whether they contain items that are specific to ELs, allow or mandate the use of linguistic accommodations, have policies on KEA assessor or observer linguistic capacity, and are supported by research.
I reached out to her to learn more about the study and general considerations of using KEAs with English learners.
Q: What are the challenges of using KEAs with English learners?
One key challenge to generating accurate evidence of ELs' knowledge and skills via a KEA is the language used during the assessment process. For example, an EL kindergartner may easily count 10 items when prompted to do so in Spanish, but not respond to the same request when asked in English. To mitigate potential language issues, assessment policymakers can consider incorporating what are known as "linguistic accommodations" into the KEA process. One such accommodation is providing EL kindergartners with directions in both their home language and English. A second linguistic accommodation to consider is translating the specific item prompts to which kindergartners must respond when being assessed with a direct KEA. A third potential accommodation, and one that can be used with both direct and observational KEAs, is allowing students to use the language or languages in which they are most proficient to demonstrate their knowledge and skills.
A second, and related, KEA challenge is teachers' capacity to implement these linguistic accommodations in a valid and reliable way. In fact, the National Association for the Education of Young Children urges that young ELs be assessed by staff who are not only fluent in a child's home language but also familiar with preferred cultural interaction styles. A similar cultural background may be particularly essential when assessing young children's social-emotional development. Adults who assess young ELs also may need a thorough understanding of bilingual language acquisition so that they can distinguish between inadequate content knowledge and a student's lack of English language or cultural proficiency.
Q: You created a less-to-more continuum to highlight differences in state KEAs. How did you create that continuum?
I set up a rubric trying to think about, 'Ok, if you did want to have a measure that has a greater likelihood of being useful for providing teachers with information about English learner kindergartners, what would a measure need to have?'
And so that's why I focused on these issues of: are there any items that are [specific to] English learners; are linguistic accommodations allowed to be used while assessing ELs; what does the state policy say about the linguistic capacity of whoever is serving as the assessor or observer; and, more importantly, what is the research base on these measures? That's how I came up with this continuum of looking at these measures and where they fall, from having less of those things to having more of them. And that's also where California and Illinois ended up on the right-hand side, seeming to have all of those things, versus some measures in the middle and those on the left-hand side that had very little, if anything.
Q: California and Illinois really stood out in your study, why is that?
California and Illinois have specific KEA policies aimed at supporting the validity and reliability of the evidence generated for ELs. For example, the observational measure used in both states contains a 4-item English language development domain focusing on expressive and receptive vocabulary and early literacy skills. The scoring rubric for this domain reflects the developmental span of young children's second language acquisition and use as well. In addition, the KEA used allows children to use whichever language(s) they speak to display their skills and knowledge for all of the measure's remaining items. California and Illinois also have articulated policies regarding observers' linguistic capacity when using the KEA with ELs. Finally, the various iterations of the measure have undergone a variety of EL-relevant test validity and observer reliability studies.
Q: Then you have states like Washington and Delaware that fall in the middle by using KEAs that include only some indicators related to ELs.
I don't really talk about this a lot in the report, but they're both using state-customized versions of Teaching Strategies GOLD. However, their versions are not identical, so they're including different items, and their policies differ on which items students can use their home language to show what they know and can do. So, at first glance it might look like they are using the same KEA, but they're not, really, because it's different items and different policies for ELs. And of course you have a lot of research on the full GOLD measure, but not on the state-customized versions.
Q: The remaining states, those on the left side of the continuum, had no indicators related to ELs. This included Florida, which surprised me because they have a lot of ELs in the state. Tell me more about those findings.
That was interesting too, because I looked at states in terms of not only the percentage of young ELs but also the English language proficiency standards that they rely on. You would think that states with a higher percentage, or even the same standards, would somehow align on this rubric, but they don't, as you noticed.
Another interesting point was that in Florida and Mississippi, the KEA is the STAR Early Literacy measure. It's a computer-administered test, and the developers themselves say that English learners' results need to be interpreted cautiously. They even produce a Spanish version of the measure, yet the states…aren't using the Spanish version.
Q: What are the key implications of your findings for policymakers and those who make assessment decisions at the state level?
The key implication is that policymakers and others who make assessment decisions need to do their homework. They need to consider what the purpose of the assessment is, what population is to be assessed, and whether the measure chosen can actually give us the information we seek. I don't know to what extent there are policymakers who would simply select an assessment because that's what's being used somewhere else. I would think that most education policymakers probably wouldn't make such a simplistic decision, but just based on looking at these KEAs, it sort of hammers home the message that you really need to do your homework. Even if someone else is using a KEA, that doesn't mean it is the right assessment for you. That all ties back to this frame of reliability and validity. Tests by themselves are not valid or invalid. What matters is whether the test will provide you with evidence for the purpose you need it for and for the population being assessed.
Q: From your perspective, are KEAs being used in a way that can actually produce useful information for teachers about their ELs?
Oh, that is a whole other issue! It would be overstepping to say that is what my study suggests. You now have at least 40 states in the process of developing, implementing, and bringing up to scale these KEAs, but we need much more research. And not only psychometric research on whether this is a good test to use with ELs, but also on whether kindergarten teachers are even finding these data useful for informing their practice. The report that I did prior to this one did not focus specifically on ELs, but was about the process of developing tests and the real-world compromises that need to be made as the tests were being designed and tried out. We do know that teachers would say things like: these tests are too long, there are too many items, I am losing out on instruction time. And so they would pick and choose which items they would even administer. As a result, some states have had to scale back the length of some of the observational rubrics. It really remains to be determined to what extent teachers are using the data to inform their instruction and, of course, what impact that is having on children's outcomes.