Interview: New Study Takes a Closer Look at Kindergarten Entry Assessments for ELs

Blog Post
Nov. 1, 2018

Debra Ackerman is an expert in early childhood assessment at Educational Testing Service (ETS). In a new article, she takes a closer look at the Kindergarten Entry Assessments (KEAs) states are using to assess English learners (ELs). KEAs are usually administered in the first few months of kindergarten and help provide teachers with information on what students know and can do. Specifically, she compared the KEAs used in California, Delaware, Florida, Illinois, Mississippi, Oregon, Pennsylvania, Utah and Washington and examined whether they contain items that are specific to ELs, allow or mandate the use of linguistic accommodations, have policies on KEA assessor or observer linguistic capacity and are supported by research.

I reached out to her to learn more about the study and general considerations of using KEAs with English learners.

Q: What are the challenges of using KEAs with English learners?

One key challenge to generating accurate evidence of ELs’ knowledge and skills via a KEA is the language used during the assessment process. For example, an EL kindergartner may easily count 10 items when prompted to do so in Spanish, but not respond to the request when asked in English. To mitigate potential language issues, assessment policymakers can consider incorporating what are known as “linguistic accommodations” into the KEA process. For example, one accommodation is providing EL kindergartners with directions in both their home language and English. Similarly, a second linguistic accommodation to consider is translating the specific item prompts to which kindergartners must respond when being assessed with a direct KEA. A third potential accommodation, and one that can be used with both direct and observational KEAs, is allowing students to use the language or languages in which they are most proficient to demonstrate their knowledge and skills.

A second, and related, KEA challenge is teachers’ capacity to implement these linguistic accommodations in a valid and reliable way. In fact, the National Association for the Education of Young Children urges that young ELs be assessed by staff who are not only fluent in a child’s home language but also familiar with preferred cultural interaction styles. A similar cultural background may be particularly essential when assessing young children’s social-emotional development. Adults who assess young ELs also may need a thorough understanding of bilingual language acquisition so that they can distinguish between inadequate content knowledge and a student’s lack of English language or cultural proficiency.

Q: You created a less-to-more continuum to highlight differences in state KEAs. How did you create that continuum?

I set up a rubric, trying to think, ‘OK, if you did want to have a measure that has a greater likelihood of being useful for providing teachers with information about English learner kindergartners, what would a measure need to have?’

And so that’s why I focused on these issues: are there any items that are [specific to] English learners; are linguistic accommodations allowed to be used while assessing ELs; what does the state policy say about the linguistic capacity of whoever is serving as the assessor or observer; and, more importantly, what is the research base on these measures? That’s how I came up with this continuum of looking at these measures and where they fell, from having less of those things to having more of those things. And that’s also why California and Illinois ended up on the right-hand side, seemingly having all of those things, versus some measures in the middle and those on the left-hand side that had very little, if anything.

Q: California and Illinois really stood out in your study, why is that?

California and Illinois have specific KEA policies aimed at supporting the validity and reliability of the evidence generated for ELs. For example, the observational measure used in both states contains a 4-item English language development domain focusing on expressive and receptive vocabulary and early literacy skills. The scoring rubric for this domain reflects the developmental span of young children’s second language acquisition and use as well. In addition, the KEA used allows children to use whichever language(s) they speak to display their skills and knowledge for all of the measure’s remaining items. California and Illinois also have articulated policies regarding observers’ linguistic capacity when using the KEA with ELs. Finally, the various iterations of the measure have undergone a variety of EL-relevant test validity and observer reliability studies.

Q: Then you have states like Washington and Delaware that fall in the middle by using KEAs that include only some indicators related to ELs.

I don’t really talk about this a lot in the report, but they’re both using state-customized versions of Teaching Strategies GOLD. However, their versions are not identical: they include different items, and they have different policies on the items for which students can use their home language to show what they know and can do. So, at first glance it might look like they are using the same KEA, but they’re not really, because the items and the policies for ELs are different. And of course you have a lot of research on the full GOLD measure, but not on the state-customized versions.

Q: The remaining states, those on the left side of the continuum, had no indicators related to ELs. This included Florida, which surprised me because they have a lot of ELs in the state. Tell me more about those findings.

That was interesting too, because I looked at states in terms of not only their percentage of young ELs but also the English language proficiency standards that they rely on. You would think that the states that have a higher percentage of ELs, or even the same standards, would somehow be aligned on this rubric, but they’re not, as you noticed.

Another interesting point was that in Florida and Mississippi, the KEA is the STAR Early Literacy measure. It’s a computer-administered test, and the developers themselves say that English learners’ results need to be interpreted cautiously. They even produce a Spanish version of the measure, yet the states...aren’t using the Spanish version.

Q: What are the key implications of your findings for policymakers and those who make assessment decisions at the state level?

The key implication is that policymakers and others who make assessment decisions need to do their homework. They need to consider what the purpose of the assessment is, what population is to be assessed, and whether the measure chosen can actually give us the information we seek. I don’t know to what extent there are policymakers who would simply select an assessment because that’s what’s being used somewhere else. I would think that most education policymakers probably wouldn’t make such a simplistic decision, but just looking at these KEAs sort of hammers home the message that you really need to do your homework. Even if someone else is using a KEA, that doesn’t mean that it is the right assessment for you. That all ties back to this frame of reliability and validity. Tests by themselves are not valid or invalid. The validity question is: will the test provide you with evidence for the purpose that you need it for and for the population being assessed?

Q: From your perspective, are KEAs being used in a way that can actually produce useful information for teachers about their ELs?

Oh, that is a whole other issue! It would be overstepping to say that is what my study suggests. You now have at least 40 states in the process of developing, implementing, and bringing up to scale these KEAs, but we need much more research. And not only psychometric research on whether this is a good test to use with ELs, but also on whether kindergarten teachers are even finding these data useful for informing their practice. The report that I did prior to this one did not focus specifically on ELs, but was about the process of developing tests and the real-world compromises that needed to be made as the tests were being designed and tried out. We do know that teachers would say things like, ‘These tests are too long, there are too many items, I am losing out on instruction time.’ And so they would pick and choose which items they would even administer. As a result, some states have had to scale back the length of some of the observational rubrics. It still remains to be determined to what extent teachers are using the data to inform their instruction and, of course, what impact that is having on children’s outcomes.
