No Test Takin' Without Contextualization: How The Federal Government Can Increase The Value Of This Year’s Student Assessments

Joseph Hood

Feb. 9, 2021

Few disagreed when Secretary DeVos waived federal requirements for annual student academic testing after COVID-19 arrived in the U.S. in Spring 2020. But the U.S. education community is now deeply divided over whether students should have to take assessments in Spring 2021. Most parents, as well as some academics and educators (53 percent, according to a recent Educators for Excellence survey), are against standardized testing this year. They argue that the test results will at worst be unreliable and at best demonstrate what we largely already know intuitively: students have learned less during the COVID pandemic.

Other academics, educators, and state and federal education leaders—as well as most of the leading national civil rights and education advocacy organizations—contend that statewide testing is necessary to better understand the full impact of COVID. In particular, they argue that two consecutive years without testing would be a damaging lapse in data collection for understanding the state of education for our most underprivileged communities. It appears that most states are following this line of thinking. According to recent analysis from the Data Quality Campaign, only five states—Georgia, Michigan, Montana, New York, and South Carolina—have requested waivers from the U.S. Department of Education (ED) to skip assessment altogether this year, although this could change as the Biden administration recently extended the waiver request deadline.

Rather than focusing on whether or not testing should occur, other groups have offered excellent proposals on how to best test students. Assessment recommendations published by the Center for Assessment and the Aspen Institute in October 2020 included reducing test length to focus on prioritized standards; testing students in-person (given adequate safety protocols are followed); and expecting lower than usual test participation rates and contextualizing results accordingly. They also recommend testing a representative sample of students rather than testing every child, and specifically highlight the matrix sampling approach used for the National Assessment of Educational Progress (NAEP) as a good model for states to follow. The Council of Chief State School Officers (CCSSO) offered a similar proposal with two additions: expanding testing windows and implementing grade-band testing. The latter would reduce the number of students states would need to test, minimizing infection risks while still allowing states to collect more representative data.

However, these groups attached two weighty conditions to most of their proposed modifications:

States needed to start planning for adaptations to assessments in fall 2020.
States should test only if the assessment could be administered in a reliable and valid manner.

The fact that the testing debate is reigniting less than two months from when most state testing would typically begin makes it unlikely that the first condition can be met, unless modifications are in the works already.

And if states treat statewide summative assessments the same as in any other year, it’s hard to believe that they can meet the second condition, that test administration be reliable and valid. This is because reliable and valid statewide summative assessment administration rests on several assumptions beyond the quality of the assessment itself: that schools across the state have delivered instruction in relatively similar ways to each other (they haven’t); that students have generally received the same quality of instruction (also, no); and that the exams will be delivered in comparable testing environments (the possibility that some assessments could be delivered remotely makes this impossible).

Perhaps most concerning is the real possibility that any assessment will underrepresent the student populations most affected by the pandemic, thousands of whom are unaccounted for. Researchers who examined student results in the fall 2020 NWEA MAP Assessment, one of the few nationwide tests administered since the pandemic began, noted that “[m]issingness matters”: 15 percent of students tested in fall 2019 were nowhere to be found in their fall 2020 comparative data, and “a larger fraction of attriters were ethnic/racial minority students, students with lower achievement in fall 2019, and students in schools with higher concentrations of socioeconomically-disadvantaged students.” In short, the students whom we most need to understand will be the least understood. If such underrepresentation is repeated in spring 2021 statewide assessments, the degree of learning loss will likely be underestimated, and targeted distribution of resources to address learning loss will not reflect true need. Attempting to directly compare overall assessment results with those from previous years is imprudent for the same reasons.

It’s time to accept that the data states have already planned to collect will be the data that we have to work with, like it or not. The debate now needs to pivot from whether or not we should test—or even how we should test—toward how we can most accurately interpret and use the testing data we’re going to get.

ED’s current assessment guidance, which primarily outlines the hoops states have to jump through if they want to make any changes to how assessment results are used for school accountability purposes, as well as the changes that they are and are not authorized to make, is not enough.

Information that contextualizes spring 2021 assessment scores will be essential to helping researchers understand the conditions that led to student performance. During COVID, this includes information on access to remote learning tools like broadband internet and digital devices, length of (or absence of) instruction, and type and amount of instruction provided (for example, proportion of remote to in-person, synchronous to asynchronous, accessible resources, etc).

The recognition that understanding the educational “inputs” students receive can be just as important as (and in the 2020-21 school year, probably more important than) understanding “outputs” such as test scores is not new. As highlighted by the Center for Assessment, researchers and education leaders have been collecting and analyzing Opportunity to Learn (OTL) data for more than 50 years, enabling them to get beyond “what” the data show into “why” those data came to be (i.e., the underlying conditions). Understanding the environment students have been learning in since spring 2020 will allow policymakers to understand not just which schools’ students were most affected, but where resources can best be targeted to address those effects.

An encouraging first step for contextualized data collection came on Friday, when ED announced its “NAEP 2021 School Survey.” The project, initiated by a recent executive order from President Biden, plans to collect nationally representative and state-representative data from approximately 7,000 schools on COVID-19’s impact on students and the status of school reopenings. Government researchers at the Institute of Education Sciences will collect data on:

“The share of the nation's schools that are open with full-time in-person instruction, open with online and in-person instruction, or fully remote.
Enrollment by instructional mode by race/ethnicity, socio-economic status, English learner status, and disability status.
Attendance rates by instructional mode by race/ethnicity, socio-economic status, English learner status, disability status, and housing status.
Frequency of in-person learning for students.
Average number of hours of synchronous instruction for students in remote instruction mode. And,
Student groups prioritized by schools for in-person instruction by selected school characteristics.”

The Biden administration’s effort to collect these data is a stellar example of how to contextualize student experiences during the pandemic, particularly its emphasis on disaggregating information by student characteristics, which can put a spotlight on how COVID likely exacerbated pre-existing educational inequities.

But the Biden administration and Congress could do even more to incentivize quality collection and interpretation of summative assessment results by:

Providing a blanket waiver for states to decouple assessment results from accountability in the 2020-21 school year, but granting very few summative testing waivers to states requesting them. Full waivers from testing would be contingent on the state proving that the collection of any data would be unsafe (as Secretary of Education nominee Dr. Cardona indicated in his confirmation hearing).
Issuing additional non-regulatory guidance from ED focused on how to best collect and interpret statewide assessment data for this school year. In addition to specifying that testing conditions be captured alongside each student’s assessment (e.g., in-school versus remote assessment administration, who is proctoring, etc.), such guidance should outline proven approaches for weighting undercounted populations, as well as methods for best collecting OTL data and using it in conjunction with assessment results to create a fuller picture of students’ educational experiences in 2020-21. The data should be disaggregated by demographic subgroup and should include a sufficient sample of the various student populations most likely to be affected by COVID, as a coalition of organizations recommended in an open letter to Dr. Cardona. Guidance should also indicate that the primary goal of collecting data this year is to ensure the requisite resources are made available to meet students’ full spectrum of needs in the coming school year, not to compare schools or districts to each other in punitive ways. To prevent such comparisons, published data should be stripped of school and district identifiers.
Updating the $1.9 trillion American Rescue Plan (ARP) so that ARP funding used for state testing must also go to collecting and reporting contextualized data on student learning environments. States that used federal funds for summative student assessments would be required to publicly report collected OTL information alongside any reporting of summative assessment results, aligning inputs with outputs. The funds could also be leveraged to finance collecting more locally-contextualized data (a resource-intensive process), and efforts to safely assess as many students as possible.

If the American Rescue Plan passes at its requested funding level, the $130 billion earmarked for K-12 education will be the single largest federal investment in K-12 education in American history. These needed funds will help get students back to school safely, and start to address the learning loss and social-emotional trauma that COVID has wrought. It is imperative that states and districts access the best data possible to equitably distribute these funds to meet the moment, data that the federal government is committing to collect. Further federal guidance and action on making sure that student assessments are conducted and interpreted in ways that are useful, and not harmful, can help ensure that dollars go where they’re most needed.

