It may not feel that way outside, but the U.S. Department of Education met its fall deadline for releasing more details of its college ratings plan with one day to spare.
The 17-page “framework” and accompanying blog post are sure to be met with disappointment from those expecting a list of schools for naming and shaming, or even enough data to start generating theoretical lists (though I’m sure institutional researchers across the country are being asked to try to do so anyway).
The document released today does present some forward progress in thinking. There’s some greater structure given to initial performance levels and college groupings. Some measures are more concretely laid out. Others have been taken off the table. These are all steps that need to happen.
But making those choices exposes what happens when you take a complicated, data-oriented process and run it through the transparency and justification requirements that the federal government must meet (requirements that private ratings have the luxury of avoiding). Pair that with the need for a communications strategy and you end up with some useful elements of the sausage-making process encased in talking points and framing.
Two Groups, Three Tiers
So let’s ignore the framing and talk about what’s really in here. At a basic level, two main choices were made. Colleges would be placed into two groups: those that primarily award four-year degrees and those that primarily award two-year degrees. Colleges that don’t award degrees (such as cosmetology schools) would not be rated and would instead be picked up through the gainful employment rules. That’s a smart exclusion, since certificates can inflate graduation rates.
Schools would then be grouped into three tiers: low-performing, high-performing, and the middle, with the bulk of schools ending up in the middle tier. The Department really needs a better name for that middle tier (how about “acceptable”?), but it’s good to see a simpler grouping.
From the start the Department has talked about the ratings system being based on access, affordability, and outcomes. But those are concepts, not actual measures. The framework provides a little more detail about what some of these would be.
A number of suggested metrics are what one would expect: the percentage of students receiving Pell Grants, net price by income, transfer rates, and some form of completion rate. Others are new, but pretty easy to understand, such as graduate school enrollment or the percentage of students coming from different income quintiles. The most unfamiliar one is an “EFC gap,” which would be another way of getting at socioeconomic diversity by looking at the percentage of students whose expected family contribution was below a certain level.
Earnings and completion in particular warrant more discussion.
Earnings are probably the most controversial element in the ratings system, and the Department’s proposal appears to seek a middle ground on how to use them. It suggests a measure of “substantial employment” for judging success a few years after college that would look at the percentage of students whose incomes exceed some basic test, like being above 200 percent of the poverty level ($23,340) or what a full-time minimum wage job pays in a year ($15,080). Mean and median earnings of graduates would eventually be used, but as a measure of long-term performance in the labor market, such as after 10 years.
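To make the “substantial employment” idea concrete, here is a rough sketch of how such a rate might be computed. The two dollar thresholds come from the figures above; the function name, the cohort, and the choice to count only earnings strictly above the threshold are assumptions for illustration, not anything the Department has specified.

```python
# Thresholds quoted in the text; everything else below is hypothetical.
POVERTY_200_PCT = 23_340      # 200 percent of the poverty level
FULL_TIME_MIN_WAGE = 15_080   # a year of full-time work at minimum wage


def substantial_employment_rate(earnings, threshold):
    """Return the share of former students whose earnings exceed the threshold."""
    if not earnings:
        raise ValueError("need at least one earnings record")
    # Assumption: "exceed" means strictly greater than the threshold.
    above = sum(1 for amount in earnings if amount > threshold)
    return above / len(earnings)


# Made-up annual earnings for one hypothetical cohort of six former students.
cohort = [12_000, 18_500, 24_000, 31_000, 15_080, 40_000]

rate_poverty = substantial_employment_rate(cohort, POVERTY_200_PCT)
rate_min_wage = substantial_employment_rate(cohort, FULL_TIME_MIN_WAGE)

print(f"Above 200% of poverty: {rate_poverty:.0%}")
print(f"Above full-time minimum wage: {rate_min_wage:.0%}")
```

The same made-up cohort produces different rates under the two thresholds, which is why the choice of test matters as much as the decision to use earnings at all.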
Completion rates are also tough, but because of data quality concerns rather than controversy. The Department notes that the existing federal graduation rate is limited to results for students attending college for the first time as full-time students. Here the Department’s timeframe works against it. Improved graduation rate data will come online starting in 2017, but it will take several years after that before the data can be used to judge performance trends. Even then, the data still will not be a sufficient replacement for the much-needed federal student unit record system.
The Department’s suggested alternative is to generate completion rates from the National Student Loan Data System (NSLDS), which has individual-level information on students who receive federal aid. New America participated in a technical panel convened by the National Center for Education Statistics to explore this idea earlier this year, but it was generally rejected because there would not be a good way to create meaningful cohorts and because the approach would exclude students who didn’t receive federal aid.
The biggest surprises among the metrics are in the consideration of federal student aid. The Department rejected the idea of looking at student indebtedness levels, one of the few measures on the College Scorecard. And the existing cohort default rate gets a tepid mention. Instead, the Department suggests looking at some form of a loan repayment rate, including some versions similar to what we’ve argued for in the past.
Finally, the Department explicitly discarded some measures for being infeasible. These include: learning outcomes; indicators of public service and civic engagement; placement, licensure, and certification rates; measures of noncognitive skills; and student satisfaction.
Lots of Options, but Missing the Most Important One
The rest of the document shows where the Department still hasn’t been able to sort out thorny issues. These include how to reflect improvement, how to adjust for institutional characteristics, and whether colleges should be rated separately on each measure or given one combined result. Many of these questions have been asked about the process from the start.
These different choices show why federal ratings are extremely difficult to construct. The Department has an obligation to be transparent and thoughtful in listening to responses. But it also has to make choices. Failing to do so results in a never-ending firehose of options that get repeated ad nauseam.
In making those choices, the Department needs to recognize and own that no system is going to be perfect, some people will be unhappy, and it’s inevitable that the ratings will not capture every last nuance and element. Those aren’t deal breakers; they’re the natural result of any measurement or accountability system. What’s important is recognizing those limitations and not trying to blow past them.
Fortunately, there’s one simple choice that could alleviate many of these problems: limiting results to students getting federal aid. NSLDS has fairly robust data on students getting aid, and fewer of the limitations on completion and earnings data that apply when trying to measure all students. Concerns about the effectiveness of student aid dollars are also the clearest hook for justifying the need for a federal ratings system. And limiting results to students getting federal aid would enhance, rather than detract from, the consideration of what happens to the lowest-income students. That’s the type of unique federal role that would make a ratings system worthwhile.