Not Too Big to Fail: Big Data in Higher Education

Nov. 3, 2016

Open educational resources have replaced many large, expensive textbooks. Online education has stepped in to provide an alternative to traditional brick-and-mortar colleges with their leafy quads. And faculty have all but left chalkboards behind with the advent of new and emerging digital tools. But colleges aren't stopping innovation there. Many are using predictive analytics to educate and support students.

Predictive analytics—using years of historical data about hundreds of thousands of students to predict future events—are making it easier and faster for colleges to decide which students to enroll and how to get them to graduation. Big data—huge amounts of information stored in one place—bolsters those approaches.

But big data decision-making may not be as innovative as imagined. That’s because it can contribute to discriminatory practices and institutional opacity, while also jeopardizing individuals’ privacy.

Big data is not inherently bad. For example, it can identify students who need extra support, steer students toward courses in which they will do well, and provide digital tools that customize the learning process for individual students. However, as Iris Palmer and I explore in our recently released paper, The Promise and Peril of Predictive Analytics in Higher Education: A Landscape Analysis, big data is still developed and analyzed by inherently biased human beings, and so it raises ethical concerns.

Even the best educational implementations of big data are complicated. Consider, for example, the case of Austin Peay State University (APSU), a four-year public university in Tennessee. APSU has a course recommender system that draws inspiration from Netflix, Amazon, and Pandora to match students with the courses and majors in which they are most likely to do well. It even predicts what grade students will receive. The tool does this by comparing over 100,000 past students’ grades with the transcripts of currently enrolled students. Now, more students, including low-income students, are passing their classes. Three other universities in Tennessee have adopted the system, helping it reach over 40,000 students.

In many ways, then, this is a successful system. It is not, however, consequence-free. Recommender systems have no choice but to rely on past data to make predictions, and because that data often includes a student’s race, socioeconomic status, and gender, they can generate recommendations based on these factors. So if a college’s past data shows that minority and low-income students don’t do well in math or science, recommender systems could suggest these students try English or history, even if they are aspiring mathematicians. Austin Peay’s system is less problematic because it doesn’t use students’ race, income, or gender to predict grades, but many other systems do include such data.

There are other issues with colleges using big data. Often, it is not clear to those on campus why big data tools are used, how they were built (typically with private companies), or what their limitations are. For example, does a predicted C in Introduction to Engineering mean a student will also get a C in subsequent engineering classes? And how should a college convene key staff to decide whether it should use a big data tool, and which tool in particular?

With tools capable of determining a student’s academic future, transparency should be front and center, never pushed into the background. Unfortunately, big data also makes it harder to protect people’s privacy. As colleges collect more data to help students and institutions meet their goals, it may become less and less of a priority for colleges to make sure students and staff are aware of the many ways data on them is collected and mined for new information. Institutions often assume that upon enrolling or accepting employment, students and staff consent to analysis of their data. Since this may not be the case, a college risks leaving students and staff feeling that their privacy has been infringed upon.

Colleges should be using technology to innovate—this is not in question. But how they’re using it, what the consequences of it may be, and how it impacts students of all backgrounds—that should be. We need to not only imagine higher education as innovative; we also need to interrogate what the imagined looks like in reality.