Ensure Transparent Use of Data

One of the biggest frustrations for colleges choosing a predictive analytics vendor is trying to peer into the company's algorithms, which are all too often a black box. Administrators want to know how the system arrives at its predictions but often do not know what level of explanation is reasonable to expect. This confusion is especially acute when administrators without a technical background talk to salespeople who do not entirely understand the system themselves. The resulting information asymmetry between the vendor's technical staff and the college can lead to misunderstandings and bad decisions. Colleges have an ethical obligation to understand enough about how the algorithms they apply to their students work to ensure some accountability. Here is what colleges should insist on knowing about a vendor's algorithms:

Ask About Data Used in the Prediction

Ask the vendor to provide a list of all the data elements it uses in its predictions. With this information, the college can assess how sensitive each data element is and document where that data is stored. The documentation of data elements also provides an important baseline for what the predictions are based on. Vendors should commit to providing the school with an updated list of the variables they use and to alerting the institution if those variables change.
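The inventory does not need to be elaborate; even a simple structured list that records each element, where it lives, and how sensitive it is gives the college a baseline to compare against the vendor's updates. A minimal sketch, with hypothetical element names and classifications:

```python
# Hypothetical data-element inventory; names, sources, and labels are examples only.
data_elements = [
    {"element": "high_school_gpa",  "source": "SIS",                  "sensitivity": "moderate"},
    {"element": "pell_eligibility", "source": "financial aid system", "sensitivity": "high"},
    {"element": "lms_login_count",  "source": "LMS",                  "sensitivity": "low"},
]

# A simple report the college can compare against the vendor's current variable list.
for item in data_elements:
    print(f"{item['element']:<18} stored in {item['source']:<22} sensitivity: {item['sensitivity']}")
```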

Ask About Training Data

Colleges should ask vendors whether they use data from other colleges to inform their models. Some vendors pool data from across multiple schools to make their models more robust, but it is important to note that one way bias creeps into algorithms is through non-representative training data. Ask vendors to document the diversity of the types of data, the colleges, and the students that helped train their models. Colleges should also ask how vendors plan to customize the algorithm to their specific school. Many vendors build the model with only the college's own data. In that case, administrators should ask how far back the training data goes: it is important to balance keeping the data recent enough to reflect the current college population against ensuring the sample is large enough to train the algorithm. College administrators should also confirm that the makeup of the college population has not changed significantly over that window.
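One way to check that last point is to compare the composition of each training cohort over time. A minimal sketch with pandas, assuming a student-level table with a cohort year and a demographic group column (both hypothetical names):

```python
import pandas as pd

# Hypothetical student-level extract; column names and values are examples only.
students = pd.DataFrame({
    "cohort_year": [2018, 2018, 2019, 2019, 2020, 2020, 2021, 2021],
    "group":       ["A",  "B",  "A",  "B",  "A",  "B",  "A",  "B"],
})

# Share of each group within each cohort year; large shifts suggest the
# training window no longer reflects the current population.
composition = (
    students.groupby("cohort_year")["group"]
    .value_counts(normalize=True)
    .unstack(fill_value=0)
)
print(composition)
```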

Ask How Effective the Model Is

Colleges should know that overall accuracy rates can be misleading and easily manipulated. Instead, colleges should ask vendors for an AUC (area under the curve), also called a C statistic. The C statistic measures how well the model separates the two outcomes: it is the probability that the model ranks a randomly chosen student who had the outcome above a randomly chosen student who did not. For most models, colleges should be looking for a C statistic of .75 or greater.1 But watch out: if the C statistic is .99 or higher, there is most likely a problem with the model, such as leakage (where the data used to train the algorithm includes the outcome you are trying to predict) or overfitting (where the model is so tuned to the sample data that it fails to generalize to the larger population). Colleges can also ask for a Brier score.2 A Brier score measures the accuracy of a probability forecast, like a weather forecaster saying there is an 80 percent chance of rain. On this scale, a lower score is better; anything under .22 is a solid model. Administrators should also bring colleagues with technical experience into these conversations. Vendors should commit to presenting these measures, and the overall results of the model, at regular intervals, such as every time they rebuild the model.
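Staff with access to a holdout set of actual outcomes and the vendor's predicted probabilities can verify both measures directly. A minimal sketch using scikit-learn (the outcome coding and numbers below are illustrative):

```python
from sklearn.metrics import roc_auc_score, brier_score_loss

# Hypothetical holdout data: 1 = student did not persist, 0 = student persisted.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
# Vendor's predicted probabilities of non-persistence for the same students.
y_prob = [0.85, 0.30, 0.20, 0.70, 0.40, 0.65, 0.15, 0.25]

c_statistic = roc_auc_score(y_true, y_prob)   # look for >= .75; be wary of >= .99
brier = brier_score_loss(y_true, y_prob)      # lower is better; under .22 is solid

print(f"C statistic (AUC): {c_statistic:.2f}")
print(f"Brier score: {brier:.2f}")
```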

Ask Vendors to Test Their Algorithms

Some vendors will agree to conduct a pilot to test their algorithm with the college's data. A pilot can help reveal whether the system is discriminatory, for example by overidentifying certain groups of students on campus as “at risk,” or whether it simply flags all of the college's students as “at risk.” A pilot project can also highlight where the integration of the institution's data may need to be improved. Before signing a long-term contract, college administrators should ask vendors whether they are willing to conduct a pilot. Some vendors may decline because a pilot requires doing nearly all of the data work up front to show what they are capable of. In that case, administrators should ask for a de-identified example of an analysis the vendor did for a comparable school.
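Two quick checks during a pilot are whether the overall flag rate is plausible and whether the vendor actually received complete data from the institution's systems. A minimal sketch with pandas, using hypothetical column names:

```python
import pandas as pd

# Hypothetical pilot output joined back to the institution's own records.
pilot = pd.DataFrame({
    "student_id":      [1, 2, 3, 4, 5, 6],
    "at_risk":         [1, 0, 1, 1, 0, 1],
    "high_school_gpa": [3.1, None, 2.4, None, 3.6, 2.9],  # gaps may signal integration problems
})

print(f"Share flagged at risk: {pilot['at_risk'].mean():.0%}")  # near 0% or 100% is a red flag
print(pilot.isna().mean())  # share of missing values per data element
```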

Ask for a Disparate Impact Analysis

Vendors need to agree that they will conduct a disparate impact analysis of their tool's output after they are awarded the contract. If the tool does overidentify a certain group as “at risk,” that may be because we live in an imperfect world where certain groups actually are at greater risk of failing. Knowing that a group of students is being overidentified, the college should come up with a plan to focus support on that population. The vendor also needs to ensure that its algorithm is useful in the college's particular context. For example, is the algorithm identifying every student as “at risk,” and if so, how will the vendor fix that?
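One common way to quantify disparate impact, not tied to any particular vendor's method, is to compare each group's flag rate against a reference group; ratios well above 1 indicate a group is being overidentified. A minimal sketch, assuming a scored student table with hypothetical column names:

```python
import pandas as pd

# Hypothetical scored output: one row per student.
scored = pd.DataFrame({
    "at_risk": [1, 1, 0, 1, 0, 0, 1, 1, 1, 0],
    "group":   ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
})

flag_rates = scored.groupby("group")["at_risk"].mean()
reference = flag_rates.min()          # or a designated reference group
disparity = flag_rates / reference    # ratios well above 1 mean a group is overidentified

print(flag_rates)
print(disparity)
```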

Tools for Evaluating the Algorithm

Aequitas: An open source bias audit toolkit to audit machine learning models for discrimination and bias, and make informed decisions around developing and deploying predictive risk-assessment tools.

Algorithmic Impact Assessments: A Practical Framework for Public Agency Accountability. See page 15 for how to conduct a disparate impact analysis.

Ask for the Factors that Contribute to the Prediction

It can be a tough sell to convince staff members who have worked with students for years to trust the predictions of an algorithm, particularly if there is no transparency into why it makes a given prediction. Having vendors list, on the dashboard or interface, the most heavily weighted factors behind the prediction for a particular student can provide transparency that helps users act on the prediction. However, it is important to note that these factors are not causal. For instance, if one reason for the prediction is that the student did not enroll in many credits, that is not necessarily why the student is at risk of dropping out; it may just be an indication of something else going on in the student's life.
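Vendors compute these factors in different ways. For a linear model, one simple approach is to rank each feature by its coefficient times the student's standardized value; the sketch below illustrates that idea with hypothetical features and weights, and is not a description of any particular vendor's method:

```python
import numpy as np

# Hypothetical standardized feature values for one student and a logistic model's weights.
feature_names  = ["credits_enrolled", "lms_logins", "midterm_gpa"]
student_values = np.array([-1.2, -0.4, 0.3])    # standardized values for this student
coefficients   = np.array([-0.9, -0.5, -0.7])   # negative: more of each lowers the risk score

# Contribution of each feature to this student's risk score (higher = pushes toward "at risk").
contributions = coefficients * student_values
top = sorted(zip(feature_names, contributions), key=lambda x: x[1], reverse=True)

for name, value in top:
    print(f"{name:<18} {value:+.2f}")
```

In this toy example, low credit enrollment surfaces as the top factor, which a dashboard could display alongside the prediction while still reminding advisers that it is a correlate, not a cause.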

Ask How Often the Algorithm is Refined

As we noted in Predictive Analytics in Higher Education: Five Guiding Practices for Ethical Use, what colleges learn from their use of predictive analytics should be used to refine the algorithm; there should be a cycle of improvement. That cycle is difficult to sustain if the vendor does not support systematic refinement. Models should generally be rebuilt at least once a year and as often as once a term: rebuilding more frequently than once a term should not be necessary for a solid model, while waiting more than a year risks letting the model degrade.
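A lightweight way to tell whether a rebuild is overdue is to recompute the C statistic on each new term's actual outcomes and watch for a drop. A minimal sketch, with illustrative term labels, numbers, and threshold:

```python
from sklearn.metrics import roc_auc_score

# Hypothetical per-term actual outcomes and the vendor's predicted probabilities.
terms = {
    "fall_2023":   ([1, 0, 1, 0, 1, 0], [0.8, 0.2, 0.7, 0.6, 0.5, 0.4]),
    "spring_2024": ([1, 0, 1, 0, 1, 0], [0.5, 0.6, 0.4, 0.5, 0.6, 0.3]),
}

for term, (y_true, y_prob) in terms.items():
    auc = roc_auc_score(y_true, y_prob)
    note = "  <-- consider a rebuild" if auc < 0.75 else ""
    print(f"{term}: C statistic = {auc:.2f}{note}")
```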

Citations
  1. Statistics How To (website), “Brier Score: Definition, Examples,” source.
  2. See the first issue in the checklist in Texas State University, “Key Issues in Contracting for Information Technology Resources and Services,” Instructional Technologies Support, source.
