Organizations are increasingly using machine learning models to allocate scarce resources or opportunities. For example, such models can help companies sift through CVs to select candidates for interviews or help hospitals rank kidney transplant patients based on their likelihood of survival.
When implementing a model, users typically try to ensure that its predictions are fair by reducing bias. This often involves techniques such as adjusting the features the model uses to make decisions or calibrating the outputs it produces.
But researchers from MIT and Northeastern University argue that these fairness methods are not sufficient to address structural injustices and inherent uncertainties. In a new paper, they show how randomizing a model’s decisions in a structured way can improve fairness in certain situations.
For example, if many companies use the same machine learning model to rank interview candidates deterministically, without any randomness, then one deserving person might be the lowest-rated candidate for every position, perhaps because of the way the model weights responses to an online form. Introducing randomness into the model’s decisions could prevent one deserving person or group from always being denied a scarce resource like a job interview.
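The interview scenario above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the candidate names, the scores, the number of companies, and the choice to weight the lottery directly by score are all hypothetical and are not taken from the paper.

```python
import random

def deterministic_selection(scores, k):
    """Pick the k highest-scoring candidates, with no randomness."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def lottery_selection(scores, k, rng):
    """Pick k distinct candidates; each draw is weighted by score."""
    pool = dict(scores)
    chosen = set()
    for _ in range(k):
        names = list(pool)
        weights = [pool[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.add(pick)
        del pool[pick]  # a candidate can be selected only once per company
    return chosen

def count_interviews(select, scores, n_companies, k):
    """Count interviews per candidate when every company uses the same rule."""
    counts = {name: 0 for name in scores}
    for _ in range(n_companies):
        for name in select(scores, k):
            counts[name] += 1
    return counts

# Hypothetical scores produced by one shared model (assumption).
scores = {"A": 0.90, "B": 0.85, "C": 0.80, "D": 0.78}
rng = random.Random(0)

det = count_interviews(deterministic_selection, scores, 50, 2)
lot = count_interviews(lambda s, k: lottery_selection(s, k, rng), scores, 50, 2)

print("deterministic:", det)
print("lottery:      ", lot)
```

Under the deterministic rule, candidates C and D receive no interviews at any of the 50 companies even though their scores are close to the others. Under the lottery, every candidate has a nonzero chance at each company.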
Through their analysis, the researchers found that randomization can be particularly beneficial when the model’s decisions are subject to uncertainty or when the same group consistently receives negative decisions.
They present a framework that can be used to introduce a certain amount of randomness into model decisions by allocating resources using a weighted lottery. This method, which an entity can adapt to its situation, can improve fairness without compromising the model’s performance or accuracy.
“Even if fair predictions could be made, should these social allocations of scarce resources or opportunities be decided solely on the basis of scores or rankings? As things scale and we see more and more options being determined by these algorithms, the inherent uncertainties in these outcomes can become more severe. We show that fairness may require some form of randomization,” says Shomik Jain, a graduate student at the Institute for Data, Systems, and Society (IDSS) and lead author of the paper.
Jain is joined on the paper by Kathleen Creel, assistant professor of philosophy and computer science at Northeastern University; and senior author Ashia Wilson, the Lister Brothers Career Development Professor in the Department of Electrical Engineering and Computer Science and a principal investigator in the Laboratory for Information and Decision Systems (LIDS). The research will be presented at the International Conference on Machine Learning.
Considering claims
This work builds on a previous paper in which the researchers explored the harms that can occur when deterministic systems are used at scale. They found that using a machine learning model to deterministically allocate resources can amplify inequalities present in the training data, which can reinforce bias and systemic inequality.
“Randomization is a very useful concept in statistics, and we are pleased that it meets the requirements of fairness from both a systemic and an individual perspective,” Wilson says.
In this paper, the researchers explored the question of when randomization can improve fairness. They based their analysis on the ideas of philosopher John Broome, who wrote about the value of using lotteries to allocate scarce resources in a way that honors all of the claims of individuals.
A person’s claim to a scarce resource, such as a kidney transplant, may be based on merit, deservingness, or need. For example, everyone has a right to life, and a claim to a kidney transplant may be based on that right, Wilson explains.
“Once you recognize that people have different claims to these limited resources, fairness requires that we honor all of an individual’s claims. If we always give the resource to someone with a stronger claim, is that fair?” Jain says.
This type of deterministic allocation can cause systemic exclusion or exacerbate patterned inequalities, which occur when receiving one allocation increases the probability of an individual receiving future allocations. Furthermore, machine learning models can make mistakes, and deterministic approaches can cause the same mistake to be repeated.
Randomization can overcome these problems, but this does not mean that all decisions made by the model should be equally random.
Structured randomization
The researchers use a weighted lottery to adjust the level of randomization based on the amount of uncertainty involved in the model’s decision. A decision that is less certain should involve more randomization.
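One way to picture a weighted lottery whose randomness grows with uncertainty is sketched below. It is not the paper's method: the softmax weighting, the `temperature` parameter standing in for the amount of uncertainty, and the patient scores are all assumptions chosen for illustration.

```python
import math
import random

def lottery_weights(scores, temperature):
    """Turn scores into lottery weights that sum to 1.

    A temperature near zero approaches a strict ranking; a larger
    temperature (more uncertainty) spreads the weights more evenly.
    """
    top = max(scores.values())
    if temperature <= 0:
        # No uncertainty: only the top-scoring candidate(s) can win.
        winners = [name for name, s in scores.items() if s == top]
        return {name: (1.0 / len(winners) if name in winners else 0.0)
                for name in scores}
    # Subtract the top score before exponentiating for numerical stability.
    exps = {name: math.exp((s - top) / temperature)
            for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

def weighted_lottery(scores, k, temperature, rng):
    """Draw k distinct winners, recomputing weights after each draw."""
    pool = dict(scores)
    winners = []
    for _ in range(min(k, len(pool))):
        weights = lottery_weights(pool, temperature)
        names = list(weights)
        pick = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        winners.append(pick)
        del pool[pick]
    return winners

# Hypothetical projected benefit, in years, for three patients (assumption).
scores = {"P1": 12.0, "P2": 11.5, "P3": 6.0}
rng = random.Random(1)

certain = weighted_lottery(scores, 1, 0.0, rng)    # no randomization
uncertain = weighted_lottery(scores, 1, 2.0, rng)  # randomized draw
low = lottery_weights(scores, 0.5)
high = lottery_weights(scores, 5.0)
```

With a temperature of zero the allocation reduces to picking the top score. As the temperature rises, P2, whose score is close to P1's, gets a substantial share of the weight, while P3 stays unlikely unless the uncertainty is large.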
“In kidney allocation, planning is usually based on life expectancy, and that is deeply uncertain. If two patients are just five years apart, that becomes much more difficult to measure. We want to use that level of uncertainty to adjust randomization,” Wilson says.
The researchers used methods to quantify statistical uncertainty to determine how much randomization is needed in different situations. They show that calibrated randomization can lead to fairer outcomes for individuals without significantly affecting the utility or effectiveness of the model.
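A rough sketch of how an uncertainty estimate could set lottery odds, and of how small the utility cost can be, follows. The article does not say which uncertainty quantification method the researchers used, so the resampled predictions below (for example, from bootstrap resamples or an ensemble), the candidate names, and the utility measure are all hypothetical.

```python
import statistics

def selection_probabilities(prediction_sets, k):
    """Fraction of resamples in which each candidate ranks in the top k.

    prediction_sets is a list of dicts mapping candidate name to a
    predicted score, one dict per resample.
    """
    names = list(prediction_sets[0])
    counts = dict.fromkeys(names, 0)
    for preds in prediction_sets:
        for name in sorted(names, key=preds.get, reverse=True)[:k]:
            counts[name] += 1
    return {name: counts[name] / len(prediction_sets) for name in names}

def expected_utility(probabilities, mean_scores):
    """Expected score of the recipient under the given lottery odds (k = 1)."""
    return sum(probabilities[n] * mean_scores[n] for n in probabilities)

# Hypothetical predictions from four resamples (assumption).
prediction_sets = [
    {"A": 10.0, "B": 9.0, "C": 4.0},
    {"A": 9.0, "B": 10.0, "C": 5.0},
    {"A": 11.0, "B": 9.5, "C": 4.5},
    {"A": 9.5, "B": 10.5, "C": 3.5},
]

mean_scores = {name: statistics.mean(p[name] for p in prediction_sets)
               for name in prediction_sets[0]}
probs = selection_probabilities(prediction_sets, k=1)

deterministic_utility = max(mean_scores.values())   # always pick the top mean
lottery_utility = expected_utility(probs, mean_scores)
utility_cost = deterministic_utility - lottery_utility
```

In this toy data, A and B each come out on top in half of the resamples, so each gets a 0.5 chance, and C gets none. The expected utility falls from 9.875 to 9.8125, a cost of 0.0625, which illustrates how a lottery calibrated to uncertainty can give up very little when the top candidates are hard to tell apart.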
“There is a balance to be struck between overall utility and respect for the rights of individuals receiving limited resources, but often the trade-off is relatively small,” Wilson says.
However, the researchers emphasize that there are situations in which randomizing decisions would not improve fairness and could even harm individuals, for example in the context of the criminal justice system.
But there may be other areas where randomization could improve fairness, such as college admissions, and the researchers plan to explore other use cases in future work. They also want to investigate how randomization might affect other factors, such as competition or pricing, and how it could be used to improve the robustness of machine learning models.
“We hope that our paper is a first step toward illustrating that randomization can be beneficial. We’re offering randomization as a tool. How much you want to do that will depend on all the stakeholders in the allocation. And of course, how they make that decision is a whole other research question,” Wilson says.