What do you do when you can't sleep?
Sometimes I watch replays of NBA games, (how about my Knicks?), and sometimes I read papers and articles that I had been meaning to get to, but for one reason or another hadn't made the time.
That is how I spent an hour or so with 'Detecting Bias in Black-Box Models Using Transparent Model Distillation', a recently published paper by researchers at Cornell, Microsoft, and Airbnb. I know, not exactly 'light' reading.
Full disclosure: I don't profess to have understood all the details and complexity of the study and its research methods, but the basic premise of the research, and the problem the researchers are looking to solve, is one I do understand, and one that you should understand too as you think about incorporating AI technologies into workplace processes and decision support/making.
Namely, that AI technology can only be as good and as accurate as the data it’s trained on, and in many cases we end up incorporating our human biases into algorithms that have the potential to make a huge impact on people’s lives - like decisions about whom to hire and promote and reward.
In the paper, the researchers created models that mimic the ones some companies use to produce 'risk scores' - the kind of score a bank uses to decide whether or not to give someone a loan, or a parole board uses to decide whether or not to grant someone early release. This first set of models stands in for the black-box models that these companies actually use.
Then the researchers created a second, transparent model that is trained on the actual outcomes the first set of models is designed to predict - whether or not the loans were paid back, and whether or not the parolee committed another crime. Importantly, these transparent models did include data points that most of us, especially in HR, are trained to ignore - things like gender, race, and age. The researchers did this intentionally, and rather than try to explain why that is important myself, read through this section of the paper where they discuss the need to assess these kinds of 'off-limits' data elements, (emphasis mine):
Sometimes we are interested in detecting bias on variables that have intentionally been excluded from the black-box model. For example, a model trained for recidivism prediction or credit scoring is probably not allowed to use race as an input to prevent the model from learning to be racially biased. Unfortunately, excluding a variable like race from the inputs does not prevent the model from learning to be biased. Racial bias in a data set is likely to be in the outcomes — the targets used for learning; removing the race input variable does not remove the bias from the targets. If race was uncorrelated with all other variables (and combinations of variables) provided to the model as inputs, then removing the race variable would prevent the model from learning to be biased because it would not have any input variables on which to model the bias. Unfortunately, in any large, real-world data set, there is massive correlation among the high-dimensional input variables, and a model trained to predict recidivism or credit risk will learn to be biased from the correlation between other input variables that must remain in the model (e.g., income, education, employment) and the excluded race variable because these other correlated variables enable the model to more accurately predict the (biased) outcome, recidivism or credit risk. Unfortunately, removing a variable like race or gender does not prevent a model from learning to be biased. Instead, removing protected variables like race or gender makes it harder to detect how the model is biased because the bias is now spread in a complex way among all of the correlated variables, and also makes correcting the bias more difficult because the bias is now spread in a complex way through the model instead of being localized to the protected race or gender variables. The main benefit of removing a protected variable like race or gender from the input of a machine learning model is that it allows the group deploying the model to claim (incorrectly) that the model is not biased because it did not use the protected variable.
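To make that passage concrete, here is a small synthetic sketch - my own illustration, not the researchers' models or data, with made-up feature names and numbers - of how a model trained without a protected attribute can still produce group-skewed scores when the remaining inputs correlate with that attribute and the historical targets carry the bias.

```python
# Synthetic illustration only - not the paper's models or data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical protected attribute (1 = member of a protected group).
group = rng.integers(0, 2, size=n)

# 'Allowed' inputs that happen to correlate with group membership -
# stand-ins for things like income, education, or employment history.
income = 50 + 10 * (1 - group) + rng.normal(0, 5, size=n)
years_education = 12 + 2 * (1 - group) + rng.normal(0, 1, size=n)

# Historical outcome with bias baked into the targets themselves:
# the protected group was approved less often, all else being equal.
logit = 0.05 * income + 0.3 * years_education - 1.5 * group - 6.0
approved = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Train a model that never sees the protected attribute.
X = np.column_stack([income, years_education])
model = LogisticRegression(max_iter=1000).fit(X, approved)
scores = model.predict_proba(X)[:, 1]

# The scores still split sharply by group: the bias in the targets leaked
# back in through the correlated inputs, which is exactly the paper's point.
print("mean score, protected group:", round(scores[group == 1].mean(), 3))
print("mean score, everyone else:  ", round(scores[group == 0].mean(), 3))
```

The specific numbers don't matter; what matters is that excluding the variable did not exclude the bias, it just made it harder to see.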
This is really interesting, if counter-intuitive to how most of us, (me for sure), would think about how to ensure that the AI and algorithms we want to deploy aren't biased when they evaluate data sets for a process meant to provide decision support for the 'Who should we interview for our software engineer opening?' question.
I'm sure we've seen or heard about AI for HR solutions that profess to eliminate biases like the ones that have existed around gender, race, and even age from important HR processes by 'hiding' or removing the indicators of such protected and/or under-represented groups.
This study suggests that removing those indicators from the process and the design of the AI is exactly the wrong approach - and that large data sets and the AI itself can and will 'learn' to be biased anyway.
Really powerful and interesting stuff for sure.
As I said, I don't profess to get all the details of this research, but I do know this: if I were evaluating an AI for HR tool for something like hiring decision support, I would probably ask these questions of a potential provider:
1. Do you include indicators of a candidate's race, gender, age, etc. in the AI/algorithms that you apply in order to produce your recommendations?
If their answer is 'No, we don't include those indicators.'
2. Then, are you sure that your AI/algorithms aren't learning how to figure them out anyway, i.e., are still potentially biased against under-represented or protected groups?
Important questions to ask, I think - and one rough way to start probing the second one is sketched below.
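To be clear about what I mean by probing question 2: here is a rough, hypothetical sketch of the kind of check you (or the provider) could run, assuming you can get the tool's scores alongside your own candidate data. The function names and inputs are my own inventions - not any vendor's API, and not the paper's method.

```python
# Hypothetical due-diligence checks - a sketch, not any vendor's API or the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def proxy_leakage_auc(allowed_features, protected_attribute):
    """How well can the 'allowed' inputs predict the protected attribute?
    An AUC near 0.5 suggests little leakage; well above 0.5 means proxies exist."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, allowed_features, protected_attribute,
                           cv=5, scoring="roc_auc").mean()


def score_gap(tool_scores, protected_attribute):
    """Difference in the tool's average recommendation score across groups."""
    tool_scores = np.asarray(tool_scores)
    protected_attribute = np.asarray(protected_attribute)
    return (tool_scores[protected_attribute == 1].mean()
            - tool_scores[protected_attribute == 0].mean())
```

Neither number proves or disproves bias on its own, but a high leakage score plus a large gap in recommendations across groups is exactly the pattern the paper warns about - and a good starting point for a harder conversation with the provider.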
Back to the study, (in case you don't slog all the way through it): the researchers concluded that for both of the real-world risk-scoring systems they examined, (loan approvals and parole approvals), the existing models still exhibited biases that their creators professed to have 'engineered' away. And chances are, had the researchers trained their sights on one of the HR processes where AI is being deployed, they would have found the same thing.
Have a great day!