My grandmother only eats 3 dishes…or does she?
I have an interesting childhood story to share. Years ago, when I lived in Hong Kong, I would regularly visit my grandmother, who made the same three dishes every time without fail: chicken, fish and tofu. In my (then) young mind, I thought those must be her favourite dishes and that there was nothing else she would eat.
A couple of years later, it suddenly occurred to me that those dishes were intended for ancestral offerings and that my grandmother does eat other dishes after all. She made those dishes because my visits happened to fall on traditional Chinese holidays, the perfect time for making ancestral offerings.
This is just a personal story, but it shows that even though I am a data scientist and well trained to look at data points, I am like everyone else: only human, and inclined to interpret information from my own perspective. It serves as a perfect example of bias in data collection and of the failure to link seasonality to consumer preferences.
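The story above can be sketched in a few lines of Python. This is only an illustration: the meal log, the dish names outside the three from the story, and the `dish_shares` helper are all made up for the example. It shows how an observer who only samples on holidays sees a very different picture from the one in the full data.

```python
from collections import Counter

# Hypothetical meal log: (dish, cooked_on_holiday). All values are invented
# for illustration; only chicken, fish and tofu come from the story.
MEALS = [
    ("chicken", True), ("fish", True), ("tofu", True),
    ("chicken", True), ("fish", True), ("tofu", True),
    ("noodles", False), ("congee", False), ("vegetables", False),
    ("noodles", False), ("beef", False), ("fish", False),
    ("congee", False), ("vegetables", False), ("dumplings", False),
]


def dish_shares(meals):
    """Return each dish's share of the given meals."""
    counts = Counter(dish for dish, _ in meals)
    total = sum(counts.values())
    return {dish: count / total for dish, count in counts.items()}


# What the visitor sees: only the meals cooked on holidays.
observed = dish_shares([meal for meal in MEALS if meal[1]])

# What actually happens across the whole log.
actual = dish_shares(MEALS)

print("Dishes seen on visits:", sorted(observed))
print("Dishes in the full log:", sorted(actual))
```

The holiday-only sample contains just the three offering dishes, while the full log contains several more. Nothing is wrong with the counting itself; the bias comes entirely from when the data was collected.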
This makes us wonder: if we told you that a model, or an algorithm, was going to decide your future based on incomplete and biased data, how would you react? With outrage? Scepticism? Indifference?
What if we told you that this is already a widespread reality, and that it has been the case ever since we started using data to make decisions?
A recent survey by the British Computer Society found that the majority of people in the UK (53% of adults) do not trust algorithms to make decisions about them. Yet the reality is that our day-to-day lives are already guided by models and algorithms.
From algorithms that decide what grades students would get, to facial recognition systems used by law enforcement agencies, we should at least ask whether these models and algorithms are using data in a fair and transparent way.
Bias is persistent. Human bias comes in many forms and can creep into machine learning models via unrepresentative data, creating a vicious cycle of automated bias.
The book Invisible Women: Exposing Data Bias in a World Designed for Men by Caroline Criado-Perez explores how policies, medical research, technology and other aspects of the world are largely built for and by men, and exposes the gender data gap in which half of the population is systematically ignored.
Using rule-based algorithms to support decision-making is commonplace. Following a ruling by the European Court of Justice (ECJ), car insurance firms have not been allowed to charge men and women different premiums on the basis of gender since 2012. The ECJ’s reasoning was that taking customers’ gender into account contradicted laws on discrimination. Yet, according to market comparison tools, male drivers have actually been paying higher premiums since then.
This raises the question: If algorithms treat people less equally, will that make things fairer for everyone?
Gender bias is, of course, not the only challenge we face. A more recent example of data bias in real life is the UK’s GCSE exam grades fiasco. An algorithm that used each school’s distribution of exam grades from previous years as the main factor in assigning grades was scrapped after a public outcry.
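To see why leaning on a school's past grade distribution can be unfair to individuals, here is a deliberately simplified sketch. It is not the algorithm that was actually used; the `assign_grades` function, the student names, the grade labels and the shares are all hypothetical. It assumes students are ranked within their school and then slotted into grades so that the cohort matches the school's historical shares.

```python
def assign_grades(ranked_students, historical_shares):
    """Assign grades by rank so the cohort mirrors past grade shares.

    ranked_students: names ordered from strongest to weakest.
    historical_shares: (grade, share) pairs from best grade to worst,
        with shares that sum to 1.
    """
    n = len(ranked_students)
    grades = {}
    cumulative = 0.0
    start = 0
    for grade, share in historical_shares:
        cumulative += share
        # Number of students who should have this grade or better.
        end = round(cumulative * n)
        for student in ranked_students[start:end]:
            grades[student] = grade
        start = end
    # Anyone left over because of rounding receives the lowest grade.
    for student in ranked_students[start:]:
        grades[student] = historical_shares[-1][0]
    return grades


# Hypothetical school where, historically, 20% of students got a 7,
# 40% got a 5 and 40% got a 4.
history = [("7", 0.2), ("5", 0.4), ("4", 0.4)]
students = ["Ann", "Ben", "Cat", "Dan", "Eve"]

print(assign_grades(students, history))
```

In this toy setup only one of the five students can receive the top grade, however strong this year's cohort is. The school's past results, rather than the individual student's work, cap what each student can achieve.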
So, if awareness of bias is missing from the very beginning, at the data collection stage, any model we build on that biased data will produce biased results and decisions. That is why, at Leap Beyond, a critical step in our product delivery lifecycle is educating end users about the possibility of such bias.
If your organisation is trying to build better awareness of bias in data and how to mitigate it, speak to us about a training course!