To avoid social bias in ML algorithms, it is imperative to continuously check that the training data is well balanced with respect to sensitive social attributes, such as gender, ethnicity, or membership of a specific social group.
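As a minimal sketch of such a balance check, the snippet below computes the share of each group for a sensitive attribute and flags groups that deviate from a uniform split. The attribute name, the record layout, and the tolerance threshold are all illustrative assumptions, not part of any specific library.

```python
from collections import Counter

def group_representation(records, attribute):
    """Share of each group for a sensitive attribute in the training data."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def flag_imbalance(shares, tolerance=0.2):
    """Flag groups whose share deviates from a uniform split by more than `tolerance`."""
    expected = 1 / len(shares)
    return [g for g, s in shares.items() if abs(s - expected) > tolerance]

# Hypothetical toy data: a sample that under-represents one gender group.
data = [{"gender": "female"}] * 20 + [{"gender": "male"}] * 80
shares = group_representation(data, "gender")
print(flag_imbalance(shares))  # both groups deviate by 0.3 from the 0.5 uniform share
```

In practice the acceptable deviation depends on the population the model will serve; a uniform split is only one possible reference distribution.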
In many cases, other data attributes (such as location or neighbourhood) can act as proxies for sensitive social attributes and may introduce latent bias. Using such attributes without testing for latent bias is a common pitfall known as fairness through unawareness. A well-known example is the one-day delivery service offered by Amazon, which turned out to be racially biased; see "Amazon Doesn’t Consider the Race of Its Customers. Should It?".
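One rough way to test a candidate proxy is to measure the statistical association between it and a sensitive attribute: if knowing the proxy reveals a lot about the attribute, dropping the attribute alone does not remove the bias. The sketch below uses mutual information between two categorical variables; the neighbourhood and ethnicity-group values are hypothetical toy data.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information (in bits) between two categorical variables,
    given as (proxy, sensitive) pairs. Higher values mean the proxy
    leaks more information about the sensitive attribute."""
    n = len(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    pxy = Counter(pairs)
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical toy data: neighbourhood strongly predicts ethnicity group.
pairs = [("north", "A")] * 45 + [("north", "B")] * 5 + \
        [("south", "B")] * 45 + [("south", "A")] * 5
print(round(mutual_information(pairs), 2))  # ≈ 0.53 bits out of a possible 1.0
```

A value near zero suggests the candidate attribute carries little information about the sensitive one; a value close to the attribute's entropy suggests it is an effective proxy and should be treated with the same care as the sensitive attribute itself.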
Social bias can be detected technically – by analysing the distributions of social factors and checking for over- or under-representation. However, technical limitations may prevent bias stemming from latent factors from being detected.
Therefore, it is important to be aware of these technical limitations and to improve the social and human factors that can aid bias detection. For example, to strengthen your team’s ability to detect and remove social biases, it is recommended to build diverse teams, both in terms of demographics and in terms of skill sets.
- Prevent Discriminatory Data Attributes Used As Model Features
- Assess and Manage Subgroup Bias
- Perform Risk Assessments
- Fairness MLSS 2020
- The Implicit Fairness Criterion of Unconstrained Learning
- Inherent Trade-Offs in the Fair Determination of Risk Scores
- Learning Fair Representations