When processing personal data (for example, the customer data in a banking application), it is important to prevent discriminatory attributes from being used as model features, because the resulting models may base their decisions on these attributes and ultimately discriminate against users.
A widely known cautionary example is the COMPAS case, in which discriminatory attributes were used to predict whether an offender was likely to recidivate.
An intuitive step for preventing discriminatory attributes from being used as features is to remove them from the training data. However, removing sensitive attributes (or not including them in the first place) does not by itself make machine learning fair, and can even exacerbate fairness issues if done improperly.
Always be aware that there may be latent sensitive attributes. For instance, in some cases a combination of features that are not considered discriminatory by themselves can be used by a machine learning algorithm to reconstruct a discriminatory attribute. Ultimately, this would have the same effect as using discriminatory attributes directly.
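One way to probe for such latent attributes is to check how well the sensitive attribute can be guessed from the remaining features. The following is a minimal sketch, not a definitive test: the function names, the toy records, and the column names `zip_code`, `occupation`, and `group` are all hypothetical, and the accuracy is measured on the same rows it is fitted on, so a real check would need a held-out split and a proper classifier.

```python
from collections import Counter, defaultdict


def baseline_accuracy(rows, sensitive_name):
    """Accuracy of always guessing the most common sensitive value."""
    counts = Counter(row[sensitive_name] for row in rows)
    return counts.most_common(1)[0][1] / len(rows)


def reconstruction_accuracy(rows, feature_names, sensitive_name):
    """Accuracy of guessing the sensitive value from a feature combination.

    For every distinct combination of feature values, the guess is the
    sensitive value seen most often with that combination (in-sample).
    """
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[name] for name in feature_names)
        groups[key][row[sensitive_name]] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return correct / len(rows)


# Hypothetical toy data: "group" stands in for a sensitive attribute that
# has been removed from the model's features but kept aside for auditing.
rows = [
    {"zip_code": z, "occupation": o, "group": g}
    for z, o, g in [
        ("A", "X", "g1"), ("A", "Y", "g2"),
        ("B", "X", "g2"), ("B", "Y", "g1"),
    ] * 2
]

baseline = baseline_accuracy(rows, "group")
zip_only = reconstruction_accuracy(rows, ["zip_code"], "group")
job_only = reconstruction_accuracy(rows, ["occupation"], "group")
combined = reconstruction_accuracy(rows, ["zip_code", "occupation"], "group")

print(baseline, zip_only, job_only, combined)  # 0.5 0.5 0.5 1.0
```

In this toy data neither feature alone does better than the baseline, yet the two together recover the sensitive attribute exactly, which is the situation described above. A combination whose accuracy is well above the baseline is a candidate proxy that deserves a closer look.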
Preventing the use of discriminatory attributes therefore requires a hybrid approach: removing the attributes from the training data, testing for latent factors that may uncover them, and continuously testing for other biases such as social or subgroup bias.
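The continuous testing step can be sketched as a check that runs on model outputs, with the sensitive attribute kept out of the features but retained for auditing. The example below assumes binary predictions and uses the gap in positive-prediction rates between groups (often called the demographic parity difference) as the measure. That metric is one choice among several and is an assumption of this sketch, as are the function names and the toy predictions.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Share of positive (1) predictions within each subgroup."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between subgroups."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical model outputs and the audit-only sensitive attribute.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(predictions, groups)
gap = demographic_parity_gap(predictions, groups)

print(rates)  # {'a': 0.75, 'b': 0.25}
print(gap)    # 0.5
```

A check like this could run on every retrained model and flag a gap above a threshold agreed for the application. The threshold itself is a policy decision and is deliberately left out of the sketch.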