Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation
Abstract: Machine learning systems deployed in high-stakes socioeconomic settings routinely display bias. We formalize bias as a symmetry-breaking operation: a classifier is fair if its outputs remain invariant under the counterfactual operation of switching a sensitive attribute while merit features are held fixed.
We implement loss-based regularization as a symmetry-restoring mechanism and evaluate the framework on four synthetic datasets with varying levels of noise, correlation, and bias. The framework reduces fairness violations by upwards of 90%, at an accuracy cost of around 5%.
The framework does not require causal graph knowledge, is computationally lightweight, and generalizes to any sensitive attribute definable as a bit-flip, making it suitable for contexts where local sources of discrimination remain absent from mainstream benchmarks.
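
To make the bit-flip invariance and the loss-based regularizer concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the logistic model, the squared logit-gap penalty, the weight `lam`, the helper names, and the synthetic data generator are all hypothetical choices made for this example.

```python
import numpy as np

def flip_sensitive(X, s_col):
    """Counterfactual copy of X with the binary sensitive column flipped (0 <-> 1)."""
    X_cf = X.copy()
    X_cf[:, s_col] = 1.0 - X_cf[:, s_col]
    return X_cf

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def violation_rate(w, b, X, s_col, threshold=0.5):
    """Fraction of rows whose predicted label changes when the sensitive bit is flipped."""
    pred = sigmoid(X @ w + b) >= threshold
    pred_cf = sigmoid(flip_sensitive(X, s_col) @ w + b) >= threshold
    return float(np.mean(pred != pred_cf))

def train(X, y, s_col, lam=0.0, lr=0.1, steps=2000):
    """Logistic regression by gradient descent with an assumed symmetry penalty:
    lam * mean((logit(x) - logit(x_flipped))**2)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    diff = X - flip_sensitive(X, s_col)  # nonzero only in the sensitive column
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / n       # cross-entropy gradient
        grad_b = np.mean(p - y)
        gap = diff @ w                   # logit change caused by the flip
        grad_w = grad_w + lam * 2.0 * (diff.T @ gap) / n
        w = w - lr * grad_w
        b = b - lr * grad_b
    return w, b

def make_biased_data(n=2000, seed=0):
    """Hypothetical synthetic data: one merit feature, one sensitive bit, biased labels."""
    rng = np.random.default_rng(seed)
    merit = rng.normal(size=n)
    s = rng.integers(0, 2, size=n).astype(float)
    noise = 0.3 * rng.normal(size=n)
    y = (merit + 1.0 * s + noise > 0.5).astype(float)  # labels favor s == 1
    return np.column_stack([merit, s]), y

if __name__ == "__main__":
    X, y = make_biased_data()
    for lam in (0.0, 2.0):
        w, b = train(X, y, s_col=1, lam=lam)
        acc = np.mean((sigmoid(X @ w + b) >= 0.5) == (y == 1.0))
        print(f"lam={lam}: violation rate={violation_rate(w, b, X, 1):.3f}, accuracy={acc:.3f}")
```

With `lam=0.0` the model is free to use the sensitive bit, so some predictions change under the flip. Raising `lam` shrinks the weight on that column and pushes the violation rate toward zero, typically at some cost in accuracy against the biased labels. The size of that trade-off depends on the data and the penalty weight.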