OSCS-SupCon: Orthogonal Sigmoid-based Common and Style Supervised Contrastive Learning for Robust Feature Disentanglement

Abstract: Supervised Contrastive Learning (SupCon) has achieved strong performance by explicitly modeling pairwise relationships among samples. However, existing SupCon-based methods suffer from two key limitations: negative-sample dilution induced by the standard InfoNCE loss, and feature-space entanglement caused by the lack of explicit constraints separating category-relevant (common) and category-irrelevant (style) features. These limitations reduce feature discriminability and generalization ability.

To address these issues, we propose OSCS-SupCon (Orthogonal Sigmoid-based Common and Style Supervised Contrastive Learning), a unified framework that combines a sigmoid-based pairwise contrastive objective with explicit orthogonality constraints. Specifically, we introduce a sigmoid-based contrastive loss with two learnable parameters, temperature and bias, which adaptively modulate pairwise decision boundaries and alleviate negative-sample dilution.
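As a rough illustration of how a sigmoid-based pairwise objective differs from a softmax-based one, the following is a minimal forward-pass sketch. It assumes a SigLIP-style formulation in which every pair is scored independently as a binary decision; the abstract does not give the exact loss, so the function name `sigmoid_supcon_loss`, the default values of `temperature` and `bias`, and the averaging over pairs are all assumptions rather than the authors' definition. In a real training setup the two scalars would be learnable parameters updated by the optimizer.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def normalize(v):
    # L2-normalize a vector (assumed to be non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sigmoid_supcon_loss(embeddings, labels, temperature=10.0, bias=-5.0):
    """Hypothetical sigmoid pairwise supervised contrastive loss.

    Each ordered pair (i, j), i != j, is treated as an independent binary
    problem: same-class pairs should get a positive logit, different-class
    pairs a negative one. No softmax normalization over negatives is used,
    so the per-pair term does not depend on how many negatives are present.
    Requires at least two samples.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            sim = sum(a * b for a, b in zip(z[i], z[j]))  # cosine similarity
            sign = 1.0 if labels[i] == labels[j] else -1.0
            total -= log_sigmoid(sign * (temperature * sim + bias))
            pairs += 1
    return total / pairs


# Two tight clusters: the loss is low when labels agree with the geometry
# and high when they contradict it.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matched = sigmoid_supcon_loss(points, [0, 0, 1, 1])
loss_mismatched = sigmoid_supcon_loss(points, [0, 1, 0, 1])
```

Because each pair contributes its own term, adding more negatives to a batch changes the number of terms but not the scale of any individual positive term, which is the intuition behind alleviating negative-sample dilution.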

Furthermore, we enforce orthogonality between common and style feature subspaces via a linear projection with ReLU nonlinearity, thereby reducing feature overlap and improving disentanglement of style-irrelevant representations. Extensive experiments on six benchmark datasets demonstrate that OSCS-SupCon consistently outperforms state-of-the-art supervised contrastive learning methods across multiple backbone architectures.
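The orthogonality constraint can be pictured with a small sketch. The abstract does not specify the projection heads or the penalty, so everything here is an assumption for illustration: two hypothetical weight matrices `w_common` and `w_style`, a ReLU applied after each linear map, and a penalty defined as the mean squared cosine similarity between the two projected vectors.

```python
import math


def relu_linear(x, weight):
    # Compute relu(W x), where `weight` is a list of rows.
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weight]


def cosine(u, v, eps=1e-12):
    # Cosine similarity with a small epsilon to avoid division by zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def orthogonality_penalty(features, w_common, w_style):
    """Hypothetical penalty: mean squared cosine between the common and
    style projections of each sample. It is zero when the two projections
    are orthogonal for every sample."""
    total = 0.0
    for x in features:
        common = relu_linear(x, w_common)
        style = relu_linear(x, w_style)
        total += cosine(common, style) ** 2
    return total / len(features)


features = [[1.0, 2.0]]
# Disjoint active coordinates: projections are orthogonal.
penalty_disjoint = orthogonality_penalty(
    features, [[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]
)
# Identical heads: projections coincide, so the penalty is maximal.
penalty_shared = orthogonality_penalty(
    features, [[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]
)
```

One consequence of the ReLU is worth noting: its outputs are non-negative, so two projected vectors can only be orthogonal if they are active on disjoint coordinates. Under this sketch, driving the penalty to zero therefore pushes the common and style heads toward using separate dimensions.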

In particular, on the fine-grained CUB200-2011 dataset with a ResNet-18 backbone, the proposed method achieves a 3.4% improvement in classification accuracy over CS-SupCon, highlighting its robustness and generalization capability. Ablation studies further confirm the effectiveness of each component.
