CFCamo: A Counterfactual Detect-or-Abstain Framework for Camouflaged Object Detection

Abstract: Vision-language reinforcement learning has recently shown strong target-present localization for camouflaged object detection (COD). Yet localization is only one side of the decision: when the agent faces an ordinary image with no camouflaged target, will it still claim that a camouflaged object exists? Standard COD training and evaluation data are positive-only, so agents optimized under this setting can acquire an over-detect bias, a task-specific form of object hallucination that standard COD evaluation leaves unmeasured.

To quantify this target-absent behavior, we construct Counterfactual COD (CF-COD), a paired benchmark that removes the camouflaged target from each held-out COD evaluation image while preserving a plausible background. CF-COD evaluates whether a model detects the target on the original image and abstains on the target-absent counterfactual, summarized by Pair Accuracy (PA).
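
The pair-level scoring described above can be illustrated with a minimal sketch. It assumes PA is the fraction of pairs in which the model both detects the target on the original image and abstains on the counterfactual; how a "detection" is judged (for example, a mask-overlap threshold) is not specified here, so the sketch takes per-image outcomes as booleans. The names `PairOutcome` and `pair_accuracy` are hypothetical and not taken from the released code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PairOutcome:
    """Outcome for one original/counterfactual pair (hypothetical structure)."""
    detected_on_original: bool         # model reported a target on the original image
    abstained_on_counterfactual: bool  # model reported no target on the target-removed image


def pair_accuracy(outcomes: Iterable[PairOutcome]) -> float:
    """Fraction of pairs that are correct on BOTH images (assumed reading of PA)."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("pair_accuracy needs at least one pair")
    # A pair counts only if detection and abstention both succeed.
    correct = sum(
        o.detected_on_original and o.abstained_on_counterfactual
        for o in outcomes
    )
    return correct / len(outcomes)


if __name__ == "__main__":
    mixed = [
        PairOutcome(True, True),
        PairOutcome(True, False),   # hallucinated a target on the counterfactual
        PairOutcome(False, True),   # missed the real target
        PairOutcome(True, True),
    ]
    always_detect = [PairOutcome(True, False)] * 3
    print(pair_accuracy(mixed))
    print(pair_accuracy(always_detect))
```

Under this reading, a model that always claims a target detects every original image yet scores zero PA, which is the over-detect behavior that target-present evaluation cannot expose.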

We further introduce CFCamo, a paired counterfactual framework for COD with abstention. For training, CFCamo optimizes a Qwen3-VL-4B-Instruct agent with Counterfactual Sequence Policy Optimization (CSPO), which samples paired original-counterfactual rollouts and uses a Counterfactual Paired Reward (CPR) to couple original-image detection with counterfactual abstention.
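
To make the idea of coupling concrete, the sketch below contrasts a target-present-only reward with a paired reward. This is an illustrative assumption, not the CPR formula, which the abstract does not give: here the paired reward is the original-image IoU multiplied by an abstention indicator, so neither detection nor abstention earns reward alone. The function names and the multiplicative form are hypothetical.

```python
def uncoupled_reward(iou_original: float) -> float:
    """Target-present-only reward: ignores the counterfactual image entirely."""
    _check_iou(iou_original)
    return iou_original


def paired_reward(iou_original: float, abstained_on_counterfactual: bool) -> float:
    """Hypothetical coupled reward: detection quality counts only if the
    policy also abstains on the target-absent counterfactual."""
    _check_iou(iou_original)
    return iou_original if abstained_on_counterfactual else 0.0


def _check_iou(iou: float) -> None:
    # IoU is a ratio of areas, so it must lie in [0, 1].
    if not 0.0 <= iou <= 1.0:
        raise ValueError(f"IoU must be in [0, 1], got {iou}")


if __name__ == "__main__":
    # An always-detect policy: good localization, but it hallucinates a
    # target on the counterfactual image.
    print(uncoupled_reward(0.8))        # rewarded despite hallucination
    print(paired_reward(0.8, False))    # no reward under coupling
    print(paired_reward(0.8, True))     # rewarded only when both behaviors hold
```

The design point is that an over-detecting policy maximizes the uncoupled reward but receives nothing under the coupled one, which is the kind of pressure a paired objective is meant to apply.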

On CAMO-test, CFCamo improves S_alpha by 3.7 pp over the prior RL-based COD baseline; across CF-COD, it reaches 80.0-90.8% PA. Ablations show that removing counterfactual coupling reduces PA to 1.4-5.2% despite strong target-present COD scores, so target-present evaluation alone does not characterize detect-or-abstain behavior. Overall, these results indicate that CFCamo improves COD agents by coupling target-present detection with target-absent abstention, rather than merely strengthening target-present localization. Code and data are available at this https URL.
