Explaining RhythmFormer: A Systematic XAI Analysis of Periodic Sparse Attention for Remote Photoplethysmography

Abstract: Remote photoplethysmography (rPPG) transformers achieve low heart-rate error on benchmarks, yet their decisions remain opaque, a growing concern as rPPG moves toward clinical heart-rate estimation. Existing explainable AI (XAI) work for rPPG is dominated by qualitative heatmap inspection, without quantitative faithfulness metrics or physiology-grounded validation, which leaves a gap between visual plausibility and auditable evidence. We address this gap with three contributions.

First, we adapt four attribution methods (raw attention, attention rollout, attention flow, and Beyond Intuition) to RhythmFormer’s bi-level routing attention with top-$k$ selection. Second, we introduce a skin coverage metric that quantifies the fraction of attribution mass falling on skin regions. Third, we adapt the SaCo faithfulness coefficient from its original classification setting to rPPG regression, using the mean absolute error (MAE) between the original and perturbed predicted rPPG waveforms as the perturbation impact.
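
The second and third contributions can be illustrated with a minimal sketch, written under stated assumptions rather than as the implementation used in this work. The function names, the `predict` callable, the `n_groups` parameter, and the per-frame mean fill value are all hypothetical. Skin coverage is taken here as the share of non-negative attribution mass inside a binary skin mask. The SaCo-style coefficient follows the usual pairwise scheme: pixels are split into salience-ordered groups, each group is perturbed, and the score rewards group pairs whose impact ordering agrees with their salience ordering. The perturbation impact is the MAE between the original and perturbed waveforms.

```python
import numpy as np

def skin_coverage(attribution, skin_mask):
    """Fraction of non-negative attribution mass lying on skin pixels."""
    a = np.clip(np.asarray(attribution, dtype=float), 0.0, None)
    total = a.sum()
    if total == 0.0:
        return 0.0  # no attribution mass at all
    return float(a[np.asarray(skin_mask, dtype=bool)].sum() / total)

def saco_regression(salience, predict, video, n_groups=10):
    """SaCo-style coefficient using waveform MAE as the perturbation impact.

    salience: (H, W) map; video: (T, H, W, C) array;
    predict: hypothetical callable mapping a video to a 1-D rPPG waveform.
    """
    salience = np.asarray(salience, dtype=float)
    order = np.argsort(salience.ravel())[::-1]   # most salient pixels first
    groups = np.array_split(order, n_groups)     # near-equal-size pixel subsets
    baseline = predict(video)
    frame_mean = video.mean(axis=(1, 2))         # (T, C); assumed fill value
    s, impact = [], []
    for idx in groups:
        mask = np.zeros(salience.size, dtype=bool)
        mask[idx] = True
        mask = mask.reshape(salience.shape)
        perturbed = video.astype(float)          # astype returns a copy
        perturbed[:, mask] = frame_mean[:, None, :]
        s.append(salience.ravel()[idx].sum())
        impact.append(np.mean(np.abs(predict(perturbed) - baseline)))
    score, total = 0.0, 0.0
    for i in range(n_groups):
        for j in range(i + 1, n_groups):
            diff = s[i] - s[j]                   # salience gap between groups
            # Reward pairs where the more salient group also has more impact.
            weight = diff if impact[i] >= impact[j] else -diff
            score += weight
            total += abs(weight)
    return score / total if total > 0 else 0.0
```

Under this sketch the coefficient lies in $[-1, 1]$: it equals $1$ when perturbation impact decreases monotonically with group salience and $-1$ when the ordering is fully reversed.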

Applying these tools, we quantify a multi-hop leakage effect under sparse top-$k$ routing: attention rollout and flow almost completely restore the connections that individual refined-attention layers explicitly set to zero. Beyond Intuition mitigates this via its value-projection-weighted rollout and gradient-supported mask, attaining the highest median refined skin coverage ($0.83$ vs. $0.57$ for vanilla rollout) and faithfulness ($F=0.92$) among the evaluated methods on UBFC-rPPG. Validation across more diverse datasets and model variants is still needed.
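
The leakage effect can be reproduced on a toy example. The sketch below is illustrative only: it uses random token-level attention with per-row top-$k$ masking, a simplification of RhythmFormer's region-level routing, and `leakage_fraction` is a hypothetical measure that need not match the one used in this work. Each layer zeroes all but $k$ entries per row, yet the rollout product of $(0.5A + 0.5I)$ across layers reconnects tokens through intermediate hops.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def topk_sparsify(attn, k):
    """Keep the k largest entries in each row, zero the rest, renormalise."""
    idx = np.argsort(attn, axis=-1)[:, -k:]
    sparse = np.zeros_like(attn)
    np.put_along_axis(sparse, idx, np.take_along_axis(attn, idx, axis=-1), axis=-1)
    return sparse / sparse.sum(axis=-1, keepdims=True)

def rollout(layers):
    """Attention rollout: multiply (0.5 * A + 0.5 * I) across layers."""
    n = layers[0].shape[0]
    joint = np.eye(n)
    for a in layers:
        joint = (0.5 * a + 0.5 * np.eye(n)) @ joint
    return joint

def leakage_fraction(layers, rolled, tol=1e-12):
    """Mean share of off-diagonal entries zeroed by a layer but non-zero after rollout."""
    off_diag = ~np.eye(rolled.shape[0], dtype=bool)
    shares = [(rolled[(a == 0) & off_diag] > tol).mean() for a in layers]
    return float(np.mean(shares))

# Toy setting (all values are assumptions): 16 tokens, top-4 routing, 6 layers.
rng = np.random.default_rng(0)
n_tokens, k, n_layers = 16, 4, 6
layers = [topk_sparsify(softmax(rng.normal(size=(n_tokens, n_tokens))), k)
          for _ in range(n_layers)]
rolled = rollout(layers)
leak = leakage_fraction(layers, rolled)
print(f"per-layer sparsity: {1 - k / n_tokens:.2f}, leakage after rollout: {leak:.2f}")
```

The diagonal is excluded because the identity term in rollout restores self-connections by construction, which is a residual effect rather than multi-hop leakage.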

A case study on a low-SaCo outlier further shows that all four methods recover once an artefactual region is replaced, suggesting that SaCo behaves consistently across attribution families in this illustrative case. Together, these metrics move XAI for rPPG toward auditable numerical evidence of spatial alignment and perturbation faithfulness, i.e., toward trustworthy rPPG XAI.
