minAction.net: Energy-First Neural Architecture Design -- From Biological Principles to Systematic Validation

Modern machine learning optimizes for accuracy without explicitly accounting for internal computational cost, even though physical and biological systems operate under intrinsic energy constraints. We evaluate energy-aware learning across 2,203 experiments spanning vision, text, neuromorphic, and physiological datasets, with 10 seeds per configuration and a factorial statistical analysis.

Three findings emerge. First, architecture alone explains negligible variance in accuracy (partial eta^2 = 0.001). In contrast, the architecture x dataset interaction is large (partial eta^2 = 0.44, p < 0.001), indicating that the optimal architecture depends critically on task modality and contradicting the assumption of a universal best architecture.
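To make the effect-size statistic concrete, the following minimal sketch computes partial eta^2 = SS_effect / (SS_effect + SS_error) for a balanced two-way design (architecture x dataset, several seeds per cell). The function name `partial_eta_squared`, the 2 x 2 layout, and the synthetic accuracies are assumptions chosen for illustration; they are not the study's data or analysis code.

```python
import numpy as np

def partial_eta_squared(y):
    """Partial eta^2 for a balanced two-way design.

    y has shape (n_arch, n_dataset, n_seeds): one accuracy per cell and seed.
    Returns partial eta^2 for architecture, dataset, and their interaction.
    """
    a, b, n = y.shape
    grand = y.mean()
    mean_a = y.mean(axis=(1, 2))   # architecture marginal means
    mean_b = y.mean(axis=(0, 2))   # dataset marginal means
    mean_ab = y.mean(axis=2)       # per-cell means

    # Sums of squares for main effects, interaction, and within-cell error.
    ss_a = b * n * np.sum((mean_a - grand) ** 2)
    ss_b = a * n * np.sum((mean_b - grand) ** 2)
    ss_ab = n * np.sum(
        (mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2
    )
    ss_err = np.sum((y - mean_ab[:, :, None]) ** 2)

    def eta(ss_effect):
        return ss_effect / (ss_effect + ss_err)

    return {
        "architecture": eta(ss_a),
        "dataset": eta(ss_b),
        "interaction": eta(ss_ab),
    }

# SYNTHETIC illustration only: a pure crossover pattern in which each
# hypothetical architecture wins on one dataset and loses on the other.
rng = np.random.default_rng(0)
cell_means = np.array([[0.9, 0.7], [0.7, 0.9]])
y = cell_means[:, :, None] + rng.normal(0.0, 0.02, size=(2, 2, 10))

effects = partial_eta_squared(y)
for name, value in effects.items():
    print(f"{name}: partial eta^2 = {value:.3f}")
```

In a crossover pattern like this one, the architecture marginal means nearly coincide, so the main effect is small while the interaction term carries most of the explainable variance. This is the qualitative shape of the first finding.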

Second, a controlled lambda-sweep over four orders of magnitude validates a single-parameter energy-regularized objective L = L_CE + lambda * E(theta, x): at moderate lambda, internal activation energy on MNIST decreases to 6% of baseline with no accuracy degradation.
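The objective L = L_CE + lambda * E(theta, x) can be sketched in a few lines of NumPy. The abstract does not define E, so this sketch assumes it is the mean squared hidden activation. The tiny MLP, the helper names (`forward`, `activation_energy`, `energy_regularized_loss`, `init_params`), the random inputs, and the lambda grid are likewise hypothetical. The code only evaluates the objective at fixed weights; it does not train a model or reproduce the reported sweep.

```python
import numpy as np

def forward(params, x):
    """Tiny one-hidden-layer MLP; returns logits and hidden activations."""
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, x @ W1 + b1)   # ReLU hidden activations
    logits = h @ W2 + b2
    return logits, h

def cross_entropy(logits, labels):
    """Mean cross-entropy L_CE for integer class labels."""
    z = logits - logits.max(axis=1, keepdims=True)   # stabilize the softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def activation_energy(h):
    """ASSUMPTION: E(theta, x) is taken as the mean squared hidden activation."""
    return np.mean(h ** 2)

def energy_regularized_loss(params, x, labels, lam):
    """L = L_CE + lam * E(theta, x). Returns (total, ce, energy)."""
    logits, h = forward(params, x)
    ce = cross_entropy(logits, labels)
    energy = activation_energy(h)
    return ce + lam * energy, ce, energy

def init_params(rng, n_in=8, n_hidden=16, n_out=3):
    """Random weights for the illustrative MLP (sizes are arbitrary)."""
    return (
        rng.normal(0.0, 0.5, (n_in, n_hidden)), np.zeros(n_hidden),
        rng.normal(0.0, 0.5, (n_hidden, n_out)), np.zeros(n_out),
    )

rng = np.random.default_rng(0)
params = init_params(rng)
x = rng.normal(size=(32, 8))            # random stand-in inputs, not MNIST
labels = rng.integers(0, 3, size=32)

# Evaluate the objective on a lambda grid spanning four orders of magnitude.
for lam in np.logspace(-4, 0, 5):
    total, ce, energy = energy_regularized_loss(params, x, labels, lam)
    print(f"lambda={lam:.0e}  L_CE={ce:.4f}  E={energy:.4f}  L={total:.4f}")
```

In a real sweep, each lambda value would get its own training run, and the quantities to compare against the lambda = 0 baseline would be the final activation energy and the test accuracy.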

Third, energy-first architectures inspired by an action-principle framework yield 5-33% within-modality training-efficiency gains over conventional baselines. These results emerge from a research program that interprets learning through a structural correspondence between the action functional in classical mechanics, free energy in statistical physics, and KL-regularized objectives in variational inference. We frame this correspondence as a design hypothesis rather than a derivation.
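For orientation, the three quantities named in this correspondence can be written side by side in their standard textbook forms. This is a sketch of the general shape of the analogy, not the authors' own formulation; the symbols (q, T, H, z, and the identification E(z) = -log p(x, z)) follow common conventions and are assumptions here.

```latex
\begin{aligned}
\text{Classical mechanics:}\quad
  & S[q] = \int_{t_0}^{t_1} L\bigl(q, \dot q, t\bigr)\, dt,
  \qquad \delta S = 0 \\[4pt]
\text{Statistical physics:}\quad
  & F[q] = \mathbb{E}_{q}\bigl[E(z)\bigr] - T\, H[q] \\[4pt]
\text{Variational inference:}\quad
  & -\mathrm{ELBO}(q)
    = \mathbb{E}_{q(z)}\bigl[-\log p(x \mid z)\bigr]
      + \mathrm{KL}\bigl(q(z)\,\|\,p(z)\bigr) \\[4pt]
\text{Energy-regularized learning:}\quad
  & L = L_{\mathrm{CE}} + \lambda\, E(\theta, x)
\end{aligned}
```

With T = 1 and E(z) = -log p(x, z), the variational free energy F[q] equals the negative ELBO exactly. Each line pairs a fit or cost term with a regularizing term and selects a solution by extremizing their combination. The energy-regularized loss has the same two-term shape, which is the sense in which the correspondence acts as a design hypothesis rather than a derivation.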