minAction.net: Energy-First Neural Architecture Design -- From Biological Principles to Systematic Validation

Modern machine learning optimizes for accuracy without explicitly accounting for internal computational cost, even though physical and biological systems operate under intrinsic energy constraints. We evaluate energy-aware learning across 2,203 experiments spanning vision, text, neuromorphic, and physiological datasets, with 10 seeds per configuration and a factorial statistical analysis.

Three findings emerge. First, architecture alone explains negligible variance in accuracy (partial eta^2 = 0.001), whereas the architecture x dataset interaction is large (partial eta^2 = 0.44, p < 0.001). This indicates that the optimal architecture depends critically on task modality and rejects the assumption of a universal best architecture.
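As a reminder of what the effect sizes above measure, partial eta^2 for an effect is SS_effect / (SS_effect + SS_error). The sketch below computes it for a balanced two-way layout (architecture x dataset, with seeds as replicates). The balanced two-factor design, the function name `partial_eta_squared`, and the synthetic accuracies are illustrative assumptions, not the analysis pipeline used in the study.

```python
import numpy as np

def partial_eta_squared(acc):
    """Partial eta^2 for a balanced two-way design.

    acc has shape (n_architectures, n_datasets, n_seeds).
    ASSUMPTION: every cell has the same number of seeds (balanced design).
    """
    a, b, n = acc.shape
    grand = acc.mean()
    mean_a = acc.mean(axis=(1, 2))   # per-architecture means
    mean_b = acc.mean(axis=(0, 2))   # per-dataset means
    mean_ab = acc.mean(axis=2)       # per-cell means

    # Sums of squares for main effects, interaction, and within-cell error.
    ss_a = b * n * np.sum((mean_a - grand) ** 2)
    ss_b = a * n * np.sum((mean_b - grand) ** 2)
    ss_ab = n * np.sum(
        (mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2
    )
    ss_err = np.sum((acc - mean_ab[:, :, None]) ** 2)

    return {
        "architecture": ss_a / (ss_a + ss_err),
        "dataset": ss_b / (ss_b + ss_err),
        "interaction": ss_ab / (ss_ab + ss_err),
    }

# Synthetic crossover pattern (hypothetical numbers): each architecture wins
# on one dataset, so the marginal means are equal and only the interaction
# carries signal.
rng = np.random.default_rng(0)
cell_means = np.array([[0.90, 0.70],
                       [0.70, 0.90]])
acc = cell_means[:, :, None] + rng.normal(scale=0.02, size=(2, 2, 10))

result = partial_eta_squared(acc)
for effect, value in result.items():
    print(f"{effect}: partial eta^2 = {value:.3f}")
```

In this toy crossover case the architecture main effect is small while the interaction dominates, which is the qualitative pattern the first finding describes.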

Second, a controlled lambda-sweep over four orders of magnitude validates a single-parameter energy-regularized objective, L = L_CE + lambda * E(theta, x): at moderate lambda, internal activation energy falls to 6% of baseline with no accuracy degradation on MNIST.
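A minimal sketch of the objective L = L_CE + lambda * E(theta, x) may make the single-parameter structure concrete. The abstract does not define E, so the code assumes, purely for illustration, that E is the mean squared hidden activation of a one-hidden-layer ReLU network. The network shape, the random stand-in batch, and the specific lambda grid are likewise hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy L_CE for integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def forward(params, x):
    """One-hidden-layer ReLU network; returns logits and hidden activations."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    return logits, hidden

def activation_energy(hidden):
    """ASSUMPTION: E(theta, x) is the mean squared hidden activation."""
    return np.mean(hidden ** 2)

def energy_regularized_loss(params, x, labels, lam):
    """Returns (L, L_CE, E) with L = L_CE + lam * E."""
    logits, hidden = forward(params, x)
    ce = softmax_cross_entropy(logits, labels)
    energy = activation_energy(hidden)
    return ce + lam * energy, ce, energy

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(scale=0.1, size=(784, 32)),
    "b1": np.zeros(32),
    "W2": rng.normal(scale=0.1, size=(32, 10)),
    "b2": np.zeros(10),
}
x = rng.normal(size=(16, 784))          # stand-in for flattened 28x28 images
labels = rng.integers(0, 10, size=16)   # stand-in for digit labels

# Hypothetical lambda grid spanning four orders of magnitude.
for lam in [1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
    total, ce, energy = energy_regularized_loss(params, x, labels, lam)
    print(f"lambda={lam:g}  L_CE={ce:.4f}  E={energy:.4f}  L={total:.4f}")
```

The loop only evaluates the loss at fixed parameters. In an actual sweep each lambda would get its own training run, and accuracy and E would be compared against the lambda = 0 baseline.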

Third, energy-first architectures inspired by an action-principle framework yield 5-33% within-modality training-efficiency gains over conventional baselines. These results emerge from a research program that interprets learning through a structural correspondence among the action functional in classical mechanics, free energy in statistical physics, and KL-regularized objectives in variational inference. We frame this correspondence as a design hypothesis rather than a derivation.
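For orientation, the three objects in this correspondence can be written side by side in their standard textbook forms. The notation below is an assumption chosen for illustration and is not taken from the paper. The last line restates the energy-regularized objective for comparison.

```latex
\begin{align}
  % Classical mechanics: the physical path makes the action stationary.
  S[q] &= \int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt, \qquad \delta S = 0 \\
  % Statistical physics: Helmholtz free energy, with S_th the thermodynamic entropy.
  F &= \langle E \rangle - T\, S_{\mathrm{th}} \\
  % Variational inference: negative ELBO as a KL-regularized objective.
  \mathcal{L}_{\mathrm{VI}}(\phi) &= \mathbb{E}_{q_\phi(z \mid x)}\!\left[-\log p(x \mid z)\right]
      + \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right) \\
  % Energy-regularized objective from the lambda-sweep.
  \mathcal{L} &= \mathcal{L}_{\mathrm{CE}} + \lambda\, E(\theta, x)
\end{align}
```

Each expression is a scalar functional that trades a primary cost against a second term, which is the structural similarity the design hypothesis builds on. Nothing here derives one expression from another.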