Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference

Abstract: The increasing deployment of deep neural networks (DNNs) in cyber-physical systems (CPS) enhances perception fidelity but imposes substantial computational demands on execution platforms, which makes real-time control deadlines harder to meet.

Traditional distributed CPS architectures typically favor on-device inference to avoid network variability and contention-induced delays on remote platforms. However, this design choice places significant energy and computational demands on the local hardware.

In this work, we revisit the assumption that cloud-based inference is intrinsically unsuitable for latency-sensitive control tasks. We demonstrate that, when provisioned with high-throughput compute resources, cloud platforms can effectively amortize network and queueing delays, enabling them to match or surpass on-device performance for real-time decision-making.

Specifically, we develop a formal analytical model that characterizes distributed inference latency as a function of the sensing frequency, platform throughput, network delay, and task-specific safety constraints. We instantiate this model in the context of emergency braking for autonomous driving and validate it through extensive simulations using real-time vehicular dynamics.
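
As a rough illustration of how these four quantities can interact (a minimal sketch, not the model developed in this work), the Python below assumes that worst-case latency is the sum of one sensing period, the network round-trip delay, a queueing delay, and one inference time, and that the vehicle brakes at constant deceleration. The `Platform` class, the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical description of an inference location."""
    name: str
    throughput_fps: float        # inferences per second the platform sustains
    network_rtt_s: float = 0.0   # round-trip network delay (0 for on-device)
    queueing_s: float = 0.0      # assumed queueing delay at the platform


def worst_case_latency(p: Platform, sensing_hz: float) -> float:
    """Assumed worst case: the obstacle appears just after a frame is
    captured, so one full sensing period passes before it is sensed."""
    sensing_wait = 1.0 / sensing_hz
    inference = 1.0 / p.throughput_fps
    return sensing_wait + p.network_rtt_s + p.queueing_s + inference


def stopping_distance(speed_mps: float, latency_s: float,
                      decel_mps2: float) -> float:
    """Distance travelled during the latency plus the braking distance
    under constant deceleration."""
    return speed_mps * latency_s + speed_mps ** 2 / (2.0 * decel_mps2)


def is_safe(p: Platform, sensing_hz: float, speed_mps: float,
            decel_mps2: float, obstacle_m: float) -> bool:
    """Safety constraint: the vehicle must stop before the obstacle."""
    latency = worst_case_latency(p, sensing_hz)
    return stopping_distance(speed_mps, latency, decel_mps2) <= obstacle_m


# Illustrative values only: a slow local accelerator versus a fast
# cloud accelerator reached over a 50 ms round trip.
on_device = Platform("on-device", throughput_fps=5.0)
cloud = Platform("cloud", throughput_fps=100.0, network_rtt_s=0.05)

for platform in (on_device, cloud):
    safe = is_safe(platform, sensing_hz=20.0, speed_mps=20.0,
                   decel_mps2=8.0, obstacle_m=28.0)
    print(platform.name, "meets the stopping constraint:", safe)
```

With these assumed numbers the cloud platform's shorter inference time more than offsets its network delay, so it satisfies the stopping constraint while the on-device platform does not. Different throughput or delay values can reverse the outcome.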

Our empirical results identify concrete conditions under which cloud-based inference adheres to safety margins more reliably than its on-device counterpart. These findings challenge prevailing design strategies and suggest that the cloud is not merely a feasible option, but often the preferred inference location for distributed CPS architectures. In this light, the cloud is not as distant as traditionally perceived; in fact, it is closer than it appears.