Agentic AI for Remote Sensing: Technical Challenges and Research Directions

Earth Observation (EO) is moving beyond static prediction toward multi-step analytical workflows that require coordinated reasoning over data, tools, and geospatial state. Foundation models and vision-language models have expanded representation learning and language-grounded interaction for remote sensing, and agentic AI has demonstrated long-horizon reasoning and external tool use. Even so, building agents for EO is not a straightforward extension of generic agentic AI.

EO workflows operate over georeferenced, multi-modal, and temporally structured data, where operations such as reprojection, resampling, compositing, and aggregation actively transform the underlying state and can constrain subsequent analysis. As a result, errors may propagate silently across steps, and correctness depends not only on internal coherence, but also on geospatial consistency, temporally valid comparisons, and physical validity.
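
As a small illustration of how an aggregation step can silently change what later steps compute, the sketch below compares two orderings of the same operations on a 2x2 block of pixels: computing NDVI per pixel and then averaging, versus averaging the bands and then computing NDVI. The reflectance values are made up for illustration and the helper names are hypothetical; nothing here comes from the paper itself.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel."""
    return (nir - red) / (nir + red)


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


# Hypothetical reflectances for a 2x2 block: two vegetated pixels, two bare ones.
nir = [0.50, 0.50, 0.10, 0.10]
red = [0.10, 0.10, 0.08, 0.08]

# Order A: index first, then aggregate to the coarse pixel.
ndvi_then_mean = mean([ndvi(n, r) for n, r in zip(nir, red)])

# Order B: aggregate the bands to the coarse pixel, then compute the index.
mean_then_ndvi = ndvi(mean(nir), mean(red))

print(f"NDVI then mean: {ndvi_then_mean:.3f}")
print(f"mean then NDVI: {mean_then_ndvi:.3f}")
```

Both orderings run without error and return plausible NDVI values, yet they differ by roughly 0.15 on this block. An agent that does not track which ordering a tool applied has no signal that a downstream threshold or comparison is now operating on a different quantity.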

This position paper argues that these challenges are structural rather than incidental. We identify the implicit assumptions commonly made in generic agentic models, analyze how they break in geospatial workflows, and characterize the resulting failure modes in multi-step EO pipelines. We then outline design principles for EO-native agents centered on structured geospatial state, tool-aware reasoning, verifier-guided execution, and learning objectives aligned with geospatial and physical validity.
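
To make structured geospatial state and verifier-guided execution more concrete, here is a minimal sketch under stated assumptions. The `RasterState` record, the `reproject` stand-in, the `verify_comparable` check, and the 30-day seasonal tolerance are all hypothetical names and values chosen for illustration; the paper does not prescribe this design.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RasterState:
    """Hypothetical record of the geospatial state an agent tracks per layer."""
    name: str
    crs: str              # e.g. "EPSG:32633"
    resolution_m: float   # nominal pixel size in metres
    acquired: date
    history: tuple = ()   # operations applied so far, in order


def reproject(state: RasterState, target_crs: str) -> RasterState:
    """Stand-in for a reprojection tool: only the tracked metadata is updated."""
    return RasterState(state.name, target_crs, state.resolution_m,
                       state.acquired, state.history + (f"reproject->{target_crs}",))


def verify_comparable(a: RasterState, b: RasterState,
                      max_season_gap_days: int = 30) -> list[str]:
    """Return the reasons two layers should not be compared pixel by pixel."""
    violations = []
    if a.crs != b.crs:
        violations.append(f"CRS mismatch: {a.crs} vs {b.crs}")
    if a.resolution_m != b.resolution_m:
        violations.append(f"resolution mismatch: {a.resolution_m} m vs {b.resolution_m} m")
    # Seasonal gap as a day-of-year difference, wrapping around the year end.
    gap = abs(a.acquired.timetuple().tm_yday - b.acquired.timetuple().tm_yday)
    gap = min(gap, 365 - gap)
    if gap > max_season_gap_days:
        violations.append(f"seasonal gap of {gap} days exceeds {max_season_gap_days}")
    return violations


summer = RasterState("scene_a", "EPSG:32633", 10.0, date(2020, 7, 1))
winter = RasterState("scene_b", "EPSG:4326", 30.0, date(2023, 1, 15))

before = verify_comparable(summer, winter)
winter_utm = reproject(winter, "EPSG:32633")
after = verify_comparable(summer, winter_utm)

print(before)  # three violations: CRS, resolution, season
print(after)   # reprojection removed only the CRS violation
```

The point of the sketch is that the verifier reads explicit state rather than the agent's own narrative of what it did: after reprojection, the resolution and seasonal violations remain visible and can block the comparison step instead of propagating into the result.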

Finally, we present research directions spanning EO-specific benchmarks, hybrid supervised and reinforcement learning, constrained self-improvement, and trajectory-level evaluation beyond final-answer accuracy. Building reliable geospatial agents therefore requires rethinking agent design around the physical, geospatial, and workflow constraints that govern EO analysis.
