Lost in the Tail: Addressing Geographic Imbalance in Urban Visual Place Recognition
Urban-scale Visual Place Recognition (VPR) aims to identify the geographic location of a query image by matching it against a geo-tagged database. While recent methods achieve impressive performance, they overlook a severe long-tailed distribution hidden in urban-scale datasets: models trained on such data systematically favor frequently photographed locations while failing in sparsely covered, less-visited areas.
In this paper, we systematically characterize this imbalance challenge and propose Distribution-Aware Place Recognition (DAPR), a model-agnostic plug-in framework that rebalances gradient contributions across head and tail classes. Additionally, within classification-retrieval pipelines, DAPR applies a multi-scale distance search mechanism to compute per-class distributional compactness, providing complementary gains at the retrieval stage.
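The abstract does not spell out how DAPR rebalances gradients or measures compactness, so the following is only a minimal sketch of the two general ideas, not the authors' method. It substitutes a well-known technique, effective-number class weighting, for the rebalancing step, and uses the mean distance to the class centroid as a stand-in for per-class compactness. The function names, the `beta` parameter, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def class_weights(labels, beta=0.999):
    """Hypothetical helper: effective-number class weights.

    Rare (tail) classes get larger loss weights than frequent (head)
    classes. Weights are normalized so their mean over classes is 1.
    This is a stand-in technique, not DAPR's actual rebalancing rule.
    """
    counts = Counter(labels)
    raw = {c: (1.0 - beta) / (1.0 - beta ** n) for c, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}


def class_compactness(descriptors, labels):
    """Hypothetical helper: mean Euclidean distance to the class centroid.

    Smaller values mean the descriptors of a class are packed more tightly.
    """
    groups = {}
    for vec, c in zip(descriptors, labels):
        groups.setdefault(c, []).append(vec)
    result = {}
    for c, vecs in groups.items():
        dim = len(vecs[0])
        centroid = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        result[c] = sum(math.dist(v, centroid) for v in vecs) / len(vecs)
    return result


# Toy data: one "head" place with six images, one "tail" place with two.
labels = ["head"] * 6 + ["tail"] * 2
descriptors = [(0.0, 0.0), (2.0, 0.0)] * 3 + [(5.0, 5.0), (5.0, 5.0)]

weights = class_weights(labels, beta=0.9)
compactness = class_compactness(descriptors, labels)
```

In a training loop, such weights would typically multiply each sample's classification loss, which scales the gradient contribution of tail classes up relative to head classes. A compactness score of this kind could, under the same assumptions, inform how wide a search radius to use per class at retrieval time.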
On the large-scale SF-XL benchmark, our framework outperforms the previous classification-retrieval baseline by 18.3% on test set v1 and 6.7% on test set v2. As a plug-in module, it yields consistent improvements when added to representative VPR methods on SF-XL, MSLS, and Pitts30k, indicating that it generalizes across methods and benchmarks.