Abstract
Visual localization systems on edge devices rely on the reliable matching of image keypoints across different viewpoints to compute accurate relative poses. This task places three critical demands on feature descriptors: they must be highly discriminative to avoid mismatches, compact enough for minimal memory usage, and fast to compute to support real-time operation on resource-constrained platforms. In this article, we introduce a lightweight, multiscale feature-pyramid network that simultaneously detects keypoints and generates binary descriptors with minimal computational overhead. Our compact encoder–decoder backbone fuses feature maps at multiple resolutions, enabling the detection of distinctive points of interest and the production of corresponding binary embeddings in a single forward pass. A hybrid training scheme combines a dual-softmax loss with a binary Procrustes loss, integrated within a metric-learning and teacher–student distillation framework to ensure binary descriptor distinctiveness. Extensive experiments on public localization benchmarks demonstrate that our method outperforms existing binary-descriptor approaches in matching accuracy and robustness, while reducing overall model size, descriptor-bandwidth requirements, and inference latency on edge devices. The result is an efficient, high-performance solution well suited to visual localization applications in real-world, resource-constrained environments.
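To illustrate why binary descriptors suit the memory and latency constraints described above, the following minimal sketch packs a descriptor into an integer and matches two descriptor sets by Hamming distance. It is not the method of the article: the sign-threshold binarization, the mutual-nearest-neighbour rule, and the function names `binarize`, `hamming`, and `mutual_nearest_matches` are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def binarize(values: List[float]) -> int:
    """Pack the signs of a real-valued descriptor into one integer.

    Assumption: bit i is set when values[i] > 0 (simple sign thresholding).
    """
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed descriptors."""
    return bin(a ^ b).count("1")


def mutual_nearest_matches(desc_a: List[int], desc_b: List[int]) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where desc_a[i] and desc_b[j] are each other's nearest neighbour."""
    if not desc_a or not desc_b:
        return []
    # Nearest descriptor in B for every descriptor in A, and vice versa.
    nearest_ab = [min(range(len(desc_b)), key=lambda j: hamming(a, desc_b[j])) for a in desc_a]
    nearest_ba = [min(range(len(desc_a)), key=lambda i: hamming(desc_a[i], b)) for b in desc_b]
    # Keep only the pairs that agree in both directions.
    return [(i, j) for i, j in enumerate(nearest_ab) if nearest_ba[j] == i]


# Toy 4-bit descriptors for two images (hypothetical values).
image_a = [binarize([0.9, -0.2, 0.4, -0.7]), binarize([-0.5, 0.3, -0.1, 0.8])]
image_b = [binarize([-0.4, 0.6, -0.3, 0.2]), binarize([0.7, -0.1, 0.5, 0.1])]
matches = mutual_nearest_matches(image_a, image_b)  # [(0, 1), (1, 0)]
```

Each comparison reduces to an XOR followed by a bit count, and a descriptor of N bits occupies N/8 bytes, which is the usual argument for binary descriptors on resource-constrained hardware.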
| Original language | English |
|---|---|
| Journal | IEEE/ASME Transactions on Mechatronics |
| DOIs | |
| State | Accepted/In press - 2026 |
| Externally published | Yes |
Keywords
- Binary descriptor
- edge devices
- lightweight network
- local feature
- relative pose estimation
- visual localization
Fingerprint
Dive into the research topics of 'Learning Efficient Binary Local Feature for Real-Time Visual Localization on Edge Devices'. Together they form a unique fingerprint.