Skip to main navigation Skip to search Skip to main content

LiteDepth: Digging into Fast and Accurate Depth Estimation on Mobile Devices

  • Harbin Institute of Technology
  • University of Science and Technology of China

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Monocular depth estimation is an essential task in the computer vision community. While tremendous successful methods have obtained excellent results, most of them are computationally expensive and not applicable for real-time on-device inference. In this paper, we aim to address more practical applications of monocular depth estimation, where the solution should consider not only the precision but also the inference time on mobile devices. To this end, we first develop an end-to-end learning-based model with a tiny weight size (1.4MB) and a short inference time (27FPS on Raspberry Pi 4). Then, we propose a simple yet effective data augmentation strategy, called R 2 crop, to boost the model performance. Moreover, we observe that the simple lightweight model trained with only one single loss term will suffer from performance bottleneck. To alleviate this issue, we adopt multiple loss terms to provide sufficient constraints during the training stage. Furthermore, with a simple dynamic re-weight strategy, we can avoid the time-consuming hyper-parameter choice of loss terms. Finally, we adopt the structure-aware distillation to further improve the model performance. Notably, our solution named LiteDepth ranks 2 nd in the MAI &AIM2022 Monocular Depth Estimation Challenge, with a si-RMSE of 0.311, an RMSE of 3.79, and the inference time is 37ms tested on the Raspberry Pi 4. Notably, we provide the fastest solution to the challenge. Codes and models will be released at https://github.com/zhyever/LiteDepth.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2022 Workshops, Proceedings
EditorsLeonid Karlinsky, Tomer Michaeli, Ko Nishino
PublisherSpringer Science and Business Media Deutschland GmbH
Pages507-523
Number of pages17
ISBN (Print)9783031250620
DOIs
StatePublished - 2023
EventWorkshops held at the 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel
Duration: 23 Oct 202227 Oct 2022

Publication series

NameLecture Notes in Computer Science
Volume13802 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

ConferenceWorkshops held at the 17th European Conference on Computer Vision, ECCV 2022
Country/TerritoryIsrael
CityTel Aviv
Period23/10/2227/10/22

Keywords

  • Data augmentation
  • Lightweight network
  • Monocular depth estimation
  • Multiple loss

Fingerprint

Dive into the research topics of 'LiteDepth: Digging into Fast and Accurate Depth Estimation on Mobile Devices'. Together they form a unique fingerprint.

Cite this