
Monocular, One-stage, Regression of Multiple 3D People

  • Yu Sun
  • Qian Bao
  • Wu Liu*
  • Yili Fu*
  • Michael J. Black
  • Tao Mei
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • JD AI Research
  • Max Planck Institute for Intelligent Systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper focuses on the regression of multiple 3D people from a single RGB image. Existing approaches predominantly follow a multi-stage pipeline that first detects people in bounding boxes and then independently regresses their 3D body meshes. In contrast, we propose to Regress all meshes in a One-stage fashion for Multiple 3D People (termed ROMP). The approach is conceptually simple, bounding box-free, and able to learn a per-pixel representation in an end-to-end manner. Our method simultaneously predicts a Body Center heatmap and a Mesh Parameter map, which can jointly describe the 3D body mesh on the pixel level. Through a body-center-guided sampling process, the body mesh parameters of all people in the image are easily extracted from the Mesh Parameter map. Equipped with such a fine-grained representation, our one-stage framework is free of the complex multi-stage process and more robust to occlusion. Compared with state-of-the-art methods, ROMP achieves superior performance on the challenging multi-person benchmarks, including 3DPW and CMU Panoptic. Experiments on crowded/occluded datasets demonstrate the robustness under various types of occlusion. The code, released at https://github.com/Arthur151/ROMP, is the first real-time implementation of monocular multi-person 3D mesh regression.
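
The body-center-guided sampling step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the released ROMP code: the function name `sample_params_at_centers`, the 3x3 local-maximum test, the 0.5 confidence threshold, and the array layouts (`(H, W)` for the Body Center heatmap, `(C, H, W)` for the Mesh Parameter map) are all assumptions made for illustration. It only shows the general idea of reading one parameter vector per detected body center.

```python
import numpy as np


def sample_params_at_centers(center_heatmap, param_map, threshold=0.5):
    """Hypothetical helper: pick heatmap peaks and read the parameter map there.

    center_heatmap: (H, W) array of body-center confidences (assumed layout).
    param_map:      (C, H, W) array, one C-dim parameter vector per pixel.
    Returns (centers, params) with shapes (N, 2) as (row, col) and (N, C).
    """
    center_heatmap = np.asarray(center_heatmap, dtype=float)
    h, w = center_heatmap.shape

    # Pad with -inf so border pixels are compared only with real neighbours.
    padded = np.pad(center_heatmap, 1, mode="constant",
                    constant_values=-np.inf)

    # Stack the nine shifted views that make up each pixel's 3x3 window.
    windows = np.stack([
        padded[dy:dy + h, dx:dx + w]
        for dy in range(3) for dx in range(3)
    ])
    local_max = windows.max(axis=0)

    # A pixel is a center if it equals its window maximum and is confident
    # enough. Ties on a flat plateau all count as peaks in this sketch.
    is_peak = (center_heatmap >= local_max) & (center_heatmap > threshold)
    ys, xs = np.nonzero(is_peak)

    centers = np.stack([ys, xs], axis=1)   # (N, 2)
    params = param_map[:, ys, xs].T        # (N, C)
    return centers, params


# Toy input: two synthetic "people" on an 8x8 map with 4 parameters per pixel.
heatmap = np.zeros((8, 8))
heatmap[2, 3] = 0.9
heatmap[6, 5] = 0.8
param_map = np.arange(4 * 8 * 8, dtype=float).reshape(4, 8, 8)

centers, params = sample_params_at_centers(heatmap, param_map)
print(centers.shape, params.shape)
```

Because every pixel already carries a full parameter vector, extracting all people reduces to an indexing operation at the detected centers, with no per-person cropping or second network pass. That is the sense in which such a pipeline is bounding box-free.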

Original language: English
Title of host publication: Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 11159-11168
Number of pages: 10
ISBN (Electronic): 9781665428125
DOIs
State: Published - 2021
Event: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 - Virtual, Online, Canada
Duration: 11 Oct 2021 to 17 Oct 2021

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print): 1550-5499

Conference

Conference: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
Country/Territory: Canada
City: Virtual, Online
Period: 11/10/21 to 17/10/21
