SFK: Shape- and Function-Grounded Keypoint Representation for Sequential Manipulation

  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Sequential manipulation is the process by which robots perform multiple interdependent steps to accomplish composite tasks, demanding tight integration of perception, planning and execution. Existing methods incorporate explicit features such as category, semantics, 6D pose or affordance to enhance consistency, yet single-feature representations limit generalisation and adaptability across diverse task stages. To overcome these limitations, we propose a shape- and function-grounded keypoint (SFK) representation that provides interpretable, task-relevant object abstractions. The SFK representation preserves topological consistency, enabling the inference of functional semantics, 6D pose, spatial constraints and task-oriented grasp configurations. Building upon this, a keypoint-driven hierarchical framework is developed to unify perception, planning and execution, thereby enabling task-level action reasoning conditioned on keypoint representations and coherent multistep action execution. For evaluation, a keypoint dataset containing over 1600 RGB images of 29 household objects is constructed, and extensive experiments demonstrate task-relevant keypoint generation with reduced computational time, as well as higher task execution success rates and broader task coverage.

Original language: English
Journal: CAAI Transactions on Intelligence Technology
State: Accepted/In press - 2026

Keywords

  • hierarchical framework
  • keypoint representation
  • sequential manipulation
