Abstract
Sequential manipulation is the process by which robots perform multiple interdependent steps to accomplish composite tasks, demanding tight integration of perception, planning and execution. Existing methods incorporate explicit features such as category, semantics, 6D pose or affordance to enhance consistency, yet single-feature representations limit generalisation and adaptability across diverse task stages. To overcome these limitations, we propose a shape- and function-grounded keypoint (SFK) representation that provides interpretable, task-relevant object abstractions. The SFK representation preserves topological consistency, enabling the inference of functional semantics, 6D pose, spatial constraints and task-oriented grasp configurations. Building upon this, a keypoint-driven hierarchical framework is developed to unify perception, planning and execution, thereby enabling task-level action reasoning conditioned on keypoint representations and coherent multistep action execution. For evaluation, a keypoint dataset containing over 1600 RGB images of 29 household objects is constructed, and extensive experiments demonstrate task-relevant keypoint generation with reduced computational time, as well as higher task execution success rates and broader task coverage.
| Original language | English |
|---|---|
| Journal | CAAI Transactions on Intelligence Technology |
| State | Accepted/In press - 2026 |
Keywords
- hierarchical framework
- keypoint representation
- sequential manipulation
Title
SFK: Shape- and Function-Grounded Keypoint Representation for Sequential Manipulation