
A descriptive behavior intention inference framework using spatio-temporal semantic features for human–robot real-time interaction

  • Liangliang Wang
  • Guanglei Huo
  • Ruifeng Li
  • Peidong Liang*
  • *Corresponding author for this work
  • Wuhan University of Technology
  • Fujian(Quanzhou)-HIT Research Institute of Engineering and Technology

Research output: Contribution to journal, Article, peer-review

Abstract

Visual behavior intention inference is crucial for enabling escort robots to interact naturally with humans, but it is very challenging because successive actions in the assistive scenario show high inner-class similarity and low intra-class distinguishability. Reliable behavior intention inference depends not only on the current state of a behavior but also on semantic information in both the spatial and temporal domains. This paper presents a hierarchical segmentation–detection–recognition system that represents spatio-temporal semantic features to formulate descriptions of body parts, trajectories and deep relationships of sub-behaviors. Specifically, a dense trajectory matching scheme based on temporal sampling and the Binarized Normed Gradients (BING) algorithm is formulated to segment 3-dimensional (3D) behavior cubes. From these cubes, local trajectories are obtained by clustering dense trajectories according to distance similarity, and body parts are then detected by multi-kernel learning on the encoded local features. Moreover, a global three-stream context Convolutional Neural Network (CNN) is proposed for behavior classification, with a texture module designed using expansion, connection and 1D convolution operations. Scene information is also recognized efficiently through transfer learning. Finally, the semantic descriptors are modeled by two cascaded And-Or Graphs (AoGs) that constrain the spatial scenarios and temporal sequences. The unified approach is demonstrated on two public benchmarks containing long-term activities and on an escort robot in real-world applications.
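
As a rough illustration of the step in which local trajectories are obtained by clustering dense trajectories according to distance similarity, the following is a minimal sketch and not the paper's method. The abstract does not specify the distance measure or the clustering rule, so the mean point-wise Euclidean distance, the greedy threshold assignment, and the names `trajectory_distance`, `cluster_trajectories` and `threshold` are all assumptions made for illustration.

```python
from math import hypot


def trajectory_distance(a, b):
    """Mean point-wise Euclidean distance between two equal-length 2D
    trajectories (an assumed similarity measure, not the paper's)."""
    if len(a) != len(b):
        raise ValueError("trajectories must have the same number of points")
    total = sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def cluster_trajectories(trajectories, threshold):
    """Greedy threshold clustering (assumed rule): each trajectory joins the
    first cluster whose representative (its first member) lies within
    `threshold`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of trajectory indices."""
    clusters = []
    for i, traj in enumerate(trajectories):
        for members in clusters:
            representative = trajectories[members[0]]
            if trajectory_distance(representative, traj) <= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([i])
    return clusters


# Toy data: two nearly parallel tracks and one far-away track.
trajs = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 0.2), (1, 0.2), (2, 0.2)],
    [(10, 10), (11, 10), (12, 10)],
]
clusters = cluster_trajectories(trajs, threshold=1.0)
print(clusters)
```

In this toy setting the two nearby tracks fall into one group and the distant track forms its own, which is the kind of grouping a "local trajectory" would summarize. A real pipeline would likely use a more robust clustering rule and trajectories of varying length.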

Original language: English
Article number: 107488
Journal: Engineering Applications of Artificial Intelligence
Volume: 128
State: Published - Feb 2024

Keywords

  • Behavior intention inference
  • Human–robot visual interaction
  • Spatio-temporal semantic representation
