Image to video person re-identification by learning heterogeneous dictionary pair with feature projection matrix

  • Xiaoke Zhu
  • Xiao Yuan Jing*
  • Xinge You
  • Wangmeng Zuo
  • Shiguang Shan
  • Wei Shi Zheng
  • *Corresponding author for this work
  • Wuhan University
  • Henan University
  • Nanjing University of Posts and Telecommunications
  • Huazhong University of Science and Technology
  • School of Computer Science and Technology, Harbin Institute of Technology
  • Chinese Academy of Sciences
  • CAS - Institute of Computing Technology
  • Sun Yat-Sen University

Research output: Contribution to journal › Article › peer-review

Abstract

Person re-identification plays an important role in video surveillance and forensics applications. In many cases, person re-identification needs to be conducted between an image and a video clip, e.g., re-identifying a suspect from large quantities of pedestrian videos given a single image of the suspect. We call re-identification in this scenario image to video person re-identification (IVPR). In practice, image and video are usually represented with different features, and there usually exist large variations between frames within each video. These factors make matching between image and video a very challenging task. In this paper, we propose a joint feature projection matrix and heterogeneous dictionary pair learning (PHDL) approach for IVPR. Specifically, PHDL jointly learns an intra-video projection matrix and a pair of heterogeneous image and video dictionaries. With the learned projection matrix, the influence of within-video variations on matching can be reduced. With the learned dictionary pair, the heterogeneous image and video features can be transformed into coding coefficients of the same dimension, so that matching can be conducted using the coding coefficients. Furthermore, to ensure that the obtained coding coefficients have favorable discriminability, PHDL introduces a point-to-set coefficient discriminant term. To make better use of the complementary spatial-temporal and visual appearance information contained in pedestrian video data, we further propose a multi-view PHDL approach, which can fuse different video information effectively in the dictionary learning process. Experiments on four publicly available person sequence data sets demonstrate the effectiveness of the proposed approaches.
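
The matching stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the PHDL objective, so the dictionaries and projection matrix are taken as already learned, a ridge-regularized least-squares coder stands in for whatever coding scheme the paper uses, and the minimum over frames is one assumed choice of point-to-set distance. The names `ridge_code`, `match_image_to_videos`, `D_img`, `D_vid`, `W`, and `lam` are hypothetical.

```python
import numpy as np


def ridge_code(D, X, lam=0.1):
    """Code X over dictionary D with a ridge penalty (an assumed, simplified coder).

    Solves min_A ||X - D A||^2 + lam * ||A||^2 in closed form.
    D has shape (d, k); X has shape (d,) or (d, n).
    """
    k = D.shape[1]
    gram = D.T @ D + lam * np.eye(k)
    return np.linalg.solve(gram, D.T @ X)


def match_image_to_videos(x_img, videos, D_img, D_vid, W, lam=0.1):
    """Return the index of the closest video and all point-to-set distances.

    x_img  : image feature, shape (d_img,)
    videos : list of arrays, each of shape (d_vid, n_frames)
    D_img  : image dictionary, shape (d_img, k)
    D_vid  : video dictionary, shape (p, k)
    W      : intra-video projection matrix, shape (p, d_vid)
    """
    a_img = ridge_code(D_img, x_img, lam)              # (k,)
    dists = []
    for frames in videos:
        projected = W @ frames                         # project frames: (p, n_frames)
        coeffs = ridge_code(D_vid, projected, lam)     # same dimension k as a_img
        frame_dists = np.linalg.norm(coeffs - a_img[:, None], axis=0)
        dists.append(frame_dists.min())                # assumed point-to-set distance
    dists = np.array(dists)
    return int(np.argmin(dists)), dists


if __name__ == "__main__":
    # Random stand-ins for learned quantities; real use would load trained values.
    rng = np.random.default_rng(0)
    D_img = rng.standard_normal((8, 5))
    D_vid = rng.standard_normal((6, 5))
    W = rng.standard_normal((6, 12))
    image = rng.standard_normal(8)
    gallery = [rng.standard_normal((12, n)) for n in (4, 6, 3)]
    best, dists = match_image_to_videos(image, gallery, D_img, D_vid, W)
    print("closest video:", best, "distances:", dists)
```

The point of the sketch is only that the image feature and the video frames, which live in different feature spaces, end up as coefficient vectors of a common length `k` and can therefore be compared directly.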

Original language: English
Pages (from-to): 717-732
Number of pages: 16
Journal: IEEE Transactions on Information Forensics and Security
Volume: 13
Issue number: 3
DOIs
State: Published - Mar 2018
Externally published: Yes

Keywords

  • Feature projection matrix
  • Heterogeneous dictionary pair learning
  • Image to video person re-identification
  • Multi-view learning
  • Person re-identification
