Real-time human action recognition using CNN over temporal images for static video surveillance cameras

Cheng Bin Jin, Shengzhe Li, Trung Dung Do, Hakil Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

28 Scopus citations

Abstract

This paper proposes a real-time human action recognition approach for static video surveillance systems. The approach predicts human actions using temporal images and convolutional neural networks (CNNs). A CNN is a type of deep learning model that can automatically learn features from training videos. Although state-of-the-art methods have shown high accuracy, they consume a lot of computational resources. Another problem is that many methods assume exact knowledge of human positions. Moreover, most current methods build complex handcrafted features for specific classifiers. Such methods are therefore difficult to apply in real-world applications. In this paper, a novel CNN model based on temporal images and a hierarchical action structure is developed for real-time human action recognition. The hierarchical action structure includes three levels: action layer, motion layer, and posture layer. The top layer represents subtle actions; the bottom layer represents posture. Each layer contains one CNN, so the model has three CNNs working together; the layers are combined to represent many different kinds of action with a large degree of freedom. The developed approach was implemented and achieved superior performance on the ICVL action dataset; the algorithm runs at around 20 frames per second.
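
The way three per-layer predictions could be combined into one composite description can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names the three layers but not their classes, so the label sets below are hypothetical, and the per-layer score lists stand in for the outputs of the three CNNs. Taking the highest-scoring label in each layer independently is likewise an assumption about how the layers are combined.

```python
from typing import Sequence, Tuple

# Hypothetical label sets: the abstract names the layers, not their classes.
ACTION_LABELS = ("nothing", "waving", "phoning")
MOTION_LABELS = ("stationary", "walking", "running")
POSTURE_LABELS = ("standing", "sitting", "lying")


def argmax_label(scores: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label whose score is highest."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


def combine_layers(
    action_scores: Sequence[float],
    motion_scores: Sequence[float],
    posture_scores: Sequence[float],
) -> Tuple[str, str, str]:
    """Combine three per-layer score vectors into (action, motion, posture).

    Each score vector stands in for the output of one layer's CNN.
    Independent argmax per layer is an assumption made for illustration.
    """
    return (
        argmax_label(action_scores, ACTION_LABELS),
        argmax_label(motion_scores, MOTION_LABELS),
        argmax_label(posture_scores, POSTURE_LABELS),
    )


def composite_count() -> int:
    """Number of distinct composite descriptions the three layers can form."""
    return len(ACTION_LABELS) * len(MOTION_LABELS) * len(POSTURE_LABELS)
```

With three hypothetical classes per layer, the three small classifiers jointly cover 27 composite descriptions, which is one reading of the abstract's "large degree of freedom".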

Original language: English
Title of host publication: Advances in Multimedia Information Processing – PCM 2015 - 16th Pacific-Rim Conference on Multimedia, Proceedings
Editors: Yo-Sung Ho, Yong Man Ro, Junmo Kim, Fei Wu, Jitao Sang
Publisher: Springer Verlag
Pages: 330-339
Number of pages: 10
ISBN (Print): 9783319240770
DOIs
State: Published - 2015
Event: 16th Pacific-Rim Conference on Multimedia, PCM 2015 - Gwangju, Korea, Republic of
Duration: 16 Sep 2015 to 18 Sep 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9315
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th Pacific-Rim Conference on Multimedia, PCM 2015
Country/Territory: Korea, Republic of
City: Gwangju
Period: 16/09/15 to 18/09/15

Bibliographical note

Publisher Copyright:
© Springer International Publishing Switzerland 2015.

Keywords

  • Action recognition
  • Convolutional neural network
  • Hierarchical action structure
  • Temporal images
  • Video surveillance
