The volume of video surveillance data has grown rapidly in recent years, and these data must be processed to detect many kinds of activities. Most human activity recognition methods focus primarily on normal human activities; however, identifying and detecting suspicious activities is equally important. We propose a two-stream framework for detecting abnormal human activities in real-time videos. The first stream extracts spatial features, and the second extracts temporal features from the videos. The spatial features capture contextual details within a frame, including color, texture, and other robust features. The temporal features capture changes across frames over time, which play a vital role in activity detection. The proposed framework leverages the Xception model to extract features in both the spatial and temporal streams. The extracted features are integrated for further processing, and a multi-layer bidirectional long short-term memory network is trained on them to detect the activity. The framework is trained end-to-end to accurately estimate human activity by learning patterns and forming correlations. We evaluate the approach on the popular UCF-Crime dataset, which encompasses a variety of criminal activities, using recall, precision, and accuracy as performance measures. Experimental results indicate that the proposed approach outperforms other contemporary methods, achieving an accuracy of 86%.
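The two-stream pipeline described in the abstract (per-frame Xception features from a spatial stream and a temporal stream, fused and classified by a multi-layer bidirectional LSTM) could be sketched roughly as below. The clip length, layer widths, concatenation-based fusion, and class count are all assumptions for illustration; the abstract does not state the paper's exact hyperparameters, and `weights=None` is used only to keep the sketch offline.

```python
# Hedged sketch of a two-stream Xception + BiLSTM model; hyperparameters
# (clip length, LSTM widths, fusion by concatenation, 14 classes) are
# illustrative assumptions, not the paper's reported configuration.
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import Xception

NUM_FRAMES, H, W = 8, 299, 299  # assumed clip length; Xception's native input size


def stream(name):
    """One stream: Xception applied per frame as a feature extractor."""
    base = Xception(weights=None, include_top=False, pooling="avg",
                    input_shape=(H, W, 3))
    inp = layers.Input((NUM_FRAMES, H, W, 3), name=name)
    feats = layers.TimeDistributed(base)(inp)  # -> (batch, NUM_FRAMES, 2048)
    return inp, feats


# Spatial stream takes RGB frames; temporal stream takes motion input
# (e.g. optical-flow frames rendered as 3-channel images -- an assumption).
rgb_in, rgb_feats = stream("spatial_rgb")
flow_in, flow_feats = stream("temporal_flow")

# Integrate the two feature sequences, then a multi-layer bidirectional LSTM.
fused = layers.Concatenate()([rgb_feats, flow_feats])
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(fused)
x = layers.Bidirectional(layers.LSTM(64))(x)
out = layers.Dense(14, activation="softmax")(x)  # 13 UCF-Crime anomaly classes + normal (assumed)

model = Model([rgb_in, flow_in], out)
```

Because the whole graph is differentiable, compiling this model and fitting it on labeled clips trains both streams and the BiLSTM jointly, matching the end-to-end training the abstract describes.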
Keywords: Video, Education and training, Feature extraction, Optical flow, RGB color model, Neural networks, Action recognition