Hongbo Cui1, Tao Feng1, and Jinhui Zheng2
1Department of Physical Education, Harbin Finance University, Harbin, 150000, China
2Harbin Sport University, Harbin, 150000, China
Received: August 4, 2023 Accepted: September 29, 2024 Publication Date: October 26, 2024
Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.
To address the missed and false detections caused by inaccurate motion feature extraction in existing video key frame extraction algorithms, a key frame extraction and video stream classification algorithm based on reinforcement learning and feature fusion is proposed. Fusion features are obtained by combining the original statistical features through an S-GCN and a ResNet50 network. Because some fusion features are more effective than the original statistical features, the original and fusion features are combined into composite features so that the information useful for classification is retained. This combination increases the number of features and introduces redundant and irrelevant ones, so an embedded feature selection method with a random forest classifier is used to select the best feature subset. Finally, an attention mechanism is used to calculate the importance of each video frame, and reinforcement learning is used to extract and optimize the key frames. The experimental results show that the proposed algorithm addresses the detection errors that arise in key frame extraction from motion video and performs well in detecting video frames that contain key actions, with high accuracy and strong stability.
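The frame-importance step described in the abstract can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the form of the attention mechanism, so scaled dot-product attention against a single query vector is assumed, and a plain top-k selection stands in for the reinforcement learning stage. The function names `frame_importance` and `select_key_frames`, the query vector, and the toy feature values are all hypothetical.

```python
import math


def frame_importance(frame_features, query):
    """Return one softmax-normalized attention weight per frame.

    Assumption: scaled dot-product attention with a single query vector.
    The source abstract does not state which attention form is used.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(f_i * q_i for f_i, q_i in zip(features, query)) / scale
        for features in frame_features
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_key_frames(weights, k):
    """Return the indices of the k highest-weight frames in temporal order.

    Assumption: top-k selection is a simple stand-in for the
    reinforcement learning stage, which is not reproduced here.
    """
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


# Toy per-frame feature vectors (hypothetical values, four frames).
features = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1], [0.7, 0.9]]
query = [1.0, 1.0]

weights = frame_importance(features, query)
key_frames = select_key_frames(weights, k=2)
print(key_frames)  # prints [1, 3]
```

In this toy input, frames 1 and 3 have the largest dot product with the query, so they receive the highest weights and are selected. In a full pipeline the feature vectors would come from the composite features after selection, and the selection policy would be learned rather than fixed.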