International Science and Technology Journal
Open-access journal
ISSN: 2519-9854 (Online)
ISSN: 2519-9846 (Print)
A peer-reviewed scientific journal that publishes research and studies in the applied sciences, issued periodically under the supervision of a select group of professors
Human Action Detection Using a Hybrid Architecture of CNN and Transformer......
www.doi.org/10.62341/bsmh2119
Author(s): | - Bassma .A. Awad Abdlrazg
- Sumaia Masoud
- Mnal .M. Ali
|
Institution: | University of Omar Al-Mokhtar - Faculty of Science
Department of Mathematics |
Field: | General Sciences: Mathematics, Statistics, and Physics |
Published in: | Issue 34 - April 2024 |
Abstract
This work presents a hybrid sequence model that combines deep learning with a Vision Transformer (ViT) to classify and identify human motion actions. The deep learning stage extracts spatio-temporal features from every video: a CNN model processes the video into spatial feature maps and outputs them as a sequence of features. These sequences are then fed in temporal order into the ViT, which classifies the videos into 7 classes: Jump, Walk, Wave1, Wave2, Bend, Jack, and Powerful Jump. The model was trained and tested on the Weizmann dataset, and the results showed that it was able to identify the human actions accurately.
Keywords: Deep Learning, Vision Transformer, Human Motion Action Detection, Spatial features, CNN.
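
The pipeline outlined in the abstract (per-frame CNN features, arranged as a sequence, passed to a Transformer, then classified into 7 actions) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the weights are random and untrained, the attention layer uses identity query/key/value projections, the position signal is a simplified sinusoid, and all sizes (4 kernels, 8x8 frames, 5 frames) and function names are assumptions chosen only to keep the example short.

```python
import math
import random

# Class names taken from the abstract.
CLASSES = ["Jump", "Walk", "Wave1", "Wave2", "Bend", "Jack", "Powerful Jump"]

def conv_feature(frame, kernel):
    """Valid 2D cross-correlation, ReLU, then global average pooling -> one scalar."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(frame), len(frame[0])
    total, count = 0.0, 0
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            total += max(s, 0.0)  # ReLU
            count += 1
    return total / count

def frame_features(frame, kernels):
    """CNN stage (assumed, tiny): one pooled feature per kernel for a single frame."""
    return [conv_feature(frame, k) for k in kernels]

def add_positions(seq):
    """Add a simplified sinusoidal position signal so frame order matters."""
    d = len(seq[0])
    return [[x + math.sin(t / (10000 ** (c / d))) for c, x in enumerate(vec)]
            for t, vec in enumerate(seq)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def self_attention(seq):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out

def classify(video, kernels, head):
    """Frames -> feature sequence -> attention -> mean pooling -> class probabilities."""
    seq = [frame_features(f, kernels) for f in video]
    attended = self_attention(add_positions(seq))
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in head]
    return softmax(logits)

# Hypothetical toy input: 5 random 8x8 "frames", 4 random 3x3 kernels.
rng = random.Random(0)
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(4)]
head = [[rng.uniform(-1, 1) for _ in range(4)] for _ in CLASSES]
video = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]

probs = classify(video, kernels, head)
predicted = CLASSES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Because the weights are random, the predicted label carries no meaning; the sketch only shows how a video becomes a sequence of frame-level feature vectors that an attention layer then aggregates before classification. A practical version would use a trained CNN backbone and a full Transformer encoder from a deep learning library.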