Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/110979
DC Field | Value | Language
dc.contributor.author | Nguyen, Huu Phong | -
dc.contributor.author | Ribeiro, Bernardete M. | -
dc.date.accessioned | 2023-11-30T09:47:30Z | -
dc.date.available | 2023-11-30T09:47:30Z | -
dc.date.issued | 2023-09-05 | -
dc.identifier.issn | 2045-2322 | pt
dc.identifier.uri | https://hdl.handle.net/10316/110979 | -
dc.description.abstract | Recognizing human actions in video sequences, known as Human Action Recognition (HAR), is a challenging task in pattern recognition. While Convolutional Neural Networks (ConvNets) have shown remarkable success in image recognition, they are not always directly applicable to HAR, as temporal features are critical for accurate classification. In this paper, we propose a novel dynamic PSO-ConvNet model for learning actions in videos, building on our recent work in image recognition. Our approach leverages a framework where the weight vector of each neural network represents the position of a particle in phase space, and particles share their current weight vectors and gradient estimates of the loss function. To extend our approach to video, we integrate ConvNets with state-of-the-art temporal methods such as Transformer and Recurrent Neural Networks. Our experimental results on the UCF-101 dataset demonstrate substantial improvements of up to 9% in accuracy, which confirms the effectiveness of our proposed method. In addition, we conducted experiments on larger and more varied datasets, including Kinetics-400 and HMDB-51, and the results favored Collaborative Learning over Non-Collaborative Learning (Individual Learning). Overall, our dynamic PSO-ConvNet model provides a promising direction for improving HAR by better capturing the spatio-temporal dynamics of human actions in videos. The code is available at https://github.com/leonlha/Video-Action-Recognition-Collaborative-Learning-with-Dynamics-via-PSO-ConvNet-Transformer. | pt
dc.language.iso | eng | pt
dc.publisher | Springer Nature | pt
dc.relation | UIDB/00326/2020 | pt
dc.relation | UIDP/00326/2020 | pt
dc.rights | openAccess | pt
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | pt
dc.title | Video action recognition collaborative learning with dynamics via PSO-ConvNet Transformer | pt
dc.type | article | -
degois.publication.firstPage | 14624 | pt
degois.publication.issue | 1 | pt
degois.publication.title | Scientific Reports | pt
dc.peerreviewed | yes | pt
dc.identifier.doi | 10.1038/s41598-023-39744-9 | pt
degois.publication.volume | 13 | pt
dc.date.embargo | 2023-09-05 | *
uc.date.periodoEmbargo | 0 | pt
item.grantfulltext | open | -
item.cerifentitytype | Publications | -
item.languageiso639-1 | en | -
item.openairetype | article | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.fulltext | With full text | -
crisitem.author.researchunit | CISUC - Centre for Informatics and Systems of the University of Coimbra | -
crisitem.author.researchunit | CISUC - Centre for Informatics and Systems of the University of Coimbra | -
crisitem.author.parentresearchunit | Faculty of Sciences and Technology | -
crisitem.author.parentresearchunit | Faculty of Sciences and Technology | -
crisitem.author.orcid | 0000-0002-5022-0226 | -
crisitem.author.orcid | 0000-0002-9770-7672 | -
Appears in Collections: I&D CISUC - Artigos em Revistas Internacionais
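
The abstract describes each network's weight vector as the position of a particle, with particles sharing their current weights and gradient estimates. The following is a minimal sketch of that general idea only, not the authors' update rule: the toy quadratic loss, the function names (`loss`, `grad`, `collaborative_step`, `train`), and the coefficients `lr` and `pull` are all assumptions made for illustration. The authors' actual implementation is in the repository linked in the abstract.

```python
import random


def loss(w):
    # Toy quadratic loss standing in for a network's training loss (assumption).
    return sum((wi - 3.0) ** 2 for wi in w)


def grad(w):
    # Exact gradient of the toy loss; a real model would use a minibatch estimate.
    return [2.0 * (wi - 3.0) for wi in w]


def collaborative_step(particles, lr=0.1, pull=0.2):
    """One synchronous update in which particles share weights and gradients.

    Each particle (a weight vector) moves along the average of its own gradient
    and the mean gradient shared by all particles, and is also pulled toward
    the particle that currently has the lowest loss. Hypothetical rule.
    """
    n = len(particles)
    dim = len(particles[0])
    grads = [grad(w) for w in particles]
    mean_grad = [sum(g[d] for g in grads) / n for d in range(dim)]
    best = min(particles, key=loss)
    updated = []
    for w, g in zip(particles, grads):
        updated.append([
            w[d] - lr * 0.5 * (g[d] + mean_grad[d]) + pull * (best[d] - w[d])
            for d in range(dim)
        ])
    return updated


def train(n_particles=4, dim=3, steps=400, seed=0):
    # Random initial positions play the role of differently initialised networks.
    rng = random.Random(seed)
    particles = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    for _ in range(steps):
        particles = collaborative_step(particles)
    return particles


if __name__ == "__main__":
    final = train()
    print(max(loss(w) for w in final))
```

Setting `pull=0.0` and dropping the shared `mean_grad` term would reduce each particle to independent gradient descent, which corresponds loosely to the Non-Collaborative (Individual) Learning baseline mentioned in the abstract.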
This item is licensed under a Creative Commons License.