Video action recognition collaborative learning with dynamics via PSO-ConvNet Transformer

Nguyen, Huu Phong; Ribeiro, Bernardete M.

doi:10.1038/s41598-023-39744-9

Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/110979

DC Field	Value	Language
dc.contributor.author	Nguyen, Huu Phong	-
dc.contributor.author	Ribeiro, Bernardete M.	-
dc.date.accessioned	2023-11-30T09:47:30Z	-
dc.date.available	2023-11-30T09:47:30Z	-
dc.date.issued	2023-09-05	-
dc.identifier.issn	2045-2322	pt
dc.identifier.uri	https://hdl.handle.net/10316/110979	-
dc.description.abstract	Recognizing human actions in video sequences, known as Human Action Recognition (HAR), is a challenging task in pattern recognition. While Convolutional Neural Networks (ConvNets) have shown remarkable success in image recognition, they are not always directly applicable to HAR, as temporal features are critical for accurate classification. In this paper, we propose a novel dynamic PSO-ConvNet model for learning actions in videos, building on our recent work in image recognition. Our approach leverages a framework where the weight vector of each neural network represents the position of a particle in phase space, and particles share their current weight vectors and gradient estimates of the loss function. To extend our approach to video, we integrate ConvNets with state-of-the-art temporal methods such as Transformers and Recurrent Neural Networks. Our experimental results on the UCF-101 dataset demonstrate substantial improvements of up to 9% in accuracy, which confirms the effectiveness of our proposed method. In addition, we conducted experiments on larger and more varied datasets, including Kinetics-400 and HMDB-51, and the results favored Collaborative Learning over Non-Collaborative Learning (Individual Learning). Overall, our dynamic PSO-ConvNet model provides a promising direction for improving HAR by better capturing the spatio-temporal dynamics of human actions in videos. The code is available at https://github.com/leonlha/Video-Action-Recognition-Collaborative-Learning-with-Dynamics-via-PSO-ConvNet-Transformer .	pt
dc.language.iso	eng	pt
dc.publisher	Springer Nature	pt
dc.relation	UIDB/00326/2020	pt
dc.relation	UIDP/00326/2020	pt
dc.rights	openAccess	pt
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	pt
dc.title	Video action recognition collaborative learning with dynamics via PSO-ConvNet Transformer	pt
dc.type	article	-
degois.publication.firstPage	14624	pt
degois.publication.issue	1	pt
degois.publication.title	Scientific Reports	pt
dc.peerreviewed	yes	pt
dc.identifier.doi	10.1038/s41598-023-39744-9	pt
degois.publication.volume	13	pt
dc.date.embargo	2023-09-05	*
uc.date.periodoEmbargo	0	pt
item.grantfulltext	open	-
item.cerifentitytype	Publications	-
item.languageiso639-1	en	-
item.openairetype	article	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.fulltext	With full text	-
crisitem.author.researchunit	CISUC - Centre for Informatics and Systems of the University of Coimbra	-
crisitem.author.researchunit	CISUC - Centre for Informatics and Systems of the University of Coimbra	-
crisitem.author.parentresearchunit	Faculty of Sciences and Technology	-
crisitem.author.parentresearchunit	Faculty of Sciences and Technology	-
crisitem.author.orcid	0000-0002-5022-0226	-
crisitem.author.orcid	0000-0002-9770-7672	-
Appears in Collections:	I&D CISUC - Artigos em Revistas Internacionais
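
Illustrative sketch (not taken from the paper or its repository): the abstract describes each network's weight vector as the position of a particle, with particles sharing information while they train. The toy example below is one possible reading of that idea under strong assumptions. It uses a quadratic loss in place of a real network, shares only weight vectors (not gradient estimates), and pulls every particle toward the currently best one. The function names and the `lr` and `pull` parameters are hypothetical and do not reflect the authors' actual update rule.

```python
import numpy as np

def loss(w):
    # Toy quadratic loss standing in for a network's training loss (assumption).
    return 0.5 * float(np.dot(w, w))

def grad(w):
    # Gradient of the toy quadratic loss above.
    return w

def collaborative_step(positions, lr=0.1, pull=0.2):
    """One synchronous update for all particles (hypothetical rule).

    Each particle takes a gradient step on its own weights, then moves
    part of the way toward the particle with the lowest current loss.
    """
    losses = [loss(w) for w in positions]
    best = positions[int(np.argmin(losses))]
    new_positions = []
    for w in positions:
        stepped = w - lr * grad(w)             # individual learning
        stepped = stepped + pull * (best - w)  # information shared by peers
        new_positions.append(stepped)
    return new_positions

def run(num_particles=4, dim=3, steps=50, seed=0):
    # Random initial weight vectors, one per particle.
    rng = np.random.default_rng(seed)
    positions = [rng.normal(size=dim) for _ in range(num_particles)]
    start = min(loss(w) for w in positions)
    for _ in range(steps):
        positions = collaborative_step(positions)
    end = min(loss(w) for w in positions)
    return start, end

if __name__ == "__main__":
    start, end = run()
    print(f"best loss before: {start:.6f}, after: {end:.6f}")
```

Setting `pull=0.0` reduces the update to plain independent gradient descent, which gives a simple way to compare collaborative and individual learning on the toy problem.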

Files in This Item:

File	Description	Size	Format
Video-action-recognition-collaborative-learning-with-dynamics-via-PSOConvNet-TransformerScientific-Reports.pdf		2.61 MB	Adobe PDF

Page view(s)

60

checked on May 8, 2024

Download(s)

33

checked on May 8, 2024
