Comprehensive Action Quality Assessment Through Multi-Branch Modeling

  • Siyuan Xu
  • Peilin Chen
  • Yue Liu
  • Meng Wang
  • Shiqi Wang*
  • Hong Yan
  • Sam Kwong*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Action Quality Assessment (AQA) aims to evaluate and score human actions in videos accurately. Existing approaches typically extract features from the input video and regress a score from those features. However, representations derived from a single branch often lack the diversity and flexibility needed to capture the complexity of human actions. This work addresses these limitations by introducing a multi-branch architecture designed to capture a broad spectrum of video dynamics at varying levels of granularity. Specifically, the flow-guided branch enhances the video representation by integrating optical flow with video features; this combination of multimodal features offers a more comprehensive context of global motion. Meanwhile, the moment-focused branch is tailored to extract frame-specific features, constructing two distinct quality-based representations that focus on different moments and thereby achieve adaptive clue aggregation. Furthermore, the detail-aware branch leverages multiscale deep embeddings from a hierarchical convolutional neural network to capture fine-grained spatial information, which is useful when objects undergo complex spatial changes. Finally, a post-fusion strategy merges the outputs from all branches to produce a comprehensive action quality assessment. Experimental evaluations on three benchmark datasets, FineDiving, MTL-AQA, and AQA-7, demonstrate the superiority of our model in providing reliable assessments of action quality.
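
The abstract does not say how the post-fusion strategy combines the branch outputs. As a purely illustrative sketch, assuming each of the three branches yields one scalar score per video and that fusion is a weighted average, late fusion could look like the following. The function name `late_fuse`, the weights, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def late_fuse(branch_scores: Sequence[float],
              weights: Optional[Sequence[float]] = None) -> float:
    """Combine per-branch score predictions into one score.

    Assumption: fusion is a weighted average of scalar scores.
    The paper's actual post-fusion strategy may differ.
    """
    if not branch_scores:
        raise ValueError("at least one branch score is required")
    if weights is None:
        # Default to equal weighting across branches.
        weights = [1.0] * len(branch_scores)
    if len(weights) != len(branch_scores):
        raise ValueError("weights and branch_scores must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, branch_scores)) / total


# Hypothetical scores from the flow-guided, moment-focused,
# and detail-aware branches for a single video.
scores = [80.0, 90.0, 100.0]
fused_equal = late_fuse(scores)
fused_weighted = late_fuse(scores, weights=[1.0, 1.0, 2.0])
```

Fusing at the score level keeps the branches independent, so each one can be trained or swapped out separately; a learned fusion layer would be an equally plausible reading of the abstract.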

Original language: English
Pages (from-to): 8776-8789
Number of pages: 14
Journal: IEEE Transactions on Multimedia
Volume: 27
State: Published - 2025
Externally published: Yes

Keywords

  • Action quality assessment
  • multi-branch modeling
  • multi-modal learning
