
Purely attention based local feature integration for video classification

Recently, substantial research effort has focused on how to apply CNNs or RNNs to better capture temporal patterns in videos, so as to improve the accuracy of video classification. In this paper, we investigate the potential of a purely attention based local feature integration. Accounting for the characteristics of such features in video classification, we first propose Basic Attention Clusters (BAC), which concatenates the output of multiple attention units applied in parallel, and introduce a shifting operation to capture more diverse signals. Experiments show that BAC can achieve excellent results on multiple datasets. However, BAC treats all feature channels as an indivisible whole, which is suboptimal for achieving a finer-grained local feature integration over the channel dimension. Additionally, it treats the entire local feature sequence as an unordered set, thus ignoring the sequential relationships. To improve over BAC, we further propose the channel pyramid attention schema by splitting features into sub-features at multiple scales for coarse-to-fine sub-feature interaction modeling, and propose the temporal pyramid attention schema by dividing the feature sequences into ordered sub-sequences of multiple lengths to account for the sequential order. Our final model, pyramid×pyramid attention clusters (PPAC), combines both channel pyramid attention and temporal pyramid attention to focus on the most important sub-features, while also preserving the temporal information of the video. We demonstrate the effectiveness of PPAC on seven real-world video classification datasets. Our model achieves competitive results across all of these, showing that our proposed framework can consistently outperform the existing local feature integration methods across a range of different scenarios.
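
The abstract's description of BAC (several attention units applied in parallel over the local feature sequence, their outputs concatenated, with a shifting operation to diversify the units) can be illustrated with a minimal PyTorch-style sketch. The class names `AttentionUnit` and `BasicAttentionClusters`, the learnable scale `alpha`, the shift vector `beta`, and the L2 normalization are illustrative assumptions based on the abstract, not the paper's released code.

```python
# Minimal sketch of Basic Attention Clusters (BAC) as described in the abstract.
# All names and the exact shifting formulation are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionUnit(nn.Module):
    """One attention unit: softmax-weighted pooling of local features,
    followed by a learnable shift ("shifting operation") and L2 normalization."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)               # one weight per local feature
        self.alpha = nn.Parameter(torch.ones(1))          # learnable scale (assumed)
        self.beta = nn.Parameter(torch.zeros(feat_dim))   # learnable shift (assumed)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_local_features, feat_dim)
        weights = F.softmax(self.score(x), dim=1)         # (batch, T, 1)
        pooled = (weights * x).sum(dim=1)                 # (batch, feat_dim)
        shifted = self.alpha * pooled + self.beta         # shift to encourage diverse units
        return F.normalize(shifted, p=2, dim=-1)


class BasicAttentionClusters(nn.Module):
    """BAC: run multiple attention units in parallel and concatenate their outputs."""

    def __init__(self, feat_dim: int, num_units: int = 8):
        super().__init__()
        self.units = nn.ModuleList(AttentionUnit(feat_dim) for _ in range(num_units))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_local_features, feat_dim) -> (batch, num_units * feat_dim)
        return torch.cat([unit(x) for unit in self.units], dim=-1)


# Example: integrate 32 frame-level features of dimension 1024 into a video-level vector.
features = torch.randn(2, 32, 1024)
video_repr = BasicAttentionClusters(feat_dim=1024, num_units=8)(features)
print(video_repr.shape)  # torch.Size([2, 8192])
```

The channel and temporal pyramid attention schemas of PPAC would, per the abstract, apply such units to sub-features split along the channel dimension and to ordered sub-sequences of multiple lengths; those details are not specified here and are omitted from the sketch.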

Metadata
Author details:Xiang Long, Gerard de Melo, Dongliang He, Fu Li, Zhizhen Chi, Shilei Wen, Chuang Gan
DOI:https://doi.org/10.1109/TPAMI.2020.3029554
ISSN:0162-8828
ISSN:1939-3539
ISSN:2160-9292
Pubmed ID:https://pubmed.ncbi.nlm.nih.gov/33026984
Title of parent work (English):IEEE Transactions on Pattern Analysis and Machine Intelligence
Publisher:Institute of Electrical and Electronics Engineers (IEEE)
Place of publishing:Los Alamitos
Publication type:Article
Language:English
Date of first publication:2020/10/07
Publication year:2020
Release date:2024/01/05
Tag:Algorithms; Computational modeling; Computer; Convolution; Feature extraction; Neural Networks; Plugs; Task analysis; Three-dimensional displays; Two dimensional displays; Video classification; action recognition; attention mechanism; computer vision
Volume:44
Issue:4
Number of pages:15
First page:2140
Last Page:2154
Organizational units:An-Institute / Hasso-Plattner-Institut für Digital Engineering gGmbH
DDC classification:0 Computer science, information & general works / 00 Computer science, knowledge & systems / 000 Computer science, information & general works
Peer review:Refereed