SaliencyBERT: Recurrent Attention Network for Target-Oriented Multimodal Sentiment Classification

Jiawei Wang, Zhe Liu, Victor Sheng, Yuqing Song, Chenjian Qiu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation


As multimodal data become increasingly popular on social media platforms, it is desirable to enhance text-based approaches with other important data sources (e.g., images) for the sentiment classification of social media posts. However, existing approaches primarily rely on the textual content or are designed for coarse-grained multimodal sentiment classification. In this paper, we propose a recurrent attention network (called SaliencyBERT) over the BERT architecture for Target-oriented Multimodal Sentiment Classification (TMSC). Specifically, we first adopt BERT and ResNet to capture the intra-modality dynamics within the textual content and the visual information, respectively. Then, we design a recurrent attention mechanism, which can derive target-sensitive visual representations, to capture the inter-modality dynamics. With recurrent attention, our model can progressively optimize the alignment of target-sensitive textual features and visual features and produce an output after a fixed number of time steps. Finally, we combine the losses of all time steps for deep supervision to prevent slow convergence and overfitting. Our empirical results show that the proposed model consistently outperforms single-modal methods and achieves performance comparable to, or even better than, several highly competitive methods on two multimodal datasets from Twitter.
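
The recurrent attention and deep supervision steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`recurrent_attention`, `deep_supervision_loss`), the scaled dot-product scoring, the tanh update, the random weights `W` and `U`, and all dimensions are assumptions chosen for illustration. In the paper the inputs would come from BERT and ResNet rather than from random vectors.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def recurrent_attention(target, regions, W, U, steps=3):
    """Attend from a target representation to image regions for a fixed
    number of time steps (hypothetical formulation, not the paper's).

    target  : (d,)    target-sensitive text feature (assumed given)
    regions : (n, d)  visual region features (assumed given)
    W       : (d, 2d) assumed fusion weights
    U       : (c, d)  assumed classifier weights
    Returns per-step class probabilities and attention weights.
    """
    d = target.shape[0]
    query = target
    probs, attns = [], []
    for _ in range(steps):
        # Score every region against the current query.
        alpha = softmax(regions @ query / np.sqrt(d))
        # Target-sensitive visual summary: weighted sum of regions.
        context = alpha @ regions
        # Fuse text and visual information into the next query.
        query = np.tanh(W @ np.concatenate([query, context]))
        # Each time step produces its own prediction.
        probs.append(softmax(U @ query))
        attns.append(alpha)
    return probs, attns

def deep_supervision_loss(probs, label):
    """Sum the cross-entropy losses of all time steps."""
    return sum(-np.log(p[label]) for p in probs)

# Toy inputs; every size below is an arbitrary assumption.
rng = np.random.default_rng(0)
d, n_regions, n_classes = 8, 49, 3
target = rng.normal(size=d)
regions = rng.normal(size=(n_regions, d))
W = rng.normal(scale=0.1, size=(d, 2 * d))
U = rng.normal(scale=0.1, size=(n_classes, d))

probs, attns = recurrent_attention(target, regions, W, U, steps=3)
loss = deep_supervision_loss(probs, label=1)
print(f"steps: {len(probs)}, combined loss: {loss:.4f}")
```

Summing the per-step losses means every intermediate prediction receives a training signal, which is one common way to realize deep supervision. Whether the paper weights the steps equally is not stated in the abstract.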

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 4th Chinese Conference, PRCV 2021, Proceedings
Editors: Huimin Ma, Liang Wang, Changshui Zhang, Fei Wu, Tieniu Tan, Yaonan Wang, Jianhuang Lai, Yao Zhao
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 13
ISBN (Print): 9783030880095
State: Published - 2021
Event: 4th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2021 - Beijing, China
Duration: Oct 29 2021 to Nov 1 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13021 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 4th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2021


  • BERT architecture
  • Recurrent attention
  • Target-oriented multimodal sentiment classification

