TY - JOUR
T1 - Multi-label Image Classification via Coarse-to-Fine Attention
AU - Lyu, Fan
AU - Li, Linyan
AU - Sheng, Victor S.
AU - Fu, Qiming
AU - Hu, Fuyuan
N1 - Publisher Copyright: © 2019 Chinese Institute of Electronics.
PY - 2019/11/10
Y1 - 2019/11/10
N2 - Great efforts have been made to recognize multi-label images using deep neural networks. Since multi-label image classification is very complicated, many studies use the attention mechanism as a kind of guidance. Conventional attention-based methods always analyze images directly and aggressively, which makes it difficult to understand complicated scenes well. We propose a global/local attention method that recognizes a multi-label image from coarse to fine by mimicking how human beings observe images. Our global/local attention method first concentrates on the whole image and then focuses on specific local objects. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels should be larger than the maximum score of negative labels horizontally and vertically. This function further improves our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.
AB - Great efforts have been made to recognize multi-label images using deep neural networks. Since multi-label image classification is very complicated, many studies use the attention mechanism as a kind of guidance. Conventional attention-based methods always analyze images directly and aggressively, which makes it difficult to understand complicated scenes well. We propose a global/local attention method that recognizes a multi-label image from coarse to fine by mimicking how human beings observe images. Our global/local attention method first concentrates on the whole image and then focuses on specific local objects. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels should be larger than the maximum score of negative labels horizontally and vertically. This function further improves our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.
KW - Attention
KW - Convolutional neural network
KW - Multi-label classification
KW - Recurrent neural network
UR - http://www.scopus.com/inward/record.url?scp=85075680207&partnerID=8YFLogxK
U2 - 10.1049/cje.2019.07.015
DO - 10.1049/cje.2019.07.015
M3 - Article
AN - SCOPUS:85075680207
SN - 1022-4653
VL - 28
SP - 1118
EP - 1126
JO - Chinese Journal of Electronics
JF - Chinese Journal of Electronics
IS - 6
ER -