Homophily-aware correction framework for crowdsourced labels using heterogeneous information network

Qingren Wang, Jian Lu, Wei Li, Jing Zhang, Victor S. Sheng

Research output: Contribution to journal › Article › peer-review

Abstract

Crowdsourcing provides a cost-effective and convenient way to collect labels, but it cannot guarantee their quality. On the one hand, the anonymous nature of crowdsourcing makes it impossible to obtain accurate and detailed information about the labelers who participate in tasks. On the other hand, most existing methods focus on the characteristics of individuals while ignoring the explicit and implicit interactions among them. Moreover, existing approaches based on homogeneous information networks cannot distinguish the heterogeneity among labelers or among their relationships, which results in an irreversible loss of interactive information among labelers. Extensive observations show that labelers often provide the same (different) answers for tasks belonging to the same category if they hold highly similar (contrary) views. In this paper, we therefore first define such similar (contrary) views over the same task category as homophily among labelers. We then propose a novel Homophily-aware Correction Framework (HaCF), based on a heterogeneous information network, to model multiple explicit and implicit interactive relations among labelers, tasks, and categories. In addition, we propose a novel homophily-based label classifier that strengthens the impact of positive labels while reducing the influence of negative ones. Experimental results on seven real-world datasets show both the effectiveness of HaCF in improving the quality of crowdsourced labels and its extensibility when combined with inference algorithms.

Original language: English
Article number: 116896
Journal: Expert Systems with Applications
Volume: 200
State: Published - Aug 15 2022

Keywords

  • Crowdsourcing
  • Heterogeneous information network
  • Homophily
  • Inference algorithm
  • Label correction
