Ensemble Learning from Crowds

Jing Zhang, Ming Wu, Victor S. Sheng

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Traditional learning from crowdsourced labeled data consists of two stages: inferring true labels for instances from their multiple noisy labels, and building a learning model from these instances with the inferred labels. This straightforward two-stage scheme suffers from two weaknesses: (1) the accuracy of inference may be very low; (2) useful information may be lost during inference. In this paper, we propose a novel ensemble method for learning from crowds. The proposed method is a meta-learning scheme. It first uses a bootstrapping process to create M sub-datasets from an original crowdsourced labeled dataset. In each sub-dataset, each instance is duplicated with different weights according to the distribution and class memberships of its multiple noisy labels. A base classifier is then trained on this extended sub-dataset. Finally, unlabeled instances are predicted by aggregating the outputs of the M base classifiers. Because the proposed method dispenses with the inference procedure and uses the full dataset to train learning models, it preserves as much useful information for learning as possible. Experimental results on nine simulated and two real-world crowdsourcing datasets consistently show that the proposed ensemble learning method significantly outperforms five state-of-the-art methods.
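The pipeline described in the abstract (bootstrap, weighted duplication, base training, aggregation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the exact weighting rule, base learner, or aggregation rule, so the following are assumptions made for illustration only: each instance is duplicated once per distinct noisy label and weighted by that label's relative frequency, the base learner is a toy weighted nearest-centroid classifier, and outputs are aggregated by majority vote. All function and class names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def expand_with_weights(features, noisy_labels):
    """Duplicate one instance once per distinct noisy label.

    Assumption: the weight of each copy is the label's relative frequency.
    """
    counts = Counter(noisy_labels)
    total = sum(counts.values())
    return [(features, label, count / total)
            for label, count in sorted(counts.items())]


class WeightedCentroidClassifier:
    """Toy base learner (an assumption): weighted class means, nearest centroid."""

    def fit(self, rows):
        sums = {}
        weights = defaultdict(float)
        for x, y, w in rows:
            if y not in sums:
                sums[y] = [0.0] * len(x)
            for j, value in enumerate(x):
                sums[y][j] += w * value
            weights[y] += w
        self.centroids = {y: [s / weights[y] for s in sums[y]] for y in sums}
        return self

    def predict(self, x):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        # Ties are broken in favor of the smallest label.
        return min(sorted(self.centroids),
                   key=lambda y: squared_distance(self.centroids[y]))


def train_ensemble(dataset, m, seed=0):
    """Train m base classifiers, each on a weighted, expanded bootstrap sample.

    dataset: list of (features, noisy_labels) pairs.
    """
    rng = random.Random(seed)
    models = []
    for _ in range(m):
        # Bootstrap: sample len(dataset) instances with replacement.
        sample = [rng.choice(dataset) for _ in dataset]
        rows = [row for x, labels in sample
                for row in expand_with_weights(x, labels)]
        models.append(WeightedCentroidClassifier().fit(rows))
    return models


def predict_ensemble(models, x):
    """Aggregate base predictions by majority vote (an assumption)."""
    votes = Counter(model.predict(x) for model in models)
    return votes.most_common(1)[0][0]
```

Calling `train_ensemble(dataset, m=11)` on a list of `(features, noisy_labels)` pairs and then `predict_ensemble(models, x)` exercises all four steps. Note that no true-label inference step appears anywhere: every noisy label contributes to training through its weight, which is the property the abstract emphasizes.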

Original language: English
Article number: 8423205
Pages (from-to): 1506-1519
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 8
State: Published - Aug 1 2019


Keywords

  • Bagging
  • Classification
  • Crowdsourcing
  • Ensemble learning
  • Learning from crowds

