TY - JOUR
T1 - Label noise correction and application in crowdsourcing
AU - Nicholson, Bryce
AU - Sheng, Victor S.
AU - Zhang, Jing
N1 - Funding Information:
We thank the anonymous reviewers for the valuable comments. This research has been supported by the U.S. National Science Foundation under Grant No. IIS-1115417, the National Natural Science Foundation of China under Grant No. 61603186, the Natural Science Foundation of Jiangsu Province, China, under Grant No. BK20160843, and the China Postdoctoral Science Foundation under Grant No. 2016M590457.
Publisher Copyright:
© 2016 Elsevier Ltd
PY - 2016/12/30
Y1 - 2016/12/30
N2 - The important task of correcting label noise is addressed infrequently in the literature. The difficulty of developing a robust label correction algorithm accounts for this silence. To break it, we propose two algorithms for correcting label noise. One uses self-training to re-label noisy instances and is called Self-Training Correction (STC). The other is a clustering-based method that groups instances together to infer their ground-truth labels, called Cluster-based Correction (CC). We also adapt an algorithm from previous work, a consensus-based method called Polishing, which consults an ensemble of classifiers to change the values of both attributes and labels. We simplify Polishing so that it alters only the labels of instances, and call it Polishing Labels (PL). We experimentally compare our novel methods with PL by examining their improvements in label quality, model quality, and AUC on binary and multi-class data sets under different noise levels. Our experimental results demonstrate that CC consistently and significantly improves label quality, model quality, and AUC. We further investigate how these three noise correction algorithms improve data quality, in terms of label accuracy, in the context of image labeling in crowdsourcing. First, we examine three consensus methods for inferring a ground-truth label from the multiple noisy labels obtained through crowdsourcing: Majority Voting (MV), Dawid-Skene (DS), and KOS. We then apply the three noise correction methods to correct the labels inferred by these consensus methods. Our experimental results show that the noise correction methods significantly improve labeling quality. Overall, we conclude that CC performs best. Our research illustrates the viability of noise correction as another line of defense against labeling error, especially in a crowdsourcing setting. Furthermore, it demonstrates the feasibility of automating the otherwise manual, expensive, and time-consuming process of analyzing a data set and correcting and cleaning its instances.
AB - The important task of correcting label noise is addressed infrequently in the literature. The difficulty of developing a robust label correction algorithm accounts for this silence. To break it, we propose two algorithms for correcting label noise. One uses self-training to re-label noisy instances and is called Self-Training Correction (STC). The other is a clustering-based method that groups instances together to infer their ground-truth labels, called Cluster-based Correction (CC). We also adapt an algorithm from previous work, a consensus-based method called Polishing, which consults an ensemble of classifiers to change the values of both attributes and labels. We simplify Polishing so that it alters only the labels of instances, and call it Polishing Labels (PL). We experimentally compare our novel methods with PL by examining their improvements in label quality, model quality, and AUC on binary and multi-class data sets under different noise levels. Our experimental results demonstrate that CC consistently and significantly improves label quality, model quality, and AUC. We further investigate how these three noise correction algorithms improve data quality, in terms of label accuracy, in the context of image labeling in crowdsourcing. First, we examine three consensus methods for inferring a ground-truth label from the multiple noisy labels obtained through crowdsourcing: Majority Voting (MV), Dawid-Skene (DS), and KOS. We then apply the three noise correction methods to correct the labels inferred by these consensus methods. Our experimental results show that the noise correction methods significantly improve labeling quality. Overall, we conclude that CC performs best. Our research illustrates the viability of noise correction as another line of defense against labeling error, especially in a crowdsourcing setting. Furthermore, it demonstrates the feasibility of automating the otherwise manual, expensive, and time-consuming process of analyzing a data set and correcting and cleaning its instances.
KW - Classification
KW - Crowdsourcing
KW - Image processing
KW - Noise correction
UR - http://www.scopus.com/inward/record.url?scp=84988000713&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2016.09.003
DO - 10.1016/j.eswa.2016.09.003
M3 - Article
AN - SCOPUS:84988000713
VL - 66
SP - 149
EP - 162
JO - Expert Systems with Applications
JF - Expert Systems with Applications
SN - 0957-4174
ER -