Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

Chunhua Li; Pengpeng Zhao; Victor S. Sheng; Xuefeng Xian; Jian Wu; Zhiming Cui

doi:10.1155/2017/4092135

Computer Science

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

Original language	English
Article number	4092135
Journal	Computational Intelligence and Neuroscience
Volume	2017
DOIs	https://doi.org/10.1155/2017/4092135
State	Published - 2017

Cite this

@article{e111b12959564914b7ad92f58a636ab6,

title = "Refining Automatically Extracted Knowledge Bases Using Crowdsourcing",

abstract = "Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.",

author = "Li, Chunhua and Zhao, Pengpeng and Sheng, {Victor S.} and Xian, Xuefeng and Wu, Jian and Cui, Zhiming",

note = "Publisher Copyright: {\textcopyright} 2017 Chunhua Li et al.",

year = "2017",

doi = "10.1155/2017/4092135",

language = "English",

volume = "2017",

journal = "Computational Intelligence and Neuroscience",

issn = "1687-5265",

}

TY - JOUR

T1 - Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

AU - Li, Chunhua

AU - Zhao, Pengpeng

AU - Sheng, Victor S.

AU - Xian, Xuefeng

AU - Wu, Jian

AU - Cui, Zhiming

PY - 2017

Y1 - 2017

N2 - Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

AB - Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

UR - http://www.scopus.com/inward/record.url?scp=85020037440&partnerID=8YFLogxK

U2 - 10.1155/2017/4092135

DO - 10.1155/2017/4092135

M3 - Article

C2 - 28588611

AN - SCOPUS:85020037440

SN - 1687-5265

VL - 2017

JO - Computational Intelligence and Neuroscience

JF - Computational Intelligence and Neuroscience

M1 - 4092135

ER -
