Abstract
In practice, some attributes satisfy a uniqueness constraint: each entity has exactly one value for the attribute. This paper presents a deep web entity identification method that addresses data error correction, uniqueness constraint enforcement, and local data fusion in deep web data integration. The method transforms the entity identification problem into a k-partite graph clustering problem that considers both the similarity and the association of attribute values. It performs global record linkage and data fusion simultaneously, and it can identify incorrect values and distinguish them from correct ones at the outset. Experimental results demonstrate the high precision and scalability of the method.
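The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes records are plain dictionaries, uses `difflib` string similarity to merge near-duplicate values of one attribute, counts co-occurrences as association weights between values of different attributes, and substitutes a simple greedy merge for the paper's clustering step. The names `canonicalise`, `identify_entities`, the threshold of 0.85, and the sample records are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def canonicalise(values, threshold=0.85):
    """Map each value to the first, more frequent value it closely resembles."""
    representatives = []
    canon = {}
    for value, _ in Counter(values).most_common():
        match = next(
            (rep for rep in representatives
             if SequenceMatcher(None, value.lower(), rep.lower()).ratio() >= threshold),
            None,
        )
        if match is None:
            representatives.append(value)
            match = value
        canon[value] = match
    return canon


def identify_entities(records, threshold=0.85):
    """Greedy stand-in for k-partite clustering (an assumption, not the paper's algorithm)."""
    attrs = sorted({attr for record in records for attr in record})
    canon = {
        attr: canonicalise([r[attr] for r in records if attr in r], threshold)
        for attr in attrs
    }

    # Nodes are (attribute, value) pairs; edges link values of different
    # attributes and are weighted by how often the values co-occur.
    weights = Counter()
    cluster_of = {}
    for record in records:
        nodes = [(attr, canon[attr][record[attr]]) for attr in attrs if attr in record]
        for attr, value in nodes:
            cluster_of.setdefault((attr, value), {attr: value})
        weights.update(combinations(nodes, 2))

    # Merge along the strongest associations first. A merge is rejected when
    # both clusters already hold a value for the same attribute, which is how
    # this sketch enforces the uniqueness constraint.
    for (u, v), _ in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        cu, cv = cluster_of[u], cluster_of[v]
        if cu is cv or set(cu) & set(cv):
            continue
        cu.update(cv)
        for attr, value in cv.items():
            cluster_of[(attr, value)] = cu

    clusters = []
    for cluster in cluster_of.values():
        if not any(cluster is seen for seen in clusters):
            clusters.append(cluster)
    return clusters


# Hypothetical records from several sources; the third carries a wrong phone.
records = [
    {"name": "Microsoft Corp", "phone": "425-882-8080", "city": "Redmond"},
    {"name": "Microsoft Corp.", "phone": "425-882-8080", "city": "Redmond"},
    {"name": "Microsoft Corp", "phone": "206-555-0100", "city": "Redmond"},
    {"name": "Apple Inc", "phone": "408-996-1010", "city": "Cupertino"},
    {"name": "Apple Inc", "phone": "408-996-1010", "city": "Cupertino"},
]
clusters = identify_entities(records)
```

In this toy run the weakly supported phone number cannot join the cluster that already holds a better supported phone, so it is left as a singleton. That loosely mirrors the abstract's point that linkage and fusion happen together and that incorrect values are separated from correct ones.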
Original language | English |
---|---|
Pages (from-to) | 2470-2482 |
Number of pages | 13 |
Journal | International Journal of Performability Engineering |
Volume | 14 |
Issue number | 10 |
DOIs | |
State | Published - Oct 2018 |
Keywords
- Clustering
- Data fusion
- Entity identification
- K-partite graph
- Match
- Record linkage