TY - GEN
T1 - Partial example acquisition in cost-sensitive learning
AU - Sheng, Victor S.
AU - Ling, Charles X.
PY - 2007
Y1 - 2007
AB - It is often expensive to acquire data in real-world data mining applications. Most previous data mining and machine learning research, however, assumes that a fixed set of training examples is given. In this paper, we propose an online cost-sensitive framework that allows a learner to dynamically acquire examples as it learns, and to decide the ideal number of examples needed to minimize the total cost. We also propose a new strategy for Partial Example Acquisition (PAS), in which the learner can acquire examples with a subset of attribute values to reduce the data acquisition cost. Experiments on UCI datasets show that the new PAS strategy is an effective method in reducing the total cost for data acquisition.
KW - Active cost-sensitive learning
KW - Active learning
KW - Cost-sensitive learning
KW - Data acquisition
KW - Data mining
KW - Induction
KW - Interactive and online data mining
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=36849044683&partnerID=8YFLogxK
U2 - 10.1145/1281192.1281261
DO - 10.1145/1281192.1281261
M3 - Conference contribution
AN - SCOPUS:36849044683
SN - 1595936092
SN - 9781595936097
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 638
EP - 646
BT - KDD-2007
T2 - KDD-2007: 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Y2 - 12 August 2007 through 15 August 2007
ER -