Fast data acquisition in cost-sensitive learning

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Data acquisition is the first and one of the most important steps in many data mining applications. It is a time-consuming and costly task. Acquiring too few examples makes the learned model and its future predictions inaccurate, while acquiring more examples than necessary wastes time and money. It is therefore very important to estimate the number of examples a learning algorithm needs. However, most previous learning algorithms learn from a given, fixed set of examples. To our knowledge, little previous work in machine learning can dynamically acquire examples as it learns and decide the ideal number of examples needed. In this paper, we propose a simple on-line framework for fast data acquisition (FDA). FDA is an extrapolation method that estimates the number of examples needed in each acquisition and acquires them simultaneously. Compared with the naïve step-by-step data acquisition strategy, FDA significantly reduces the number of data acquisition and model building rounds. This would significantly reduce the total cost, which comprises the costs of misclassification, data acquisition arrangement, computation, and the examples acquired.
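
The abstract does not spell out the extrapolation rule, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a hypothetical power-law learning curve (`error_rate`), a hypothetical cost model (`total_cost`, with made-up `mis_cost` and `example_cost` values), and a two-point log-log fit as the extrapolation step. It contrasts a naive step-by-step loop with a loop that extrapolates the learning curve and acquires the estimated number of examples in one batch.

```python
import math


def error_rate(n):
    """Hypothetical learning curve; stands in for 'train on n examples, measure error'."""
    return 0.5 * n ** -0.5


def total_cost(n, mis_cost=1000.0, example_cost=0.1):
    """Assumed cost model: misclassification cost plus per-example acquisition cost."""
    return mis_cost * error_rate(n) + example_cost * n


def naive_acquisition(start=10, step=10):
    """Acquire `step` examples per round; stop once the total cost stops falling."""
    n, rounds = start, 1
    best = total_cost(n)
    while True:
        n += step
        rounds += 1  # one more acquisition and one more model build
        cost = total_cost(n)
        if cost >= best:
            return n, rounds
        best = cost


def extrapolating_acquisition(start=10, step=10, mis_cost=1000.0, example_cost=0.1):
    """Fit err = a * n**-b to the last two observations, then jump to the predicted optimum."""
    sizes = [start, start + step]  # two initial rounds are needed for the fit
    rounds = 2
    while True:
        n1, n2 = sizes[-2], sizes[-1]
        e1, e2 = error_rate(n1), error_rate(n2)
        # A power law is a straight line in log-log space; two points fix its slope.
        b = math.log(e1 / e2) / math.log(n2 / n1)
        a = e2 * n2 ** b
        # Minimiser of mis_cost * a * n**-b + example_cost * n (derivative set to zero).
        target = round((mis_cost * a * b / example_cost) ** (1.0 / (b + 1.0)))
        if target <= n2 + step:
            return n2, rounds  # extrapolation says no further large batch is worthwhile
        sizes.append(target)  # acquire the whole estimated batch at once
        rounds += 1


if __name__ == "__main__":
    print("naive:", naive_acquisition())
    print("extrapolating:", extrapolating_acquisition())
```

Under these assumed curves the extrapolating loop needs far fewer acquisition and model-building rounds than the fixed-step loop. With a real learner the fit would be noisy, so the extrapolation would normally be repeated over several rounds.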

Original language: English
Title of host publication: Advances in Data Mining
Subtitle of host publication: Applications and Theoretical Aspects - 11th Industrial Conference, ICDM 2011, Proceedings
Pages: 66-77
Number of pages: 12
DOIs
State: Published - 2011
Event: 11th Industrial Conference on Data Mining, ICDM 2011 - New York, NY, United States
Duration: Aug 30 2011 to Sep 3 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6870 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th Industrial Conference on Data Mining, ICDM 2011
Country/Territory: United States
City: New York, NY
Period: 08/30/11 to 09/3/11

Keywords

  • cost-sensitive learning
  • data acquisition
  • data mining
  • fast data acquisition
  • machine learning
