Roulette sampling for cost-sensitive learning

Victor S. Sheng; Charles X. Ling

doi:10.1007/978-3-540-74958-5_73

Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

16 Scopus citations

Abstract

In this paper, we propose a new and general preprocessor algorithm, called CSRoulette, which converts any cost-insensitive classification algorithm into a cost-sensitive one. CSRoulette is based on a cost-proportional roulette sampling technique (CPRS for short). CSRoulette is closely related to Costing, another cost-sensitive meta-learning algorithm, which is based on rejection sampling. Unlike rejection sampling, which produces smaller samples, CPRS can generate samples of different sizes. To further improve its performance, we apply ensemble learning (bagging) to CPRS; the resulting algorithm is called CSRoulette. Our experiments show that CSRoulette outperforms Costing and other meta-learning methods on most datasets tested. In addition, we investigate the effect of various sample sizes and conclude that reduced sample sizes (as in rejection sampling) cannot be compensated for by increasing the number of bagging iterations.
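
The abstract does not spell out the sampling procedure, so the following Python sketch is only one plausible reading of it, not the authors' implementation. It assumes that roulette sampling means drawing examples with replacement, with probability proportional to a per-example misclassification cost, and that the bagging step is a plain majority vote. The function names (`roulette_sample`, `rejection_sample`, `bagged_predict`, `fit_majority`) and the toy base learner are hypothetical. A cost-proportionate rejection sampler is included only to contrast sample sizes.

```python
import random
from collections import Counter


def roulette_sample(examples, costs, sample_size, rng):
    """Draw `sample_size` examples with replacement; each draw picks an
    example with probability proportional to its cost (assumed reading
    of cost-proportional roulette sampling)."""
    if len(examples) != len(costs):
        raise ValueError("examples and costs must have the same length")
    if any(c < 0 for c in costs) or sum(costs) <= 0:
        raise ValueError("costs must be non-negative with a positive total")
    return rng.choices(examples, weights=costs, k=sample_size)


def rejection_sample(examples, costs, rng):
    """Keep each example independently with probability cost / max(costs).
    The result can never be larger than the input."""
    z = max(costs)
    if z <= 0:
        raise ValueError("at least one cost must be positive")
    return [ex for ex, c in zip(examples, costs) if rng.random() < c / z]


def bagged_predict(examples, costs, fit, x, n_bags, sample_size, seed=0):
    """Train one model per roulette sample and return the majority vote for x."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_bags):
        model = fit(roulette_sample(examples, costs, sample_size, rng))
        votes[model(x)] += 1
    return votes.most_common(1)[0][0]


def fit_majority(sample):
    """Toy base learner (hypothetical): always predicts the most common
    label in its training sample, ignoring the input."""
    label = Counter(lbl for _, lbl in sample).most_common(1)[0][0]
    return lambda x: label


if __name__ == "__main__":
    # Eight cheap-to-misclassify negatives and two expensive positives.
    data = [((0.1,), "neg")] * 8 + [((0.9,), "pos")] * 2
    costs = [1.0] * 8 + [10.0] * 2
    rng = random.Random(42)
    print(len(roulette_sample(data, costs, 20, rng)))  # size is a free parameter
    print(len(rejection_sample(data, costs, rng)))     # never exceeds len(data)
    print(bagged_predict(data, costs, fit_majority, (0.5,), n_bags=11, sample_size=10))
```

In this sketch the roulette sampler takes the sample size as a parameter, while the rejection sampler's output size is random and bounded by the size of the training set, which is the contrast the abstract draws between CPRS and Costing.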

Original language	English
Title of host publication	Machine Learning
Subtitle of host publication	ECML 2007 - 18th European Conference on Machine Learning, Proceedings
Publisher	Springer-Verlag
Pages	724-731
Number of pages	8
ISBN (Print)	9783540749578
DOIs	https://doi.org/10.1007/978-3-540-74958-5_73
State	Published - 2007
Event	18th European Conference on Machine Learning, ECML 2007 - Warsaw, Poland
Duration	Sep 17 2007 → Sep 21 2007

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	4701 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	18th European Conference on Machine Learning, ECML 2007
Country/Territory	Poland
City	Warsaw
Period	09/17/07 → 09/21/07

Keywords

Classification
Cost-sensitive learning
Data mining
Decision trees
Machine learning
Meta-learning

Access to Document

10.1007/978-3-540-74958-5_73

Cite this

Sheng, V. S., & Ling, C. X. (2007). Roulette sampling for cost-sensitive learning. In Machine Learning: ECML 2007 - 18th European Conference on Machine Learning, Proceedings (pp. 724-731). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4701 LNAI). Springer-Verlag. https://doi.org/10.1007/978-3-540-74958-5_73

@inproceedings{541464a935104feea80fcedaef6bb56d,
  title = "Roulette sampling for cost-sensitive learning",
  abstract = "In this paper, we propose a new and general preprocessor algorithm, called CSRoulette, which converts any cost-insensitive classification algorithms into cost-sensitive ones. CSRoulette is based on cost proportional roulette sampling technique (called CPRS in short). CSRoulette is closely related to Costing, another cost-sensitive meta-learning algorithm, which is based on rejection sampling. Unlike rejection sampling which produces smaller samples, CPRS can generate different size samples. To further improve its performance, we apply ensemble (bagging) on CPRS; the resulting algorithm is called CSRoulette. Our experiments show that CSRoulette outperforms Costing and other meta-learning methods in most datasets tested. In addition, we investigate the effect of various sample sizes and conclude that reduced sample sizes (as in rejection sampling) cannot be compensated by increasing the number of bagging iterations.",
  keywords = "Classification, Cost-sensitive learning, Data mining, Decision trees, Machine learning, Meta-learning",
  author = "Sheng, {Victor S.} and Ling, {Charles X.}",
  year = "2007",
  doi = "10.1007/978-3-540-74958-5_73",
  language = "English",
  isbn = "9783540749578",
  series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
  publisher = "Springer-Verlag",
  pages = "724--731",
  booktitle = "Machine Learning",
  note = "18th European Conference on Machine Learning, ECML 2007 ; Conference date: 17-09-2007 Through 21-09-2007",
}

Sheng, VS & Ling, CX 2007, Roulette sampling for cost-sensitive learning. in Machine Learning: ECML 2007 - 18th European Conference on Machine Learning, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4701 LNAI, Springer-Verlag, pp. 724-731, 18th European Conference on Machine Learning, ECML 2007, Warsaw, Poland, 09/17/07. https://doi.org/10.1007/978-3-540-74958-5_73

Roulette sampling for cost-sensitive learning. / Sheng, Victor S.; Ling, Charles X.
Machine Learning: ECML 2007 - 18th European Conference on Machine Learning, Proceedings. Springer-Verlag, 2007. p. 724-731 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4701 LNAI).

TY  - GEN
T1  - Roulette sampling for cost-sensitive learning
AU  - Sheng, Victor S.
AU  - Ling, Charles X.
PY  - 2007
Y1  - 2007
N2  - In this paper, we propose a new and general preprocessor algorithm, called CSRoulette, which converts any cost-insensitive classification algorithms into cost-sensitive ones. CSRoulette is based on cost proportional roulette sampling technique (called CPRS in short). CSRoulette is closely related to Costing, another cost-sensitive meta-learning algorithm, which is based on rejection sampling. Unlike rejection sampling which produces smaller samples, CPRS can generate different size samples. To further improve its performance, we apply ensemble (bagging) on CPRS; the resulting algorithm is called CSRoulette. Our experiments show that CSRoulette outperforms Costing and other meta-learning methods in most datasets tested. In addition, we investigate the effect of various sample sizes and conclude that reduced sample sizes (as in rejection sampling) cannot be compensated by increasing the number of bagging iterations.
AB  - In this paper, we propose a new and general preprocessor algorithm, called CSRoulette, which converts any cost-insensitive classification algorithms into cost-sensitive ones. CSRoulette is based on cost proportional roulette sampling technique (called CPRS in short). CSRoulette is closely related to Costing, another cost-sensitive meta-learning algorithm, which is based on rejection sampling. Unlike rejection sampling which produces smaller samples, CPRS can generate different size samples. To further improve its performance, we apply ensemble (bagging) on CPRS; the resulting algorithm is called CSRoulette. Our experiments show that CSRoulette outperforms Costing and other meta-learning methods in most datasets tested. In addition, we investigate the effect of various sample sizes and conclude that reduced sample sizes (as in rejection sampling) cannot be compensated by increasing the number of bagging iterations.
KW  - Classification
KW  - Cost-sensitive learning
KW  - Data mining
KW  - Decision trees
KW  - Machine learning
KW  - Meta-learning
UR  - http://www.scopus.com/inward/record.url?scp=38049107996&partnerID=8YFLogxK
U2  - 10.1007/978-3-540-74958-5_73
DO  - 10.1007/978-3-540-74958-5_73
M3  - Conference contribution
AN  - SCOPUS:38049107996
SN  - 9783540749578
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 724
EP  - 731
BT  - Machine Learning
PB  - Springer-Verlag
T2  - 18th European Conference on Machine Learning, ECML 2007
Y2  - 17 September 2007 through 21 September 2007
ER  - 
