Adaptable multi-phase rules over the infrequent class

Soma Datta, Susan Mengel

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Decision trees are classification models that allow rule generation. Depending on the type of decision tree model, rules may have from one to hundreds of conditions, and data attributes may repeat across different conditional values, making the rules difficult to understand. To achieve more understandable rules, the number of nodes can be minimized to control the depth of the tree and, therefore, the number of conditions in the rules. Further, the study described in this paper seeks to optimize the decision tree for generating rules specific to the infrequent class, which presents another challenge because the infrequent class may have few instances in the dataset. Rules generated using either decision trees or class association mining generally come from the majority class of the dataset. In this study, the two mining techniques are used together through ensemble learning in an adaptable manner, so that they expand and contract to accommodate the characteristics of the dataset. The ensemble learning occurs in phases, a partially generated or minimized decision tree mining phase and an association mining phase, to increase the probability of finding infrequent class rules. The ensemble learning technique developed in this study is found to generate understandable rules with increased coverage and confidence for the infrequent class on both balanced and unbalanced datasets.
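
The abstract evaluates rules by their coverage and confidence for the infrequent class. As a minimal illustration only, the sketch below computes both for a single rule, assuming one common pair of definitions: confidence is the share of instances matched by the rule that belong to the target class, and coverage is the share of target-class instances that the rule matches. The paper may define these measures differently; the function name `rule_metrics`, the toy rows, and the example rule are hypothetical and are not taken from the study.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def rule_metrics(
    rows: List[Row],
    labels: List[str],
    rule: Callable[[Row], bool],
    target: str,
) -> Tuple[float, float]:
    """Return (coverage, confidence) of `rule` for the `target` class.

    Assumed definitions (not necessarily the paper's):
      confidence = matched target instances / all matched instances
      coverage   = matched target instances / all target instances
    """
    # Labels of every instance whose attributes satisfy the rule.
    matched = [label for row, label in zip(rows, labels) if rule(row)]
    hits = sum(1 for label in matched if label == target)
    total_target = sum(1 for label in labels if label == target)

    confidence = hits / len(matched) if matched else 0.0
    coverage = hits / total_target if total_target else 0.0
    return coverage, confidence


# Hypothetical toy dataset in which "rare" is the infrequent class.
rows = [
    {"age": 25, "smoker": "yes"},
    {"age": 62, "smoker": "yes"},
    {"age": 58, "smoker": "yes"},
    {"age": 40, "smoker": "no"},
    {"age": 70, "smoker": "no"},
    {"age": 33, "smoker": "no"},
]
labels = ["common", "rare", "rare", "common", "rare", "common"]


def example_rule(row: Row) -> bool:
    # A two-condition rule: IF smoker = yes AND age > 50 THEN class = rare.
    return row["smoker"] == "yes" and row["age"] > 50


coverage, confidence = rule_metrics(rows, labels, example_rule, "rare")
print(f"coverage={coverage:.2f} confidence={confidence:.2f}")
```

Under these assumed definitions, the example rule matches two of the three "rare" instances and no "common" ones, so it is fully confident but covers only part of the infrequent class. A later mining phase, as the abstract describes, would aim to find rules for the instances left uncovered.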

Original language: English
Pages (from-to): 6067-6076
Number of pages: 10
Journal: Soft Computing
Volume: 22
Issue number: 18
State: Published - Sep 1 2018

Keywords

  • Association mining
  • Decision tree
  • Infrequent classes
  • Large datasets
  • Multi-phase rule generation
  • Rare classes
  • Recursive partition
  • Rule sets
