An empirical comparison of four text mining methods

Sangno Lee, Jeff Baker, Jaeki Song, James C. Wetherbe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

42 Scopus citations

Abstract

The amount of textual data that is available for researchers and businesses to analyze is increasing at a dramatic rate. This reality has led information systems (IS) researchers to investigate various text mining techniques. This essay examines four frequently used text mining methods in order to identify their advantages and limitations. The four methods are (1) latent semantic analysis, (2) probabilistic latent semantic analysis, (3) latent Dirichlet allocation, and (4) the correlated topic model. We compare these methods and highlight the conditions under which each is best applied. The paper sheds light on the theory that underlies text mining methods and provides guidance for researchers who seek to apply them.

Original language: English
Title of host publication: Proceedings of the 43rd Annual Hawaii International Conference on System Sciences, HICSS-43
Publisher: IEEE Computer Society
ISBN (Print): 9780769538693
State: Published - 2010
Event: 43rd Annual Hawaii International Conference on System Sciences, HICSS-43 - Koloa, Kauai, HI, United States
Duration: Jan 5 2010 to Jan 8 2010

Publication series

Name: Proceedings of the Annual Hawaii International Conference on System Sciences
ISSN (Print): 1530-1605

Conference

Conference: 43rd Annual Hawaii International Conference on System Sciences, HICSS-43
Country/Territory: United States
City: Koloa, Kauai, HI
Period: 01/5/10 to 01/8/10
