Keyphrase Extraction Using Sequential Pattern Mining and Entropy

Qingren Wang, Victor S. Sheng, Chenyi Hu

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citations

Abstract

This paper proposes KeyRank, an approach for extracting high-quality keyphrases from an English document. It first searches the document for all keyphrase candidates and then ranks them, selecting the top-N candidates as the final keyphrases. Based on the observation that words do not appear repeatedly within an effective English keyphrase, a novel keyphrase candidate search algorithm that applies sequential pattern mining with gap constraints (called KCSP) is proposed to search keyphrase candidates for KeyRank. An effectiveness evaluation measure, pattern frequency with entropy (called PF-H), is then proposed to rank these candidates. Our experimental results show that KeyRank performs better than existing popular approaches such as TextRank and KeyEx. In addition, KCSP is much more efficient than the closely related approach SPMW, and PF-H can be applied to improve the performance of TextRank.
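
The two-stage pipeline described in the abstract (candidate search, then ranking) can be illustrated with a minimal Python sketch. The abstract does not give the definitions of KCSP or PF-H, so everything here is an assumption made for illustration: the function names `search_candidates`, `shannon_entropy`, and `rank_candidates`, the parameters `max_len` and `max_gap`, and in particular the score "total frequency times one plus the entropy of the per-sentence counts", which is only a stand-in and not the paper's PF-H measure.

```python
import math
from collections import Counter


def search_candidates(tokens, max_len=3, max_gap=1):
    """Count word patterns of up to max_len words in one token list.

    Consecutive pattern words may be separated by at most max_gap skipped
    tokens, and a pattern never contains the same word twice (the
    no-repetition observation from the abstract). Hypothetical helper.
    """
    counts = Counter()

    def extend(pattern, last_idx):
        counts[pattern] += 1
        if len(pattern) == max_len:
            return
        stop = min(last_idx + max_gap + 2, len(tokens))
        for nxt in range(last_idx + 1, stop):
            if tokens[nxt] not in pattern:  # skip repeated words
                extend(pattern + (tokens[nxt],), nxt)

    for i in range(len(tokens)):
        extend((tokens[i],), i)
    return counts


def shannon_entropy(counts):
    """Shannon entropy (base 2) of a list of positive counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def rank_candidates(sentences, top_n=3, max_len=2, max_gap=0):
    """Rank patterns by an assumed frequency-and-entropy score.

    Assumption: score = total frequency * (1 + entropy of the pattern's
    per-sentence counts). This is a placeholder, not the paper's PF-H.
    """
    per_pattern = {}
    for sent in sentences:
        for pattern, c in search_candidates(sent, max_len, max_gap).items():
            per_pattern.setdefault(pattern, []).append(c)
    scores = {p: sum(cs) * (1.0 + shannon_entropy(cs))
              for p, cs in per_pattern.items()}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


if __name__ == "__main__":
    doc = [["pattern", "mining"], ["pattern", "mining", "works"], ["entropy"]]
    print(rank_candidates(doc))
```

Under this stand-in score, a pattern spread evenly over several sentences outranks one with the same frequency concentrated in a single sentence; the paper itself may weight entropy differently.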

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Knowledge, ICBK 2017
Editors: Ruqian Lu, Xindong Wu, Tamer Ozsu, Xindong Wu, Jim Hendler
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 88-95
Number of pages: 8
ISBN (Electronic): 9781538631195
State: Published - Aug 30 2017
Event: 2017 IEEE International Conference on Big Knowledge, ICBK 2017 - Hefei, China
Duration: Aug 9 2017 to Aug 10 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Big Knowledge, ICBK 2017

Conference

Conference: 2017 IEEE International Conference on Big Knowledge, ICBK 2017
Country: China
City: Hefei
Period: 08/9/17 to 08/10/17

Keywords

  • entropy
  • keyphrase candidate ranking
  • keyphrase candidate search
  • sequential pattern mining
