Phishing URL Detection Using URL Ranking

Mohammed Nazim Feroz, Susan Mengel

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

62 Scopus citations

Abstract

The openness of the Web exposes opportunities for criminals to upload malicious content. Despite extensive research, email-based spam filtering techniques are unable to protect other web services. A countermeasure that generalizes across web services is therefore needed to protect users from phishing host URLs. This paper describes an approach that classifies URLs automatically based on their lexical and host-based features. Clustering is performed on the entire dataset, and a cluster ID (or label) is derived for each URL, which in turn is used as a predictive feature by the classification system. Online URL reputation services are used to categorize URLs, and the categories returned serve as a supplemental source of information that enables the system to rank URLs. The classifier achieves 93-98% accuracy by detecting a large number of phishing hosts while maintaining a modest false positive rate. The URL clustering, URL classification, and URL categorization mechanisms work in conjunction to give each URL a rank.
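As a rough illustration of the idea in the abstract, the sketch below extracts a few lexical features from a URL and appends a cluster ID as one more feature. The feature names, the nearest-centroid assignment, and the helper names (`lexical_features`, `to_vector`, `nearest_cluster`) are assumptions made for this example; the abstract does not state which features or which clustering algorithm the authors used.

```python
import re
from urllib.parse import urlparse

# Hypothetical feature order; the abstract does not list the actual features.
FEATURE_ORDER = ["url_length", "host_length", "num_dots",
                 "num_digits", "has_ip_host", "has_at_symbol", "path_depth"]


def lexical_features(url: str) -> dict:
    """Compute simple lexical features from the URL string alone."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Dotted-quad hosts are a common heuristic signal (assumed here).
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


def to_vector(features: dict) -> list:
    """Flatten the feature dict into a numeric vector in a fixed order."""
    return [float(features[name]) for name in FEATURE_ORDER]


def nearest_cluster(vector: list, centroids: list) -> int:
    """Return the index of the closest centroid (squared Euclidean distance).

    Stands in for the cluster ID described in the abstract; the centroids
    are assumed to come from a clustering pass over the whole dataset.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda i: sq_dist(vector, centroids[i]))


def feature_vector_with_cluster(url: str, centroids: list) -> list:
    """Lexical features plus the cluster ID as an extra predictive feature."""
    vector = to_vector(lexical_features(url))
    return vector + [float(nearest_cluster(vector, centroids))]
```

The resulting vector could then be passed to any classifier. Host-based features and the reputation-service categories mentioned in the abstract would need network lookups and are left out of this sketch.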

Original language: English
Title of host publication: Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress 2015
Editors: Latifur Khan, Carminati Barbara
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 635-638
Number of pages: 4
ISBN (Electronic): 9781467372787
State: Published - Aug 17 2015
Event: 4th IEEE International Congress on Big Data, BigData Congress 2015 - New York City, United States
Duration: Jun 27 2015 to Jul 2 2015

Publication series

Name: Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress 2015

Conference

Conference: 4th IEEE International Congress on Big Data, BigData Congress 2015
Country/Territory: United States
City: New York City
Period: 06/27/15 to 07/2/15

Keywords

  • Classification
  • Clustering
  • Feature Vector
  • URL Ranking
  • Web Categorization
