DETECTION OF SUSPICIOUS TERRORIST EMAILS USING TEXT CLASSIFICATION: A REVIEW

Ghulam Mujtaba; Liyana Shuib; Ram Gopal Raj; Roshan Gunalan

doi:10.22452/mjcs.vol31no4.3

Published: Oct 24, 2018

Keywords:

Text Classification; Cyber Terrorism; Suspicious Emails; Text Representation; Feature Selection; Performance Measures

Ghulam Mujtaba

Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia

Liyana Shuib

Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia

Ram Gopal Raj

Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia

Roshan Gunalan

Department of Orthopaedic Surgery, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia

Abstract

This paper provides a comprehensive review and analysis of the detection of suspicious terrorist e-mails using the various phases and methods of text classification. We explored, analyzed, and compared the datasets, features, feature extraction techniques, feature representation techniques, feature selection schemes, text classification techniques, and performance metrics used in the detection of suspicious terrorist e-mails. After rigorous selection, 30 articles were retrieved from 6 well-known academic databases. We found that researchers often generate their own e-mail datasets because no public dataset is available for detecting suspicious terrorist e-mails. Most studies used content-based and context-based features to detect terrorist e-mails. Our findings also show that the most commonly used feature extraction techniques are bag of words and n-grams; the most commonly applied feature representation schemes are binary representation and term frequency; the most commonly adopted feature selection method is information gain; the most common and most accurate text classification algorithms are naïve Bayes, decision trees, and support vector machines; and the most widely used performance metrics are accuracy, precision, and recall. This review also summarizes open research challenges and issues that require significant research effort, and its critical analysis offers insights to guide future researchers in suspicious terrorist e-mail detection using text classification. The issues and challenges identified here can serve as future research directions in this area.
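
The abstract names a typical pipeline: bag-of-words extraction, binary representation, information-gain feature selection, a naïve Bayes classifier, and accuracy, precision, and recall as metrics. The following is a minimal sketch of how these stages fit together, not a method taken from any reviewed study. The toy messages, the labels, and every function name are assumptions made up for illustration, and the classifier is a Bernoulli naïve Bayes variant chosen because it matches binary features.

```python
import math
from collections import Counter

# Hypothetical toy data; the review reports that no public dataset exists.
TRAIN_DOCS = [
    "attack planned at the target tonight",
    "bring the weapon to the target",
    "attack the convoy with the weapon",
    "meeting agenda attached for tonight",
    "lunch at the cafe tomorrow",
    "please review the project report",
]
TRAIN_LABELS = ["suspicious"] * 3 + ["normal"] * 3


def binary_bow(docs):
    """Binary bag of words: each document becomes the set of its tokens."""
    return [set(d.lower().split()) for d in docs]


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(term, doc_sets, labels):
    """Reduction in label entropy from knowing whether `term` is present."""
    present = [y for s, y in zip(doc_sets, labels) if term in s]
    absent = [y for s, y in zip(doc_sets, labels) if term not in s]
    n = len(labels)
    return (entropy(labels)
            - len(present) / n * entropy(present)
            - len(absent) / n * entropy(absent))


def select_features(doc_sets, labels, k):
    """Keep the k terms with the highest information gain (ties: A to Z)."""
    vocab = sorted(set().union(*doc_sets))
    ranked = sorted(vocab,
                    key=lambda t: (-information_gain(t, doc_sets, labels), t))
    return ranked[:k]


def train_bernoulli_nb(doc_sets, labels, features):
    """Estimate class priors and per-term presence probabilities."""
    model = {}
    for c in sorted(set(labels)):
        docs_c = [s for s, y in zip(doc_sets, labels) if y == c]
        prior = len(docs_c) / len(labels)
        # Laplace smoothing keeps every probability strictly inside (0, 1).
        probs = {t: (sum(t in s for s in docs_c) + 1) / (len(docs_c) + 2)
                 for t in features}
        model[c] = (prior, probs)
    return model


def predict(model, doc_set):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c, (prior, probs) in model.items():
        score = math.log(prior)
        for t, p in probs.items():
            score += math.log(p if t in doc_set else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive):
    """Accuracy, precision, and recall with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    train_sets = binary_bow(TRAIN_DOCS)
    features = select_features(train_sets, TRAIN_LABELS, k=3)
    model = train_bernoulli_nb(train_sets, TRAIN_LABELS, features)
    test_docs = ["the weapon is near the target", "project meeting tomorrow"]
    test_labels = ["suspicious", "normal"]
    predictions = [predict(model, s) for s in binary_bow(test_docs)]
    print(features, predictions)
    print(evaluate(test_labels, predictions, positive="suspicious"))
```

Swapping `binary_bow` for token counts would give the term-frequency representation the abstract also mentions; the Bernoulli model would then need to be replaced by a multinomial one.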

How to Cite

Mujtaba, G., Shuib, L., Raj, R. G., & Gunalan, R. (2018). DETECTION OF SUSPICIOUS TERRORIST EMAILS USING TEXT CLASSIFICATION: A REVIEW. Malaysian Journal of Computer Science, 31(4), 271–299. https://doi.org/10.22452/mjcs.vol31no4.3

Issue

Vol. 31 No. 4 (2018): Malaysian Journal of Computer Science

Section

Articles
