AUTOMATIC LINE-LEVEL SCRIPT IDENTIFICATION FROM HANDWRITTEN DOCUMENT IMAGES - A REGION-WISE CLASSIFICATION FRAMEWORK FOR INDIAN SUBCONTINENT

Sk Md Obaidullah; Chayan Halder; K. C. Santosh; Nibaran Das; Kaushik Roy

doi:10.22452/mjcs.vol31no1.5

Authors

Sk Md Obaidullah Dept. of Computer Science & Engg., Aliah University, Kolkata, India
Chayan Halder Dept. of Computer Science, West Bengal State University, Kolkata, India
K. C. Santosh Dept. of Computer Science, The University of South Dakota, South Dakota, USA
Nibaran Das Dept. of Computer Science & Engg., Jadavpur University, Kolkata, India
Kaushik Roy Dept. of Computer Science & Engg., Jadavpur University, Kolkata, India

DOI:

https://doi.org/10.22452/mjcs.vol31no1.5

Keywords:

Handwritten script identification, image component fractal dimension, structural feature, directional stroke, interpolation based feature, Gabor energy features, classification

Abstract

Script identification is a well-studied problem in the automatic processing of document images. Several attempts have been made so far, but the problem is still far from completely solved. In this paper, we propose an automatic approach for line-level handwritten script identification (HSI) that considers eight official Indic scripts: Bangla, Devanagari, Kannada, Malayalam, Oriya, Roman, Telugu, and Urdu. We use a 148-dimensional feature vector built from image component fractal dimension, structural and visual appearance, directional stroke, interpolation, and Gabor energy based texture features. For classification, we divide the whole script dataset according to the regions of India in order to study region-wise classification performance. Experiments were carried out using state-of-the-art classifiers: multilayer perceptron (MLP), support vector machine (SVM), random forest (RF), and fuzzy unordered rule induction algorithm (FURIA). Among these, MLP was the best performer, with average accuracies of 98.2%, 99.5%, 99.1%, 99.5%, 99.9%, 98%, and 98.9% for the eight-script, bi-script, eastern, north, and south Indian script groups, scripts with ‘matra’ vs. without ‘matra’, and Dravidian vs. non-Dravidian groups, respectively.
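
To give a feel for one of the feature families named in the abstract, the following is a minimal sketch of a box-counting fractal dimension estimate on a binarized image. It is an illustration under stated assumptions, not the paper's feature extractor: the abstract does not say how the image component fractal dimension is computed, and the function names `box_count` and `fractal_dimension`, the list-of-lists image representation, and the default box sizes are all hypothetical choices made here.

```python
import math


def box_count(image, box_size):
    """Count boxes of side `box_size` containing at least one foreground pixel.

    `image` is a list of rows; any truthy value marks a foreground (ink) pixel.
    """
    occupied = set()
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if pixel:
                occupied.add((r // box_size, c // box_size))
    return len(occupied)


def fractal_dimension(image, box_sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension of the foreground pixels.

    The estimate is the least-squares slope of log N(s) against log(1/s),
    where N(s) is the number of occupied boxes of side s.
    The box sizes are an assumption for illustration only.
    """
    xs, ys = [], []
    for s in box_sizes:
        n = box_count(image, s)
        if n > 0:
            xs.append(math.log(1.0 / s))
            ys.append(math.log(n))
    if len(xs) < 2:
        raise ValueError("need at least two non-empty scales to fit a slope")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Two synthetic 16x16 shapes: a filled square and a single horizontal stroke.
filled_square = [[1] * 16 for _ in range(16)]
single_stroke = [[1] * 16 if r == 0 else [0] * 16 for r in range(16)]

square_dim = fractal_dimension(filled_square)
stroke_dim = fractal_dimension(single_stroke)
```

A filled region yields an estimate near 2 and a straight stroke near 1; handwritten text lines would be expected to fall between these, which is what makes a value of this kind usable as one entry in a larger feature vector.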
