Ontological Lexicon Enrichment: The Badea System For Semi-Automated Extraction Of Antonymy Relations From Arabic Language Corpora


Maha Al-Yahya
Sawsan Al-Malak
Luluh Aldhubayi

Abstract

language processing tools and applications; however, they are expensive to build, maintain, and extend. In this paper, we present the Badea system for the semi-automated extraction of lexical relations, specifically antonyms, using a pattern-based approach to support the task of ontological lexicon enrichment. The approach is based on an ontology of “seed” pairs of antonyms in the Arabic language; we identify the patterns in which these pairs occur and then use those patterns to find new antonym pairs in Arabic textual corpora. Experiments are conducted on Badea using texts from three Arabic corpora: KSUCCA, KACSTAC, and CAC. The system is evaluated, and both the reliability of the patterns and the performance of the system are measured. The results from our experiments on the three corpora show that the pattern-based approach can be useful in the ontological enrichment task: the evaluation resulted in the ontology being updated with over 300 new antonym pairs, thereby enriching the lexicon and increasing its size by over 400%. Moreover, the results yield important findings on the reliability of patterns in extracting antonyms for Arabic. The Badea system will facilitate the enrichment of ontological lexicons, which can be very useful in any Arabic natural language processing system that requires semantic relation extraction.
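
The seed-and-pattern loop described in the abstract can be illustrated with a minimal sketch. This is not the Badea implementation: the function names, the whitespace tokenizer, the `max_gap` parameter, and the English toy sentences are all assumptions made for illustration. The sketch takes the words that occur between the two members of a seed pair as a "pattern", then proposes as candidates any two words joined by a learned pattern in new text. Candidates would still need manual validation, in keeping with the semi-automated nature of the approach.

```python
from collections import Counter


def tokenize(sentence):
    # Hypothetical tokenizer: lowercase and split on whitespace only.
    return sentence.lower().split()


def learn_patterns(sentences, seed_pairs, max_gap=3):
    """Count the token sequences that connect the two words of a seed pair."""
    patterns = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for first, second in seed_pairs:
            # Try both orders, since a pair can appear either way round.
            for x, y in ((first, second), (second, first)):
                if x in tokens and y in tokens:
                    i, j = tokens.index(x), tokens.index(y)
                    gap = tuple(tokens[i + 1:j])
                    if i < j and 0 < len(gap) <= max_gap:
                        patterns[gap] += 1
    return patterns


def extract_candidates(sentences, patterns, known_pairs):
    """Propose word pairs joined by a learned pattern, skipping known pairs."""
    known = {frozenset(pair) for pair in known_pairs}
    candidates = set()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for gap in patterns:
            n = len(gap)
            for i in range(len(tokens) - n - 1):
                if tuple(tokens[i + 1:i + 1 + n]) == gap:
                    pair = (tokens[i], tokens[i + 1 + n])
                    if frozenset(pair) not in known:
                        candidates.add(pair)
    return candidates


# Toy data (assumed, English for readability; the paper works on Arabic corpora).
seeds = [("hot", "cold"), ("big", "small")]
seed_corpus = [
    "the water was neither hot nor cold",
    "is the box big or small",
]
new_corpus = [
    "is the road long or short",
    "the answer was neither true nor false",
]

patterns = learn_patterns(seed_corpus, seeds)
candidates = extract_candidates(new_corpus, patterns, seeds)
print(sorted(candidates))
```

On real text, short connecting patterns of this kind also match many non-antonym pairs, which is why measuring the reliability of each pattern, as the abstract describes, matters before accepting its candidates.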



How to Cite
Al-Yahya, M., Al-Malak, S., & Aldhubayi, L. (2016). Ontological Lexicon Enrichment: The Badea System For Semi-Automated Extraction Of Antonymy Relations From Arabic Language Corpora. Malaysian Journal of Computer Science, 29(1), 56–73. https://doi.org/10.22452/mjcs.vol29no1.5