Information Technology Journal

Year: 2005 | Volume: 4 | Issue: 1 | Page No.: 38-43
DOI: 10.3923/itj.2005.38.43
Pattern-based Stemmer for Finding Arabic Roots
Riyad Alshalabi

Abstract: This study presents a technique for extracting triliteral Arabic roots from an unvocalized Arabic corpus. It first removes suffixes and prefixes from the inflected word efficiently, then matches the resulting word against the available patterns to find a suitable one, and finally extracts the three letters of the root by removing all infixes in that pattern. The technique does not use a dictionary to check the resulting stem. Instead, we define rules that help decide whether a letter belongs to the root. The algorithm was tested on a corpus of 72 abstracts (10582 words) from the Saudi Arabian National Computer Conference, where its accuracy was about 92%.
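
The three steps described in the abstract (affix removal, pattern matching, infix removal) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the affix lists, the pattern list, the minimum-length guard, and all function names are assumptions chosen for the example. Patterns are written with the letters ف, ع and ل marking the three root positions, and every other pattern letter is treated as an infix that must match exactly.

```python
# Hypothetical affix and pattern lists, chosen only for illustration.
# The paper's actual lists and root-letter rules are not given in the abstract.
PREFIXES = ["وال", "بال", "ال", "و"]
SUFFIXES = ["ات", "ون", "ين", "ها", "ة"]
PATTERNS = ["فاعل", "مفعول", "فعال", "فعيل", "مفعل", "تفعيل", "فعل"]

# In a pattern, these three letters stand for the root positions.
ROOT_SLOTS = "فعل"


def strip_affixes(word, min_len=3):
    """Remove at most one prefix and one suffix, longest first.

    An affix is kept if removing it would leave fewer than min_len
    letters (an assumed stand-in for the paper's root-letter rules).
    """
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def match_pattern(stem, pattern):
    """Return the root letters if stem fits pattern, else None."""
    if len(stem) != len(pattern):
        return None
    root = []
    for stem_char, pattern_char in zip(stem, pattern):
        if pattern_char in ROOT_SLOTS:
            root.append(stem_char)      # root position: keep the letter
        elif stem_char != pattern_char:
            return None                 # infix mismatch: pattern does not fit
    return "".join(root)


def extract_root(word):
    """Strip affixes, then return the root from the first fitting pattern."""
    stem = strip_affixes(word)
    for pattern in PATTERNS:
        root = match_pattern(stem, pattern)
        if root is not None:
            return root
    return None


if __name__ == "__main__":
    for w in ["الكاتب", "مكتوب", "والمعلمون", "دراسة", "ولد"]:
        print(w, extract_root(w))
```

In this sketch the word الكاتب loses the prefix ال and the remainder fits the pattern فاعل, so the infix ا is dropped and the three remaining letters are returned. The length guard keeps the initial و of ولد, since stripping it would leave only two letters. No dictionary lookup is involved, which mirrors the dictionary-free design stated in the abstract.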

How to cite this article
Riyad Alshalabi, 2005. Pattern-based Stemmer for Finding Arabic Roots. Information Technology Journal, 4: 38-43.
