Abstract—Morphological analyzers and lemmatization methods
already exist for Bahasa (the Indonesian language), yet they do
not handle all cases that occur. We therefore develop an
algorithm that combines two tasks: generating affixed words
from a root word, and recovering the root word (lemma) from
an affixed word. The current morphological analyzer for
generating affixed words does not cover the analysis of two
words, whilst the current lemmatization method cannot find
the lemma of an affixed word that has a confix and
reduplication. We address these issues in order to enhance the
current methods. The algorithm concerns Bahasa only. The
algorithm for generating affixed words is based on the
two-level morphological analyzer, while the refinement of the
lemmatization method is based on rule precedence and token
checking. After implementing the algorithms, we find that the
generated affixed words consist of 12.63% productive words,
86.98% non-productive words, and 0.39% incorrect words,
whilst lemmatization achieves 96.11% accuracy.
Index Terms—Affixed word, root word, Bahasa,
morphological analyzer, lemmatization.
A. B. Oktarino, D. T. Winahyu, and A. Halim were with Bina Nusantara
University, Indonesia.
D. Suhartono is with Bina Nusantara University, Jakarta, Indonesia
(e-mail: dsuhartono@binus.edu).
Cite: Andri Budiman Oktarino, Dwi Taruna Winahyu, Andrew Halim, and Derwin Suhartono, "Generating Affixed Words from a Root Word and Getting Lemma from Affixed Word in Bahasa: Indonesian Language," International Journal of Knowledge Engineering, vol. 2, no. 3, pp. 132-136, 2016.
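
The abstract describes lemmatization driven by rule precedence and token checking, including words that combine a confix with reduplication. The following is only a minimal sketch of that general idea, not the authors' algorithm. It assumes that "token checking" means looking each candidate up in a root-word dictionary, and it ignores morphophonemic changes such as nasal assimilation after meN-. The lexicon, the affix lists, and the function names are all hypothetical.

```python
# Hypothetical toy lexicon; a real system would load a full root-word dictionary.
ROOT_WORDS = {"buku", "main", "baik", "jalan", "lari"}

# Assumed rule precedence: confixes first, then prefixes, then suffixes.
CONFIXES = [("ke", "an"), ("per", "an"), ("ber", "an")]
PREFIXES = ["ber", "di", "ke", "ter", "per"]
SUFFIXES = ["kan", "an", "i"]


def candidates(token):
    """Yield candidate roots for one token, highest-precedence rule first."""
    yield token  # the token may already be a root word
    for pre, suf in CONFIXES:
        if (token.startswith(pre) and token.endswith(suf)
                and len(token) > len(pre) + len(suf)):
            yield token[len(pre):len(token) - len(suf)]
    for pre in PREFIXES:
        if token.startswith(pre) and len(token) > len(pre):
            yield token[len(pre):]
    for suf in SUFFIXES:
        if token.endswith(suf) and len(token) > len(suf):
            yield token[:-len(suf)]


def lemmatize(word):
    """Return the first candidate found in the lexicon, else the word itself."""
    word = word.lower()
    # Reduplication: split on the hyphen and check each half as its own token.
    for part in word.split("-"):
        for cand in candidates(part):
            if cand in ROOT_WORDS:  # token checking against the lexicon
                return cand
    return word


if __name__ == "__main__":
    for w in ["buku-buku", "kebaikan", "perjalanan", "berlari-larian"]:
        print(w, "->", lemmatize(w))
```

Splitting on the hyphen before applying affix rules lets a form such as "berlari-larian" be resolved from either half, which is one simple way to handle a confix spread across a reduplicated word. A word with no matching candidate is returned unchanged.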