Abstract—In this paper, we propose a method to
automatically discover links between valuable keyphrases in a
Japanese document and corresponding Chinese encyclopedia
pages. The proposed method has three stages. First, we translate
Japanese keyphrases into Chinese using a combination of three
translation methods. Second, we retrieve all Chinese
encyclopedia articles that match the translated keyphrases. Third, we
translate the original Japanese document into Chinese and build
a noun-frequency vector from it. To disambiguate among the candidate
articles, we compute the cosine similarity between the translated
document and each candidate Chinese encyclopedia article and rank
the candidates by similarity. Finally, we add a link from each
Japanese keyphrase to
the top-ranking Chinese encyclopedia article. In this paper, we use
Wikipedia and Baidu Baike (an online encyclopedia published
by Baidu, a Chinese search engine) articles to conduct our
experiment. The method achieved an accuracy of 81% with
Wikipedia and 97% with Baidu Baike.
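
The disambiguation stage described above can be illustrated with a minimal sketch. It assumes the Japanese document has already been translated into Chinese and that nouns have already been extracted from both the document and the candidate articles; translation, word segmentation, and part-of-speech tagging are outside its scope. The function names (`cosine_similarity`, `rank_candidates`) and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse noun-frequency vectors."""
    # Only nouns that occur in both vectors contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_candidates(doc_nouns, candidates):
    """Rank candidate articles by similarity to the document.

    doc_nouns:  list of nouns from the translated document (assumed input)
    candidates: dict mapping article title -> list of nouns in that article
    Returns a list of (title, score) pairs, highest score first.
    """
    doc_vec = Counter(doc_nouns)
    scored = [
        (title, cosine_similarity(doc_vec, Counter(nouns)))
        for title, nouns in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative data only: a document about a technology company and two
# candidate articles for the ambiguous keyphrase "苹果" (apple).
doc_nouns = ["苹果", "公司", "手机", "电脑"]
candidates = {
    "苹果公司": ["苹果", "公司", "电脑", "手机", "乔布斯"],
    "苹果 (水果)": ["苹果", "水果", "树", "果实"],
}

ranked = rank_candidates(doc_nouns, candidates)
best_title, best_score = ranked[0]  # the keyphrase would be linked to best_title
```

In this sketch the keyphrase is linked to the first entry of `ranked`. Raw noun counts are used as vector weights because the abstract mentions noun frequencies; any further weighting scheme would be an additional assumption.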
Index Terms—Encyclopedia, cross-language link discovery,
Wikification, Baidu Baike.
Xiang Song and Jialiang Zhou are with the Graduate School of
Information Science and Engineering, Ritsumeikan University, Shiga, Japan
(e-mail: gr0187xx@ed.ritsumei.ac.jp, is0095hx@ed.ritsumei.ac.jp).
Fuminori Kimura is with the Faculty of Economics Management and
Information Science, Onomichi City University, Hiroshima, Japan (e-mail:
f-kimura@onomichi-u.ac.jp).
Akira Maeda is with the College of Information Science and Engineering,
Ritsumeikan University, Shiga, Japan (e-mail: amaeda@is.ritsumei.ac.jp).
Cite: Xiang Song, Jialiang Zhou, Fuminori Kimura, and Akira Maeda, "A Japanese-Chinese Cross-Language Entity Linking Method with Entity Disambiguation Based on Document Similarity," International Journal of Knowledge Engineering, vol. 2, no. 3, pp. 122-127, 2016.