Abstract—This paper describes the principles used to collect
open English and Japanese Twitter corpora for emotion analysis.
We have created a set of eight emotions, based on Ekman's and
Plutchik's categories, applicable to both English-speaking and
Japanese cultures, and ensured that each tweet in our subset of
the TREC’2011 collection is coded independently by three
individuals. We analyse the emotions contained in the resulting
corpora and briefly discuss the results. This work should be
useful to researchers interested in emotion analysis of the
micro-blogosphere and in comparative studies of
English and Japanese tweets.
Index Terms—Emotion, corpus, microblogs, Twitter.
A. Danielewicz-Betz, H. Kaneda, and M. Mozgovoy are with the School of
Computer Science and Engineering, the University of Aizu, Japan (e-mail:
abetz@u-aizu.ac.jp, mozgovoy@u-aizu.ac.jp, mapurgina@gmail.com).
M. Purgina is with the Department of Computer Systems and Software
Engineering, St. Petersburg State Polytechnic University, Russia (e-mail:
rainbowdash7777@gmail.com).
Cite: A. Danielewicz-Betz, H. Kaneda, M. Mozgovoy, and M. Purgina, "Creating English and Japanese Twitter Corpora for Emotion Analysis," International Journal of Knowledge Engineering, vol. 1, no. 2, pp. 120-124, 2015.