Named Entity Recognition on Indonesian Microblog Messages

Citation Author(s):
Natanael Taufik, Alfan F. Wicaksono, Mirna Adriani
Submitted by:
Natanael Taufik
Last updated:
22 November 2016 - 7:42am
Document Type:
Presentation Slides
Document Year:
Presenters Name:
Natanael Taufik
Paper Code:



This paper describes a model for named-entity recognition (NER) on Indonesian microblog messages, a task that is useful for higher-level tasks and text mining applications on Indonesian microblogs. We treat the task as a sequence labeling problem and address it with a machine learning approach. We also propose various word-level and orthographic features, including some that are specific to the Indonesian language. Finally, in our experiments, we compare our model with a baseline model previously proposed for formal Indonesian documents rather than microblog messages. Our contribution is two-fold: (1) we developed an NER tool for Indonesian microblog messages, a setting that had not been addressed before, and (2) we developed an NER corpus of around 600 Indonesian microblog messages, available for future development.
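
As a rough illustration of what word-level and orthographic features for sequence labeling can look like, here is a minimal Python sketch. It is not the authors' feature set: the function names (`token_features`, `sentence_features`), the feature names, the example message, and the choice of the Indonesian locative prepositions "di", "ke", and "dari" as a context cue are all assumptions made for illustration only.

```python
import re

# Hypothetical context cue: common Indonesian prepositions that often
# precede a location name. This is an illustrative assumption, not a
# feature taken from the paper.
LOCATIVE_PREPOSITIONS = {"di", "ke", "dari"}


def token_features(tokens, i):
    """Build a feature dict for the token at position i."""
    tok = tokens[i]
    prev_tok = tokens[i - 1].lower() if i > 0 else "<BOS>"
    next_tok = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return {
        # Word-level features
        "lower": tok.lower(),
        "prefix3": tok[:3].lower(),
        "suffix3": tok[-3:].lower(),
        "prev_lower": prev_tok,
        "next_lower": next_tok,
        # Orthographic features
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        # Microblog-specific surface cues
        "is_mention": tok.startswith("@"),
        "is_hashtag": tok.startswith("#"),
        "is_url": bool(re.match(r"https?://", tok)),
        # Assumed language-specific context cue
        "prev_is_locative_prep": prev_tok in LOCATIVE_PREPOSITIONS,
    }


def sentence_features(tokens):
    """Return one feature dict per token, in order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Made-up example message: "@budi goes to Jakarta tomorrow"
example_tokens = ["@budi", "pergi", "ke", "Jakarta", "besok"]
example_features = sentence_features(example_tokens)
```

A list of per-token feature dictionaries like `example_features` is the kind of input that many sequence labelers accept, paired with one label per token (for example, BIO tags). Which learner and which label scheme the paper actually uses is not stated in this abstract.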

Dataset Files

IALP2016 - Named Entity Recognition on Indonesian Microblog Messages.pdf