Welcome to IASL 2016 - November 21-23, 2016, Tainan, Taiwan
The International Conference on Asian Language Processing (IALP) is a series of conferences with unique focus on Asian Language Processing. The conference aims to advance the science and technology of all the aspects of Asian Language Processing by providing a forum for researchers in the different fields of language study all over the world to meet.
- Read more about History Question Classification and Representation for Chinese Gaokao
- Log in to post comments
In this paper, we propose a question representation based on entity labeling and question classification for a automatic question answering system of Chinese Gaokao history question. A CRF model is used for the entity labeling and SVM/CNN/LSTM models are tested for question classification. Our experiments show that CRF model provides a high performance when used to label informative entities out while neural networks has a promising performance for the question classification task.
- Categories:
- Read more about The Effect of Shallow Segmentation for English-Tigrinya Statistical Machine Translation
- Log in to post comments
This paper presents initial English-Tigrinya statistical machine translation (SMT) research. Tigrinya is a highly inflected Semitic language spoken in Eritrea and Ethiopia. Translation involving morphologically complex languages is challenged by factors including data sparseness and source-target word alignment. We try to address these problems through morphological segmentation of Tigrinya words. After segmentation the difference in token count dropped significantly from 37.7% to 0.1%. The out-of-vocabulary rate was reduced by 46%.
IALP-tig.pdf
- Categories:
We present a word sense disambiguation (WSD) tool of Japanese Hiragana words. Unlike other WSD tasks which output something like “sense #3” as result, our WSD task rewrites the target word into a Kanji word, which is a different orthography. This means that the task is also a kind of orthographical normalization as well as WSD. In this paper we present the task, our method, and the performance.
IALP-wsd.pdf
- Categories:
- Read more about Detecting Representative Web Articles Using Heterogeneous Graphs
- Log in to post comments
- Categories:
- Read more about Annotation Schemes for Constructing Uyghur Named Entity Relation Corpus
- Log in to post comments
Uyghur is minority language in China, it is one of the official languages in Xinjiang Uyghur Autonomous Region of China. More than 10 million people use Uyghur in their daily life and even on the Internet. However, lack of Uyghur entity relation corpus constrains relation extraction applications in Uyghur. In this paper, we describe annotation schemes for creating annotated corpus for Uyghur named entity and Uyghur named entity relation.
- Categories:
- Read more about Construction of the Basic Sentence-pattern Instance Database Based on the International Chinese Textbook Treebank
- Log in to post comments
- Categories:
- Read more about Japanese Orthographical Normalization Does Not Work for Statistical Machine Translation
- Log in to post comments
We have investigated the effect of normalizing Japanese orthographical variants into a uniform orthography on statistical machine translation (SMT) between Japanese and English. In Japanese, 10% of words have reportedly more than one orthographical variants, which is a promising fact for improving translation quality when we normalize these orthographical variants.
- Categories:
- Read more about Fundamental Tools and Resource are Available for Vietnamese Analysis
- Log in to post comments
This paper presents our work on developing Vietnamese fundamental tools and a resource for analysis. These tools are for word segmentation and part-of-speech tagging, diacritics restoration, and orthographical variants dictionary. All of them have been either not publicly available so far or not attaining sufficient performance. We have developed the tools and released the tools to the public, in both software packages and web tools. For development, we utilize state-of-the-art methods and achieved high accuracy.
- Categories:
- Read more about Valence-Arousal Ratings Prediction with Co-occurrence Word-embedding
- Log in to post comments
Sentiment analysis draws increasing attention of researchers in wide-ranging fields. Compared with the commonly-used categorical
- Categories:
- Read more about Mongolian Prosodic Phrase Prediction using Suffix Segmentation
- Log in to post comments
Accurate prosodic phrase prediction can improve
the naturalness of speech synthesis. Predicting the prosodic
phrase can be regarded as a sequence labeling problem and
the Conditional Random Field (CRF) is typically used to
solve it. Mongolian is an agglutinative language, in which
massive words can be formed by concatenating these stems
and suffixes. This character makes it difficult to build a
Mongolian prosodic phrase predictions system, based on
CRF, that has high performance. We introduce a new
- Categories: