Japanese Orthographical Normalization Does Not Work for Statistical Machine Translation

Abstract: 

We investigated how normalizing Japanese orthographical variants into a uniform orthography affects statistical machine translation (SMT) between Japanese and English. In Japanese, 10% of words reportedly have more than one orthographical variant, which suggests that normalizing these variants could improve translation quality. However, the results show that SMT with normalization performs equivalently to SMT without normalization, as measured by both BLEU and RIBES, even though normalization reduces the size of the language models, their perplexity, and the number of out-of-vocabulary words. We discuss the potential reasons in this paper.
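To make the vocabulary and out-of-vocabulary effects mentioned in the abstract concrete, here is a minimal sketch of orthographical normalization on pre-tokenized text. It is an illustration under stated assumptions, not the authors' method: the variant table `VARIANT_TO_CANONICAL`, the helper names `normalize` and `oov_count`, and the toy sentences are all hypothetical, and a real pipeline would obtain tokens and variant groups from a morphological analyzer and a lexicon.

```python
from typing import Dict, List, Set

# Hypothetical variant table (an assumption for illustration only):
# each surface form maps to one canonical orthography.
VARIANT_TO_CANONICAL: Dict[str, str] = {
    "リンゴ": "りんご",
    "林檎": "りんご",
    "猫": "ねこ",
    "ネコ": "ねこ",
}


def normalize(tokens: List[str]) -> List[str]:
    """Replace each known variant with its canonical form; keep other tokens."""
    return [VARIANT_TO_CANONICAL.get(token, token) for token in tokens]


def oov_count(train: List[str], test: List[str]) -> int:
    """Count test tokens that never occur in the training tokens."""
    vocab: Set[str] = set(train)
    return sum(1 for token in test if token not in vocab)


# Toy, already-tokenized data (assumed; not taken from the paper).
train_tokens = ["りんご", "を", "食べる", "猫", "が", "いる"]
test_tokens = ["林檎", "を", "食べる", "ネコ", "が", "いる"]

# Vocabulary size and OOV count before and after normalization.
raw_vocab_size = len(set(train_tokens + test_tokens))
norm_vocab_size = len(set(normalize(train_tokens) + normalize(test_tokens)))
raw_oov = oov_count(train_tokens, test_tokens)
norm_oov = oov_count(normalize(train_tokens), normalize(test_tokens))

print(raw_vocab_size, norm_vocab_size, raw_oov, norm_oov)
```

On this toy data, normalization shrinks the vocabulary and removes the out-of-vocabulary tokens. The sketch says nothing about translation quality, which the abstract reports as unchanged under BLEU and RIBES.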

Paper Details

Authors:
Kazuhide Yamamoto, Kanji Takahashi
Submitted On:
21 November 2016 - 8:28pm
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Kanji Takahashi
Paper Code:
15
Document Year:
2016
Document Files

15-IALP2016.pdf

[1] Kazuhide Yamamoto, Kanji Takahashi, "Japanese Orthographical Normalization Does Not Work for Statistical Machine Translation", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1273. Accessed: Nov. 18, 2019.