Proper Noun Recognition in Cross-Language Record Linkage by Exploiting Transliterated Words
- Submitted by: Yuting Song
- Last updated: 22 November 2016 - 11:40am
- Document Type: Presentation Slides
- Document Year: 2016
- Presenters: Yuting Song
- Paper Code: 93
Proper nouns in metadata are representative features for linking identical records across data sources in different languages. To improve the accuracy of proper noun recognition, we propose a back-transliteration method in which transliterated words in the target language are back-transliterated to their original words in the source language. The acquired words and their transliterations are then used to recognize and transliterate proper nouns in metadata. Experimental results show that using the acquired bilingual word pairs improves the accuracy of cross-language record linkage.
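As a rough illustration of how acquired word pairs might be used, the following toy sketch looks up transliterated words in record titles and links records that share the recovered proper nouns. It does not reproduce the back-transliteration step itself. The language pair (Japanese katakana and English), the lexicon entries, the sample records, and the function names `recognize_proper_nouns` and `link_records` are all assumptions made for this example and do not come from the slides.

```python
import re

# Hypothetical lexicon: transliterated (target-language) word -> original
# (source-language) word. Entries are illustrative assumptions only.
LEXICON = {
    "ゴッホ": "Gogh",
    "モネ": "Monet",
    "パリ": "Paris",
}


def recognize_proper_nouns(text, lexicon):
    """Return the source-language originals of lexicon entries found in text."""
    found = set()
    # Try longer entries first so a short entry cannot split a longer one.
    for word in sorted(lexicon, key=len, reverse=True):
        if word in text:
            found.add(lexicon[word])
            text = text.replace(word, " ")
    return found


def link_records(target_records, source_records, lexicon):
    """Link each target record to the source record sharing the most proper nouns."""
    links = []
    for t_id, t_title in target_records.items():
        nouns = recognize_proper_nouns(t_title, lexicon)
        best_id, best_overlap = None, 0
        for s_id, s_title in source_records.items():
            tokens = set(re.findall(r"\w+", s_title))
            overlap = len(nouns & tokens)
            if overlap > best_overlap:
                best_id, best_overlap = s_id, overlap
        if best_id is not None:
            links.append((t_id, best_id))
    return links


# Made-up sample metadata titles in the two languages.
target_records = {"ja1": "ゴッホの自画像", "ja2": "モネとパリの風景"}
source_records = {
    "en1": "Self-Portrait by Vincent van Gogh",
    "en2": "Monet: Views of Paris",
}

links = link_records(target_records, source_records, LEXICON)
print(links)
```

Exact lexicon lookup keeps the sketch short. A real system would likely need tokenization suited to the target language and a softer matching score than raw token overlap.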