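Angļu-latviešu statistiskās mašīntulkošanas sistēmas izveide: metodes, resursi un pirmie rezultāti (Development of an English-Latvian Statistical Machine Translation System: Methods, Resources and First Results)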

Inguna Skadiņa; Madars Virza; Lauma Pretkalniņa

doi:10.15388/baltistica.0.8.2118

Inguna Skadiņa

UL Institute of Mathematics and Computer Science

Madars Virza

UL Institute of Mathematics and Computer Science

Lauma Pretkalniņa

UL Institute of Mathematics and Computer Science

Published 2012-09-01

Keywords

statistical machine translation
Latvian
English
computational linguistics

How to Cite

Skadiņa, I., Virza, M. and Pretkalniņa, L. (2012) “Angļu-latviešu statistiskās mašīntulkošanas sistēmas izveide: metodes, resursi un pirmie rezultāti”, Baltistica, 47(-), pp. 155–168. doi:10.15388/baltistica.0.8.2118.

Abstract

DEVELOPMENT OF ENGLISH-LATVIAN STATISTICAL MACHINE TRANSLATION SYSTEM: METHODS, RESOURCES AND FIRST RESULTS

Summary

This paper presents the research and development of English-Latvian Statistical Machine Translation (SMT) prototypes for the legal domain. Several methods were investigated, namely phrase-based models and factored models. Translation quality was evaluated with an automatic metric (BLEU) and by human evaluation. In the automatic evaluation, the best score (46.44 BLEU points) was achieved by the factored model trained on the JRC Acquis corpus (version 3.0); human evaluators also judged this model the best. In addition, an error analysis of the SMT output was performed. It showed that, although the output of the best prototype was of reasonable quality, it contained several frequent errors: incorrect word forms, missing words and wrong word order. As future work, tree-based SMT and hybrid systems are proposed.
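As a rough illustration of the automatic metric mentioned in the summary, BLEU combines modified n-gram precisions (usually for n = 1 to 4) with a brevity penalty that punishes output shorter than the reference. The sketch below is a minimal corpus-level version that assumes pre-tokenized input, a single reference per sentence and no smoothing. The function names and the toy sentences are hypothetical, and this is not the evaluation script used in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (one reference per hypothesis)."""
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # n-grams produced by the system, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a zero match count at any order gives BLEU = 0.
    if min(matched) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: 1 if the output is longer than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Toy, made-up sentences; not data from the paper.
    hyp = [["the", "council", "has", "adopted", "this", "regulation"]]
    ref = [["the", "council", "has", "adopted", "the", "regulation"]]
    print(round(corpus_bleu(hyp, ref), 2))
```

Scores produced this way depend heavily on tokenization and on the number of references, so they are comparable only between systems evaluated on the same test set under the same settings.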

This work is licensed under a Creative Commons Attribution 4.0 International License.
