Angļu-latviešu statistiskās mašīntulkošanas sistēmas izveide: metodes, resursi un pirmie rezultāti
Articles
Inguna Skadiņa
UL Institute of Mathematics and Computer Science
Madars Virza
UL Institute of Mathematics and Computer Science
Lauma Pretkalniņa
UL Institute of Mathematics and Computer Science
Published 2026-01-28
https://doi.org/10.15388/baltistica.0.8.2118

How to cite

Skadiņa, I., Virza, M. and Pretkalniņa, L. (trans.) (2026) "Angļu-latviešu statistiskās mašīntulkošanas sistēmas izveide: metodes, resursi un pirmie rezultāti", Baltistica, 47(-), pp. 155–168. doi:10.15388/baltistica.0.8.2118.

Abstract

DEVELOPMENT OF ENGLISH-LATVIAN STATISTICAL MACHINE TRANSLATION SYSTEM: METHODS, RESOURCES AND FIRST RESULTS

Summary

This paper presents the research and development of English-Latvian Statistical Machine Translation (SMT) prototypes for the legal domain. Several methods were investigated, namely phrase-based models and factored models. Translation quality was evaluated using an automatic metric (BLEU score) and human evaluation. In the automatic evaluation, the best score (46.44 BLEU points) was achieved by the factored model trained on the JRC-Acquis corpus (version 3.0), which was also rated best in the human evaluation. In addition, an error analysis of the SMT output was performed. This analysis showed that, although the output of the best prototype was of reasonable quality, it contained several frequent errors, namely incorrect word forms, missing words and wrong word order. For future work, research on tree-based SMT and hybrid systems is proposed.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
