


























Portuguese is the official language of multiple countries across four continents. It is classified into two primary variants (European Portuguese and Brazilian Portuguese), but European Portuguese has received far less research attention and has far fewer resources than the Brazilian variant. In this paper, we consider the task of Machine Translation (MT) into Portuguese. Given the resource imbalance, standard MT systems produce translations that are typically closer to the Brazilian standard. We compare four methods for biasing translation toward the minority European Portuguese variant, each targeting a different stage of the MT lifecycle: (1) reranking n-best MT outputs according to a variant classifier; (2) biasing hypothesis generation at inference time toward the target variant; (3) fine-tuning for the target variant; (4) switching entirely to an LLM-based approach. We find that all methods can bias translation outputs to some extent. The LLM-based approach yields the numerically highest results, but the impact of memorisation remains unclear.
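To make method (1) concrete, the following is a minimal sketch of n-best reranking with a variant classifier. It is not the paper's implementation: the function names, the log-linear score combination, the `weight` parameter, and the tiny lexical cue lists standing in for a trained classifier are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical lexical cues, for illustration only. A real system would
# replace toy_variant_score with a trained variant classifier.
PT_PT_CUES = {"autocarro", "comboio", "telemóvel"}
PT_BR_CUES = {"ônibus", "trem", "celular"}


def toy_variant_score(text: str) -> float:
    """Return a smoothed pseudo-probability that `text` is European Portuguese."""
    tokens = [t.strip(".,;!?") for t in text.lower().split()]
    pt_hits = sum(t in PT_PT_CUES for t in tokens)
    br_hits = sum(t in PT_BR_CUES for t in tokens)
    # Add-one smoothing keeps the score strictly between 0 and 1.
    return (1 + pt_hits) / (2 + pt_hits + br_hits)


def rerank_nbest(
    nbest: List[Tuple[str, float]],
    variant_score: Callable[[str], float],
    weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Sort (hypothesis, MT log-probability) pairs by a combined score.

    Assumed combination: mt_logprob + weight * log(variant_score(text)).
    """
    def combined(item: Tuple[str, float]) -> float:
        text, mt_logprob = item
        return mt_logprob + weight * math.log(variant_score(text))

    return sorted(nbest, key=combined, reverse=True)


# Two hypotheses with made-up MT scores; the Brazilian-leaning one ranks
# first under the MT model alone.
nbest = [
    ("O ônibus chega às nove.", -1.0),
    ("O autocarro chega às nove.", -1.3),
]
best_text, _ = rerank_nbest(nbest, toy_variant_score, weight=1.0)[0]
print(best_text)
```

In this sketch, `weight` controls how strongly the classifier can override the MT model's own ranking: at `weight=0.0` the original order is kept, while larger values favour hypotheses the classifier judges to be European Portuguese.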