Browse by Author "Purason, Taido"

Efficient Use of Pre-trained NMT Models Through Mixing and Matching
(Tartu Ülikool, 2023) Purason, Taido; Tättar, Andre, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
With an increasing number of pre-trained language models and neural machine translation (NMT) models becoming available, it is important to investigate how to use them when training new models to avoid expensive training from scratch. This thesis investigates how to effectively use pre-trained models, focusing on combining encoders and decoders of different independent pre-trained NMT models as modules. This is not directly possible since the intermediate representations of any two independent NMT models are different and cannot be combined without modification. To get around this, firstly, a dimension adapter is added if the encoder and decoder have different embedding dimensionalities, and secondly, extra encoder layers are added after the pre-trained encoder to align the intermediate representations. As a proof of concept, this thesis looks at many-to-Estonian translation and combines a massively multilingual encoder and a high-quality language-specific decoder. The results show significant improvements in both translation quality and speed for many-to-one translation over the baseline multilingual model. Furthermore, the ability to rapidly train a high-quality NMT system is successfully demonstrated with Estonian-Ukrainian and Ukrainian-Estonian translation, achieving competitive results compared to previous works. More broadly, the thesis demonstrates that sentence representations of two independent NMT models can be made compatible without changing the pre-trained components while keeping translation quality from deteriorating.
How Well do LLMs know Finno-Ugric Languages? A Systematic Assessment
(University of Tartu Library, 2025-03) Kuulmets, Hele-Andra; Purason, Taido; Fishel, Mark; Johansson, Richard; Stymne, Sara
We present a systematic evaluation of the multilingual capabilities of open large language models (LLMs), specifically focusing on five Finno-Ugric (FiU) languages. Our investigation covers multiple prompting strategies across several benchmarks and reveals that Llama-2 7B and Llama-2 13B perform weakly on most FiU languages. In contrast, Llama 3.1 models show impressive improvements, even for extremely low-resource languages such as Võro and Komi, indicating successful cross-lingual knowledge transfer inside the models. Finally, we show that stronger base models outperform weaker, language-adapted models, thus emphasizing the importance of the base model in successful language adaptation.
Modular Septilingual Neural Machine Translation
(Tartu Ülikool, 2021) Purason, Taido; Tättar, Andre, supervisor; Korotkova, Elizaveta, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Currently, the majority of state-of-the-art multilingual neural machine translation systems use a single universal model which fully shares parameters between all language pairs. The University of Tartu Neural Machine Translation system uses the universal architecture as well, and thus also suffers from the problems associated with it, such as limited capacity per language pair. Previous research has shown that a modularized approach with language-specific encoders and decoders successfully addresses many of the universal model’s shortcomings. This thesis applies the modularized architecture and improves the University of Tartu translation system. A dataset orders of magnitude larger than in previous work, containing 7 languages, is used to train the models. The modularized model achieves significantly higher BLEU scores than the University of Tartu model and the baseline universal model on all language pairs.