Lattice @MultiGEC-2025: A Spitful Multilingual Language Error Correction System Using LLaMA

dc.contributor.author: Seminck, Olga
dc.contributor.author: Dupont, Yoann
dc.contributor.author: Dehouck, Mathieu
dc.contributor.author: Wang, Qi
dc.contributor.author: Durandard, Noé
dc.contributor.author: Novikov, Margo
dc.contributor.editor: Muñoz Sánchez, Ricardo
dc.contributor.editor: Alfter, David
dc.contributor.editor: Volodina, Elena
dc.contributor.editor: Kallas, Jelena
dc.coverage.spatial: Tallinn, Estonia
dc.date.accessioned: 2025-02-17T10:35:06Z
dc.date.available: 2025-02-17T10:35:06Z
dc.date.issued: 2025-03
dc.description.abstract: This paper reports on our submission to the NLP4CALL shared task on Multilingual Grammatical Error Correction (MultiGEC-2025) (Masciolini et al., 2025). We developed two approaches: fine-tuning a large language model, LLaMA 3.0 (8B), for each MultiGEC corpus, and a pipeline based on the encoder-based language model XLM-RoBERTa. During development, the first method significantly outperformed the second, except for languages that are poorly supported by LLaMA 3.0 and have limited MultiGEC training data. Therefore, our official results for the shared task were produced using the neural network system for Slovenian, while fine-tuned LLaMA models were used for the eleven other languages. In this paper, we first introduce the shared task and its data. Next, we present our two approaches, as well as a method to detect cycles in the LLaMA output. We also discuss a number of hurdles encountered while working on the shared task.
dc.identifier.uri: https://hdl.handle.net/10062/107167
dc.language.iso: en
dc.publisher: University of Tartu Library
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Lattice @MultiGEC-2025: A Spitful Multilingual Language Error Correction System Using LLaMA
dc.type: Article

Files

Original bundle

Name: 2025_nlp4call_1_2.pdf
Size: 266.34 KB
Format: Adobe Portable Document Format