Proceedings of the 14th Workshop on Natural Language Processing for Computer Assisted Language Learning

Permanent URI for this collection: https://hdl.handle.net/10062/107164

Browse

Recently added

Now showing 1 - 9 of 9
  • Item
    14th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2025)
    (University of Tartu Library, 2025-03) Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
  • Item
    Investigating Linguistic Abilities of LLMs for Native Language Identification
    (University of Tartu Library, 2025-03) Uluslu, Ahmet Yavuz; Schneider, Gerold; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    Large language models (LLMs) have achieved state-of-the-art results in native language identification (NLI). However, these models often depend on superficial features, such as cultural references and self-disclosed information in the document, rather than capturing the underlying linguistic structures. In this work, we evaluate the linguistic abilities of open-source LLMs by assessing their performance in NLI through content-independent features, such as POS n-grams, function words, and punctuation marks, and compare their performance against traditional machine learning approaches. Our experiments reveal that while the LLMs' initial performance on structural features (55.2% accuracy) falls significantly below their performance on full text (96.5%), fine-tuning significantly improves their capabilities, enabling state-of-the-art results with strong cross-domain generalization.
  • Item
    PIRLS Category-specific Question Generation for Reading Comprehension
    (University of Tartu Library, 2025-03) Poon, Yin; Wang, Qiong; Lee, John S. Y.; Lam, Yu Yan; Kai Wah Chu, Samuel; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    According to the internationally recognized PIRLS (Progress in International Reading Literacy Study) assessment standards, reading comprehension questions should encompass all four comprehension processes: retrieval, inferencing, integration, and evaluation. This paper investigates whether Large Language Models can produce high-quality questions for each of these categories. Human assessment on a Chinese dataset shows that GPT-4o can generate usable and category-specific questions, with accuracy ranging from 74% to 90% depending on the category.
  • Item
    A prototype authoring tool for editing authentic texts using LLMs to increase support for contextualised L2 grammar practice
    (University of Tartu Library, 2025-03) Bodnar, Stephen; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    ICALL systems that offer grammar exercises with authentic texts have the potential to motivate learners, but finding suitable documents can be problematic because of the low number of target grammar forms they typically contain. Meanwhile, emerging research shows that Large Language Models (LLMs) can rewrite texts in controlled ways, which raises the question of whether they can be used to modify authentic L2 texts to increase their suitability for grammar learning. In this paper we present a tool we have developed to explore this idea. The authoring tool employs a lexical database to create prompts that instruct an LLM to insert specific target forms into the text. We share our plans to evaluate the quality of the automatically modified texts based on human judgments from native speakers.
  • Item
    Interpretable Machine Learning for Societal Language Identification: Modeling English and German Influences on Portuguese Heritage Language
    (University of Tartu Library, 2025-03) Akef, Soroosh; Meurers, Detmar; Mendes, Amália; Rebuschat, Patrick; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    This study leverages interpretable machine learning to investigate how different societal languages (SLs) influence the written production of Portuguese heritage language (HL) learners. Using a corpus of learner texts from adolescents in Germany and the UK, we systematically control for topic and proficiency level to isolate the cross-linguistic effects that each SL may exert on the HL. We automatically extract a wide range of linguistic complexity measures, including lexical, morphological, syntactic, discursive, and grammatical measures, and apply clustering-based undersampling to ensure balanced and representative data. Utilizing an explainable boosting machine, a class of inherently interpretable machine learning models, our approach identifies predictive patterns that discriminate between English- and German-influenced HL texts. The findings highlight distinct lexical and morphosyntactic patterns associated with each SL, with some patterns in the HL mirroring the structures of the SL. These results support the role of the SL in characterizing HL output. Beyond offering empirical evidence of cross-linguistic influence, this work demonstrates how interpretable machine learning can serve as an empirical test bed for language acquisition research.
  • Item
    UAM-CSI at MultiGEC-2025: Parameter-efficient LLM Fine-tuning for Multilingual Grammatical Error Correction
    (University of Tartu Library, 2025-03) Staruch, Ryszard; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    This paper describes the solution of the UAM-CSI team to the shared task on Multilingual Grammatical Error Correction (MultiGEC-2025), which is part of the workshop on Natural Language Processing for Computer-Assisted Language Learning (NLP4CALL). The shared task covers 12 languages: Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian. The aim of the task is to correct errors in the provided texts. Our system is a google/gemma-2-9b-it model with 2 QLoRA adapters, one for the minimal-edit track and another for the fluency-edit track. Our solution achieves the best performance on the test sets on the GLEU and F0.5 metrics for all languages, and the best performance on the Scribendi Score metric for all languages except Greek in the minimal-edit track.
  • Item
    Lattice @MultiGEC-2025: A Spitful Multilingual Language Error Correction System Using LLaMA
    (University of Tartu Library, 2025-03) Seminck, Olga; Dupont, Yoann; Dehouck, Mathieu; Wang, Qi; Durandard, Noé; Novikov, Margo; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    This paper reports on our submission to the NLP4CALL shared task on Multilingual Grammatical Error Correction (MultiGEC-2025) (Masciolini et al., 2025). We developed two approaches: fine-tuning a large language model, LLaMA 3.0 (8B), for each MultiGEC corpus, and a pipeline based on the encoder-based language model XLM-RoBERTa. During development, the first method significantly outperformed the second, except for languages that are poorly supported by LLaMA 3.0 and have limited MultiGEC training data. Therefore, our official results for the shared task were produced using the neural network system for Slovenian, while fine-tuned LLaMA models were used for the eleven other languages. In this paper, we first introduce the shared task and its data. Next, we present our two approaches, as well as a method to detect cycles in the LLaMA output. We also discuss a number of hurdles encountered while working on the shared task.
  • Item
    The MultiGEC-2025 Shared Task on Multilingual Grammatical Error Correction at NLP4CALL
    (University of Tartu Library, 2025-03) Masciolini, Arianna; Caines, Andrew; De Clercq, Orphée; Kruijsbergen, Joni; Kurfalı, Murathan; Muñoz Sánchez, Ricardo; Volodina, Elena; Östling, Robert; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
    This paper reports on MultiGEC-2025, the first shared task in text-level Multilingual Grammatical Error Correction. The shared task features twelve European languages (Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian) and is organized into two tracks, one for systems producing minimally corrected texts, thus preserving as much as possible of the original language use, and one dedicated to systems that prioritize fluency and idiomaticity. We introduce the task setup, data, evaluation metrics and baseline; present results obtained by the submitted systems and discuss key takeaways and ideas for future work.
  • Item
    Proceedings of the 14th Workshop on Natural Language Processing for Computer Assisted Language Learning
    (University of Tartu Library, 2025-03) Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena