Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Persistent URI for this collection: https://hdl.handle.net/10062/107190
Browsing by author "Ármannsson, Bjarki"
Item: An Icelandic Linguistic Benchmark for Large Language Models
(University of Tartu Library, 2025-03)
Ármannsson, Bjarki; Ingimundarson, Finnur Ágúst; Sigurðsson, Einar Freyr; Johansson, Richard; Stymne, Sara
This paper introduces a linguistic benchmark for Icelandic-language LLMs, the first of its kind manually constructed by native speakers. We report on the scores obtained by current state-of-the-art models, which indicate room for improvement, and discuss the theoretical problems involved in creating such a benchmark and scoring a model's performance.

Item: Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography
(University of Tartu Library, 2025-03)
Ármannsson, Bjarki; Hafsteinsson, Hinrik; Sigtryggsson, Jóhannes B.; Jasonarson, Atli; Sigurðsson, Einar Freyr; Steingrímsson, Steinþór; Johansson, Richard; Stymne, Sara
We present the Icelandic Standardization Benchmark Set: Spelling and Punctuation (IceStaBS:SP), a dataset designed to provide standardized text examples for Icelandic orthography. The dataset includes non-standard orthography examples and their standardized counterparts, along with detailed explanations based on official Icelandic spelling rules. IceStaBS:SP aims to support the development and evaluation of automatic spell and grammar checkers, particularly in educational settings. We evaluate various spell and grammar checkers using IceStaBS:SP, demonstrating its utility as a benchmarking tool and highlighting areas for future improvement.