Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025)
Permanent URI for this collection: https://hdl.handle.net/10062/107156
Browsing Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025) by Author "Mikkelsen, Agnes Aggergaard"
Item: The Danish Idiom Dataset: A collection of 1000 Danish idioms and fixed expressions
(University of Tartu Library, 2025-03)
Sørensen, Nathalie Hau; Nimb, Sanni; Mikkelsen, Agnes Aggergaard; Jensen, Jonas; Einarsson, Hafsteinn; Simonsen, Annika; Nielsen, Dan Saattrup

Interpreting idiomatic expressions is a challenging task for learners and large language models (LLMs) alike, as their meanings cannot be deduced directly from their individual components and often reflect nuances that are specific to the language in question. This makes idiom interpretation an ideal task for assessing the linguistic proficiency of LLMs. In order to test how LLMs handle this task, we introduce a new dataset comprising 1000 Danish idiomatic expressions sourced from the Danish Dictionary DDO (ordnet.dk/ddo). The dataset has been made publicly available at sprogteknologi.dk. For each expression, the dataset includes a correct dictionary definition, a literal false definition, a figurative false definition, and a random false definition. In the paper, we also present three experiments that demonstrate diverse applications of the dataset and aim to evaluate how well LLMs are able to identify the correct meanings of idiomatic expressions.
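
The four-definition structure described in the abstract lends itself to a multiple-choice evaluation. The following is a minimal sketch of one way such an item could be represented and scored, not the authors' experimental setup. The class name `IdiomEntry`, its field names, and the `choose` callback are all hypothetical; the published dataset may use a different schema, and the abstract does not specify how the three experiments are run.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class IdiomEntry:
    # Hypothetical field names; the published dataset may name them differently.
    idiom: str
    correct: str            # dictionary definition
    literal_false: str      # literal false definition
    figurative_false: str   # figurative false definition
    random_false: str       # random false definition


def to_multiple_choice(entry, rng):
    """Shuffle the four definitions; return (options, index of the correct one)."""
    options = [
        entry.correct,
        entry.literal_false,
        entry.figurative_false,
        entry.random_false,
    ]
    rng.shuffle(options)
    return options, options.index(entry.correct)


def accuracy(entries, choose, seed=0):
    """Fraction of entries for which choose(idiom, options) picks the correct index.

    `choose` stands in for a model call (an assumption of this sketch): it
    receives the idiom and the shuffled options and returns an option index.
    """
    if not entries:
        raise ValueError("entries must not be empty")
    rng = random.Random(seed)  # fixed seed keeps the option order reproducible
    hits = 0
    for entry in entries:
        options, answer = to_multiple_choice(entry, rng)
        if choose(entry.idiom, options) == answer:
            hits += 1
    return hits / len(entries)
```

Shuffling the options per item avoids rewarding a model that always prefers a fixed position, and keeping the distractor types as separate fields makes it possible to report which kind of false definition a model falls for.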