Multi-label Scandinavian Language Identification (SLIDE)
dc.contributor.author | Fedorova, Mariia | |
dc.contributor.author | Frydenberg, Jonas Sebulon | |
dc.contributor.author | Handford, Victoria | |
dc.contributor.author | Langø, Victoria Ovedie Chruickshank | |
dc.contributor.author | Willoch, Solveig Helene | |
dc.contributor.author | Midtgaard, Marthe Løken | |
dc.contributor.author | Scherrer, Yves | |
dc.contributor.author | Mæhlum, Petter | |
dc.contributor.author | Samuel, David | |
dc.contributor.editor | Tudor, Crina Madalina | |
dc.contributor.editor | Debess, Iben Nyholm | |
dc.contributor.editor | Bruton, Micaella | |
dc.contributor.editor | Scalvini, Barbara | |
dc.contributor.editor | Ilinykh, Nikolai | |
dc.contributor.editor | Holdt, Špela Arhar | |
dc.coverage.spatial | Tallinn, Estonia | |
dc.date.accessioned | 2025-02-14T10:48:13Z | |
dc.date.available | 2025-02-14T10:48:13Z | |
dc.date.issued | 2025-03 | |
dc.description.abstract | Identifying closely related languages at sentence level is difficult, in particular because it is often impossible to assign a sentence to a single language. In this paper, we focus on multi-label sentence-level Scandinavian language identification (LID) for Danish, Norwegian Bokmål, Norwegian Nynorsk, and Swedish. We present the Scandinavian Language Identification and Evaluation, SLIDE, a manually curated multi-label evaluation dataset and a suite of LID models with varying speed–accuracy tradeoffs. We demonstrate that the ability to identify multiple languages simultaneously is necessary for any accurate LID method, and present a novel approach to training such multi-label LID models. | |
dc.description.uri | https://aclanthology.org/2025.resourceful-1.0/ | |
dc.identifier.uri | https://hdl.handle.net/10062/107130 | |
dc.language.iso | en | |
dc.publisher | University of Tartu Library | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | Multi-label Scandinavian Language Identification (SLIDE) | |
dc.type | Article |
Files
Original bundle
- Name: 2025_resourceful_1_33.pdf
- Size: 245.49 KB
- Format: Adobe Portable Document Format