From Words to Action: A National Initiative to Overcome Data Scarcity for the Slovene LLM
dc.contributor.author | Holdt, Špela Arhar | |
dc.contributor.author | Antloga, Špela | |
dc.contributor.author | Munda, Tina | |
dc.contributor.author | Pori, Eva | |
dc.contributor.author | Krek, Simon | |
dc.contributor.editor | Tudor, Crina Madalina | |
dc.contributor.editor | Debess, Iben Nyholm | |
dc.contributor.editor | Bruton, Micaella | |
dc.contributor.editor | Scalvini, Barbara | |
dc.contributor.editor | Ilinykh, Nikolai | |
dc.contributor.editor | Holdt, Špela Arhar | |
dc.coverage.spatial | Tallinn, Estonia | |
dc.date.accessioned | 2025-02-14T10:36:30Z | |
dc.date.available | 2025-02-14T10:36:30Z | |
dc.date.issued | 2025-03 | |
dc.description.abstract | Large Language Models (LLMs) have demonstrated significant potential in natural language processing, but they depend on vast, diverse datasets, creating challenges for languages with limited resources. The paper presents a national initiative that addresses these challenges for Slovene. We outline strategies for large-scale text collection, including the creation of an online platform to engage the broader public in contributing texts and a communication campaign promoting openly accessible and transparently developed LLMs. | |
dc.description.uri | https://aclanthology.org/2025.resourceful-1.0/ | |
dc.identifier.uri | https://hdl.handle.net/10062/107125 | |
dc.language.iso | en | |
dc.publisher | University of Tartu Library | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | From Words to Action: A National Initiative to Overcome Data Scarcity for the Slovene LLM | |
dc.type | Article |
Files
Original bundle
- Name:
- 2025_resourceful_1_27.pdf
- Size:
- 176.27 KB
- Format:
- Adobe Portable Document Format