Fine-Tuning Cross-Lingual LLMs for POS Tagging in Code-Switched Contexts
dc.contributor.author | Absar, Shayaan | |
dc.contributor.editor | Tudor, Crina Madalina | |
dc.contributor.editor | Debess, Iben Nyholm | |
dc.contributor.editor | Bruton, Micaella | |
dc.contributor.editor | Scalvini, Barbara | |
dc.contributor.editor | Ilinykh, Nikolai | |
dc.contributor.editor | Holdt, Špela Arhar | |
dc.coverage.spatial | Tallinn, Estonia | |
dc.date.accessioned | 2025-02-14T09:41:38Z | |
dc.date.available | 2025-02-14T09:41:38Z | |
dc.date.issued | 2025-03 | |
dc.description.abstract | Code-switching (CS) involves speakers switching between two (or potentially more) languages during conversation and is a common phenomenon in bilingual communities. The majority of NLP research has been devoted to monolingual language modelling. Consequently, most models perform poorly on code-switched data. This paper investigates the effectiveness of Cross-Lingual Large Language Models on the task of POS (Part-of-Speech) tagging in code-switched contexts, once they have undergone a fine-tuning process. The models are trained on code-switched combinations of Indian languages and English. This paper also seeks to investigate whether fine-tuned models are able to generalise and POS tag code-switched combinations that were not part of the fine-tuning dataset. Additionally, this paper presents a new metric, the S-index (Switching-Index), for measuring the level of code-switching within an utterance. | |
dc.identifier.uri | https://aclanthology.org/2025.resourceful-1.0/ | |
dc.identifier.uri | https://hdl.handle.net/10062/107110 | |
dc.language.iso | en | |
dc.publisher | University of Tartu Library | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | Fine-Tuning Cross-Lingual LLMs for POS Tagging in Code-Switched Contexts | |
dc.type | Article |