Fine-Tuning Cross-Lingual LLMs for POS Tagging in Code-Switched Contexts

Date

2025-03

Publisher

University of Tartu Library

Abstract

Code-switching (CS) involves speakers switching between two (or potentially more) languages within a conversation and is a common phenomenon in bilingual communities. The majority of NLP research has been devoted to monolingual language modelling; consequently, most models perform poorly on code-switched data. This paper investigates the effectiveness of cross-lingual Large Language Models on the task of POS (Part-of-Speech) tagging in code-switched contexts after they have been fine-tuned. The models are trained on code-switched combinations of Indian languages and English. The paper also investigates whether fine-tuned models can generalise and POS tag code-switched combinations that were not part of the fine-tuning dataset. Additionally, it presents a new metric, the S-index (Switching-Index), for measuring the level of code-switching within an utterance.
