"I Need More Context and an English Translation": Analysing How LLMs Identify Personal Information in Komi, Polish, and English
dc.contributor.author | Ilinykh, Nikolai | |
dc.contributor.author | Szawerna, Maria Irena | |
dc.contributor.editor | Tudor, Crina Madalina | |
dc.contributor.editor | Debess, Iben Nyholm | |
dc.contributor.editor | Bruton, Micaella | |
dc.contributor.editor | Scalvini, Barbara | |
dc.contributor.editor | Ilinykh, Nikolai | |
dc.contributor.editor | Holdt, Špela Arhar | |
dc.coverage.spatial | Tallinn, Estonia | |
dc.date.accessioned | 2025-02-14T10:45:50Z | |
dc.date.available | 2025-02-14T10:45:50Z | |
dc.date.issued | 2025-03 | |
dc.description.abstract | Automatic identification of personal information (PI) is particularly difficult for languages with limited linguistic resources. Recently, large language models (LLMs) have been applied to various tasks involving low-resourced languages, but their capability to process PI in such contexts remains under-explored. In this paper we provide a qualitative analysis of the outputs from three LLMs prompted to identify PI in texts written in Komi (Permyak and Zyrian), Polish, and English. Our analysis highlights challenges in using pre-trained LLMs for PI identification in both low- and medium-resourced languages. It also motivates the need to develop LLMs that understand the differences in how PI is expressed across languages with varying levels of availability of linguistic resources. | |
dc.description.uri | https://aclanthology.org/2025.resourceful-1.0/ | |
dc.identifier.uri | https://hdl.handle.net/10062/107129 | |
dc.language.iso | en | |
dc.publisher | University of Tartu Library | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.title | "I Need More Context and an English Translation": Analysing How LLMs Identify Personal Information in Komi, Polish, and English | |
dc.type | Article |
Files
Original bundle
- Name: 2025_resourceful_1_32.pdf
- Size: 380.43 KB
- Format: Adobe Portable Document Format