Large Language Models as Annotators of Named Entities in Climate Change and Biodiversity: A Preliminary Study

dc.contributor.author: Volkanovska, Elena
dc.contributor.editor: Basile, Valerio
dc.contributor.editor: Bosco, Cristina
dc.contributor.editor: Grasso, Francesca
dc.contributor.editor: Ibrahim, Muhammad Okky
dc.contributor.editor: Skeppstedt, Maria
dc.contributor.editor: Stede, Manfred
dc.coverage.spatial: Tallinn, Estonia
dc.date.accessioned: 2025-02-17T11:39:40Z
dc.date.available: 2025-02-17T11:39:40Z
dc.date.issued: 2025-03
dc.description.abstract: This paper examines whether few-shot techniques for Named Entity Recognition (NER) utilising existing large language models (LLMs) as their backbone can be used to reliably annotate named entities (NEs) in scientific texts on climate change and biodiversity. A series of experiments aim to assess whether LLMs can be integrated into an end-to-end pipeline that could generate token- or sentence-level NE annotations; the former being an ideal-case scenario that allows for seamless integration of existing with new token-level features in a single annotation pipeline. Experiments are run on four LLMs, two NER datasets, two input and output data formats, and ten and nine prompt versions per dataset. The results show that few-shot methods are far from being a silver bullet for NER in highly specialised domains, although improvement in LLM performance is observed for some prompt designs and some NE classes. Few-shot methods would find better use in a human-in-the-loop scenario, where an LLM's output is verified by a domain expert.
dc.identifier.isbn: 978-9908-53-114-4
dc.identifier.uri: https://hdl.handle.net/10062/107179
dc.language.iso: en
dc.publisher: University of Tartu Library
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Large Language Models as Annotators of Named Entities in Climate Change and Biodiversity: A Preliminary Study
dc.type: Article

Files

Original bundle

Name: 2025_nlp4ecology_1_7.pdf
Size: 262.18 KB
Format: Adobe Portable Document Format