Image-Text Relation Prediction for Multilingual Tweets

Rikters, Matīss; Marrese-Taylor, Edison

Files

2025_nbreal_1_4.pdf (126.08 KB)

Date

2025-03

Authors

Rikters, Matīss

Marrese-Taylor, Edison

Publisher

University of Tartu Library

Abstract

Various social networks have allowed media uploads for over a decade. Still, it has not always been clear how the uploaded media relate to the posted text, or whether they relate at all. In this work, we explore how multilingual vision-language models tackle the task of image-text relation prediction in different languages, and we construct a dedicated balanced benchmark data set from Twitter posts in Latvian along with their manual translations into English. We compare our results to previous work and show that more recently released vision-language model checkpoints are becoming increasingly capable at this task, though there is still much room for improvement.

URI

https://hdl.handle.net/10062/107161

Collections

Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025)
