Image-Text Relation Prediction for Multilingual Tweets

dc.contributor.author: Rikters, Matīss
dc.contributor.author: Marrese-Taylor, Edison
dc.contributor.editor: Einarsson, Hafsteinn
dc.contributor.editor: Simonsen, Annika
dc.contributor.editor: Nielsen, Dan Saattrup
dc.coverage.spatial: Tallinn, Estonia
dc.date.accessioned: 2025-02-17T09:53:15Z
dc.date.available: 2025-02-17T09:53:15Z
dc.date.issued: 2025-03
dc.description.abstract: Various social networks have allowed media uploads for over a decade. Still, it has not always been clear how the uploaded media relate to the posted text, or whether they relate to it at all. In this work, we explore how multilingual vision-language models tackle the task of image-text relation prediction in different languages, and construct a dedicated balanced benchmark data set from Twitter posts in Latvian along with their manual translations into English. We compare our results to previous work and show that more recently released vision-language model checkpoints are becoming increasingly capable at this task, but there is still much room for improvement.
dc.identifier.isbn: 978-9908-53-116-8
dc.identifier.uri: https://hdl.handle.net/10062/107161
dc.language.iso: en
dc.publisher: University of Tartu Library
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Image-Text Relation Prediction for Multilingual Tweets
dc.type: Article

Files

Original bundle

Name: 2025_nbreal_1_4.pdf
Size: 126.08 KB
Format: Adobe Portable Document Format