Predicting Depression Symptoms Based on Reddit Posts

Koljal, Kaire

Files

Koljal_ComputerScience_2022.pdf (665.53 KB)

Date

2022

Authors

Koljal, Kaire

Publisher

Tartu Ülikool

Abstract

Using social media posts to predict mental health problems has become a popular topic in Natural Language Processing (NLP). Machine learning has been used to detect either a diagnosis or individual symptoms associated with depression. Because the clinical picture of depression differs from person to person, it is better to detect symptoms rather than a diagnosis from social media posts. In this work, depression symptoms are predicted from posts in the Reddit community r/depression using NLP methods and multi-label classification. The work focuses on evaluating the quality of the annotations and analysing whether such data can be used to train a predictive model. Each post is annotated by three annotators, and the labels are aggregated in three ways to create three datasets that are used to train Transformer models. The results show that, on a small dataset with lower annotation agreement, a majority vote over the annotations gives the most reliable dataset and results. The RoBERTa model shows the best learning and generalization ability in this work.
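
The majority-vote aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the function name `majority_vote`, the `min_votes` parameter, and the symptom labels are hypothetical, and the sketch only assumes that each of the three annotators assigns a set of symptom labels to a post and that a label is kept when at least two annotators agree on it.

```python
from collections import Counter


def majority_vote(annotations, min_votes=2):
    """Return the labels chosen by at least `min_votes` annotators.

    `annotations` is a list with one collection of labels per annotator.
    The function name and threshold are illustrative assumptions.
    """
    # Count each label once per annotator, even if it was listed twice.
    counts = Counter(label for labels in annotations for label in set(labels))
    return sorted(label for label, votes in counts.items() if votes >= min_votes)


# Hypothetical annotations for a single post; the symptom names are made up.
post_annotations = [
    {"sleep_problems", "fatigue"},
    {"fatigue"},
    {"fatigue", "low_mood", "sleep_problems"},
]

aggregated = majority_vote(post_annotations)
print(aggregated)  # ['fatigue', 'sleep_problems']
```

Raising `min_votes` to 3 would keep only labels on which all annotators agree, and lowering it to 1 would keep every label any annotator assigned. These are shown only as contrasting thresholds, not as the other two aggregation schemes used in the thesis.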

Keywords

Multi-label classification, Transformers, symptom prediction, depression, social media

URI

https://hdl.handle.net/10062/91714

Collections

MTAT magistritööd – Master's theses