Predicting Cognitive Distortions from Reddit Posts by Using Supervised Machine Learning Methods

Grents, Linda Katariina

dc.contributor.advisor	Sirts, Kairit, supervisor
dc.contributor.author	Grents, Linda Katariina
dc.contributor.other	Tartu Ülikool. Loodus- ja täppisteaduste valdkond	et
dc.contributor.other	Tartu Ülikool. Arvutiteaduse instituut	et
dc.date.accessioned	2023-08-25T06:14:18Z
dc.date.available	2023-08-25T06:14:18Z
dc.date.issued	2022
dc.description.abstract	The importance of mental health has gained considerable attention in modern societies. People have become more open about discussing their thoughts in public, especially online, and Reddit is one platform they use for this. The aim of this thesis is to predict cognitive distortions from texts retrieved from the Anxiety subreddit. Cognitive distortions are important to detect because they can have a negative impact on people’s lives. Predictions in this work are made using supervised machine learning methods: logistic regression, support vector machine, and fastText (also with pre-trained word vectors). In addition, inter-annotator agreement is assessed with Cohen’s Kappa and Krippendorff’s Alpha. The results show that predicting cognitive distortions from text is a challenge in its own right, since the classifiers were not able to produce satisfactory results. This is consistent with related work, where predicting different types of distortions has not given very good results. It is suggested that predicting the existence of cognitive distortions in a text is more reasonable than predicting the different types of distortions, as this prediction shows better results. Predicting the existence of some distortion might also be of more help to people suffering from anxiety or depression. It might also be useful to predict only the most prevalent distortions, as some distortions are probably more prevalent than others. A major constraint of this work is the dataset, which is relatively small and noisy. If different types of cognitive distortions need to be predicted, a larger dataset of better quality is recommended. However, this remains a challenge in its own right in the natural language processing and clinical psychology research areas.	et
dc.identifier.uri	https://hdl.handle.net/10062/91751
dc.language.iso	eng	et
dc.publisher	Tartu Ülikool	et
dc.rights	openAccess	et
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Cognitive distortions	et
dc.subject	mental health	et
dc.subject	AI	et
dc.subject	NLP	et
dc.subject	Reddit	et
dc.subject.other	master's theses	et
dc.subject.other	informaatika	et
dc.subject.other	infotehnoloogia	et
dc.subject.other	informatics	et
dc.subject.other	infotechnology	et
dc.title	Predicting Cognitive Distortions from Reddit Posts by Using Supervised Machine Learning Methods	et
dc.type	Thesis	et

Files

Original bundle

Name: Grents_DataScience_2022.pdf
Size: 1.21 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

MTAT magistritööd – Master's theses