Masinõppe mudelite hindamine väheste märgenditega andmetel (Evaluating machine learning models on data with few labels)

Aun, Mart-Mihkel

dc.contributor.advisor	Laur, Sven, supervisor
dc.contributor.author	Aun, Mart-Mihkel
dc.contributor.other	Tartu Ülikool. Loodus- ja täppisteaduste valdkond	et
dc.contributor.other	Tartu Ülikool. Arvutiteaduse instituut	et
dc.date.accessioned	2024-10-07T11:13:33Z
dc.date.available	2024-10-07T11:13:33Z
dc.date.issued	2023
dc.description.abstract	Machine learning models for classification tasks are evaluated with quality measures such as accuracy, precision, and recall. These measures, or their estimates, are computed from the true class labels of data points and the labels the classifier assigns to those data points. Finding the true class labels requires manually reviewing the data points. Quality measures are often estimated on a finite sample, so the resulting estimates contain error. This thesis derives the sample size needed to keep the estimation error within a given bound at a given confidence level. Moreover, computing the accuracy, precision, or recall of a sample directly from the definition requires determining the labels of all data points in the sample. If another method exists alongside the method being evaluated, it can be used in the new evaluation. In that case, the manual labeling effort can be reduced by examining how much better the new method is than the old one instead of computing the quality measures of the new method directly. The thesis explores techniques that reduce the number of data points that must be labeled to evaluate the quality measures of the two classification methods.
dc.identifier.uri	https://hdl.handle.net/10062/105225
dc.language.iso	et
dc.publisher	Tartu Ülikool	et
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Estonia	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/ee/
dc.subject	machine learning
dc.subject	classification
dc.subject	probability theory
dc.subject	statistics
dc.subject	accuracy
dc.subject	precision
dc.subject	recall
dc.subject.other	bakalaureusetööd	et
dc.subject.other	informaatika	et
dc.subject.other	infotehnoloogia	et
dc.subject.other	informatics	en
dc.subject.other	infotechnology	en
dc.title	Masinõppe mudelite hindamine väheste märgenditega andmetel
dc.type	Thesis
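
The abstract mentions deriving a sample size that keeps the estimation error within a bound at a given confidence level. The record does not show the thesis's own derivation, so the following is only a minimal sketch of one standard approach, assuming Hoeffding's inequality and independently sampled data points. It applies to accuracy, which is a mean of 0/1 correctness indicators; precision and recall are ratios and would need separate treatment. The function name and parameters are hypothetical.

```python
import math


def hoeffding_sample_size(epsilon: float, confidence: float) -> int:
    """Smallest n satisfying 2 * exp(-2 * n * epsilon**2) <= 1 - confidence.

    Assumption (not taken from the thesis): accuracy is estimated as the
    mean of n independent 0/1 correctness indicators, so Hoeffding's
    inequality bounds the probability that the estimate is off by more
    than epsilon.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    delta = 1.0 - confidence  # allowed probability of exceeding the bound
    # Solve 2 * exp(-2 * n * epsilon^2) <= delta for n and round up.
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))
```

Under these assumptions, `hoeffding_sample_size(0.05, 0.95)` gives 738 labeled data points for an accuracy estimate within 0.05 of the true value with 95% confidence. The bound is distribution-free and therefore conservative; tighter bounds may need fewer labels.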

Files

Original bundle

Name: Aun_informaatika_2023.pdf
Size: 539.86 KB
Format: Adobe Portable Document Format

Name: lisad.zip
Size: 297.07 KB
Format: Compressed ZIP

Collections

MTAT bakalaureusetööd – Bachelor's theses