Grammatiliste vigade parandamine sageduspõhise sünteetilise andmestikuga (Grammatical Error Correction with a Frequency-Based Synthetic Dataset)

dc.contributor.advisor: Luhtaru, Agnes, supervisor
dc.contributor.advisor: Fišel, Mark, supervisor
dc.contributor.author: Univer, Jakob
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-08-23T06:28:16Z
dc.date.available: 2023-08-23T06:28:16Z
dc.date.issued: 2022
dc.description.abstract: This thesis introduces a grammatical error correction method that uses a neural network trained only on synthetic data. The method is useful for languages, such as Estonian, that lack large corpora for training a grammatical error correction model. From a smaller human-corrected corpus, we estimated the probabilities of word deletion, insertion, substitution, and word-order errors. Using these probabilities, we generated a larger synthetic corpus and trained a neural network for grammatical error correction on it. We found that the error probabilities do not need to be very precise, and that the trained network can correct spelling mistakes as well as grammatical ones. [et]
dc.identifier.uri: https://hdl.handle.net/10062/91682
dc.language.iso: est [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Grammatical Error Correction [et]
dc.subject: neural network [et]
dc.subject: synthetic data [et]
dc.subject.other: bakalaureusetööd (bachelor's theses) [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Grammatiliste vigade parandamine sageduspõhise sünteetilise andmestikuga [et]
dc.type: Thesis [et]
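
The pipeline summarized in the abstract (estimate error probabilities from a small human-corrected corpus, then use them to corrupt clean text into a larger synthetic corpus) might look roughly like the sketch below. This record does not include the thesis code, so everything here is an assumption made for illustration: the function names `estimate_probs` and `corrupt`, the use of Python's `difflib` for alignment, the toy English sentences, and the `swap` value are all hypothetical. Note that `difflib` alignment does not detect word-order errors, so that probability is supplied by hand in this sketch.

```python
import random
from collections import Counter
from difflib import SequenceMatcher


def estimate_probs(pairs):
    """Estimate per-token error rates from (corrected, erroneous) token lists."""
    counts = Counter()
    total = 0
    for corrected, erroneous in pairs:
        total += len(corrected)
        matcher = SequenceMatcher(None, corrected, erroneous, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "delete":      # token present only in the corrected text
                counts["delete"] += i2 - i1
            elif tag == "insert":    # extra token in the erroneous text
                counts["insert"] += j2 - j1
            elif tag == "replace":   # token differs between the two texts
                counts["substitute"] += max(i2 - i1, j2 - j1)
    kinds = ("delete", "insert", "substitute")
    if total == 0:
        return {kind: 0.0 for kind in kinds}
    return {kind: counts[kind] / total for kind in kinds}


def corrupt(tokens, vocab, probs, rng=None):
    """Return a noisy copy of `tokens`, applying at most one error per token."""
    rng = rng or random.Random()
    # Cumulative thresholds so a single random draw selects the error type.
    p_del = probs.get("delete", 0.0)
    p_ins = p_del + probs.get("insert", 0.0)
    p_sub = p_ins + probs.get("substitute", 0.0)
    p_swap = p_sub + probs.get("swap", 0.0)
    out = []
    i = 0
    while i < len(tokens):
        r = rng.random()
        if r < p_del:
            pass                                      # drop the token
        elif r < p_ins:
            out.extend([rng.choice(vocab), tokens[i]])  # add a spurious token
        elif r < p_sub:
            out.append(rng.choice(vocab))             # replace the token
        elif r < p_swap and i + 1 < len(tokens):
            out.extend([tokens[i + 1], tokens[i]])    # swap with the next token
            i += 1
        else:
            out.append(tokens[i])                     # keep the token unchanged
        i += 1
    return out


# Toy data, invented for this sketch only.
pairs = [("the cat sits on the mat".split(), "the cat sit on mat".split())]
probs = estimate_probs(pairs)
probs["swap"] = 0.05  # hand-picked; not estimated by the alignment above
vocab = sorted({tok for corrected, _ in pairs for tok in corrected})
noisy = corrupt("the dog sits on the mat".split(), vocab, probs, random.Random(0))
print(probs, noisy)
```

Each (noisy, clean) pair produced this way could then serve as a training example for a correction model. Since the abstract reports that the probabilities do not have to be very precise, rough estimates like these may be adequate as a starting point.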

Files

Original bundle

Name: Univer_Informaatika_2022.pdf
Size: 188.62 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission