Exploring Smartphone-Based Reinforcement Learning Control for Educational Robotics: Implementation on OpenBot

Gras, Lilou

dc.contributor.advisor	Muhammad, Naveed
dc.contributor.author	Gras, Lilou
dc.contributor.other	Tartu Ülikool. Loodus- ja täppisteaduste valdkond	et
dc.contributor.other	Tartu Ülikool. Tehnoloogiainstituut	et
dc.date.accessioned	2025-03-11T17:35:51Z
dc.date.available	2025-03-11T17:35:51Z
dc.date.issued	2024
dc.description	This work explores the possibility of running Reinforcement Learning (RL) algorithms entirely on a smartphone to control the educational robotic platform OpenBot. The study aims to determine whether RL can be carried out on Android smartphones without simulated environments and whether it would be accessible to students and enthusiasts as a practical RL project. Deep Q-Learning (DQL) and Policy Gradient (PG) algorithms were first tested on the standard RL scenarios Cartpole and Pong. This gave an overview of both algorithms and of what to expect from successful RL training. The policy gradient algorithm was then implemented entirely on the smartphone controlling OpenBot, to drive across a track for 15 seconds. In general, after approximately 400 training episodes using policy gradient, the agent is able to navigate the track for the targeted 15 seconds in half of its attempts. Despite the study's encouraging results, some technical challenges remain open, such as exploding gradients, the randomness of weight initialization, and engineering problems such as high battery consumption.
dc.description.abstract	This research explores the feasibility of implementing Reinforcement Learning (RL) algorithms entirely on a smartphone to control an educational robotic platform, OpenBot. The study aims to determine whether RL can be executed on Android smartphones without simulated environments and whether it would be accessible to students and enthusiasts as a practical RL project. Initially, Deep Q-Learning (DQL) and Policy Gradient (PG) algorithms were tested on two standard RL scenarios, Cartpole and Pong. This provided insight into both algorithms and into what to expect from successful RL training. The policy gradient algorithm was then implemented entirely on the smartphone controlling OpenBot, with the goal of driving across a track for 15 seconds. In general, after approximately 400 episodes of policy gradient training, the agent successfully navigated the track for the target 15 seconds in half of its attempts. Despite these encouraging results, some technical challenges remain open, such as exploding gradients, the randomness of weight initialization, and engineering issues such as high battery consumption.
dc.identifier.uri	https://hdl.handle.net/10062/107717
dc.language.iso	en
dc.publisher	Tartu Ülikool	et
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Estonia	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/ee/
dc.subject	reinforcement learning
dc.subject	control
dc.subject	robotics
dc.subject	openbot
dc.subject	policy gradient
dc.subject.other	master's theses	en
dc.title	Exploring Smartphone-Based Reinforcement Learning Control for Educational Robotics: Implementation on OpenBot
dc.title.alternative	Nutitelefonil põhineva tugevdusõppe juhtimise uurimine haridusrobotikas: rakendamine OpenBotil
dc.type	Thesis	en

Files

Original bundle

Name: Gras_MSc2024.pdf
Size: 3.86 MB
Format: Adobe Portable Document Format

Collections

Robotics and Computer Engineering - Master's theses