
Autonomous maze navigation using dual-layer Q-learning based hierarchical reinforcement learning and relative states

Autonomous maze navigation using dual-layer Q-learning based hierarchical reinforcement learning and relative states / Lee Meng Kuan
Autonomous maze navigation is becoming increasingly important in the path-planning field. Real-world applications of autonomous navigation include mining and rescue missions, where human access is restricted. This project applies reinforcement learning to solve autonomous maze navigation with a 2D ground vehicle in indoor environments or GPS-denied areas. Q-learning is used as the path-planning algorithm. A preliminary study was conducted using Q-learning with absolute states, in which the geographical location of the ground robot is known, and the performance of absolute states with and without the heading being considered was studied. Q-learning with relative states was then chosen to replace absolute states for indoor navigation, where the robot's geographical location may not be fully obtainable. A striking feature of relative states is that the number of states can be kept independent of the size of the problem. This thesis found that relative states achieve a better convergence rate and require less computational time than typical absolute states.

An experiment was conducted to validate the relative states in a real application. The experiment further demonstrated the transferability of knowledge from simulation, used as the initial policy, to the real application. However, relative states were found to suffer when the agent perceives similar states at different maze locations, because the environment appears identical to the agent. Lastly, relative states were further improved with the newly proposed Dual-Layer Q-learning Based Hierarchical Reinforcement Learning (DLQHRL) framework. DLQHRL combines relative states as the fundamental states with absolute states derived from the geographical position provided by GPS to solve the convergence problem of relative states. The DLQHRL framework gives the agent the fast convergence of relative states and the stability of absolute states. This project has shown that relative states outperform absolute states in terms of convergence rate and computational time in GPS-denied areas. Relative states were also shown to be reliable in a real application through experimental work, and the novel DLQHRL helped improve the convergence rate of relative states compared with flat reinforcement learning.
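To make the relative-state idea concrete, the following is a minimal tabular Q-learning sketch in Python, not the thesis code: all names (relative_state, choose_action, q_update) and hyperparameter values are hypothetical. The state is only the wall pattern the robot senses to its front, left, and right relative to its heading, so the Q-table holds at most eight states regardless of how large the maze is.

    import random
    from collections import defaultdict

    ACTIONS = ["forward", "turn_left", "turn_right"]
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration (assumed values)

    # One row per relative state; at most 2^3 = 8 states, independent of maze size.
    Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

    def relative_state(wall_front, wall_left, wall_right):
        """Encode only what the robot senses locally; no absolute (x, y) coordinates."""
        return (wall_front, wall_left, wall_right)

    def choose_action(state):
        """Epsilon-greedy action selection over the relative-state Q-table."""
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(Q[state], key=Q[state].get)

    def q_update(state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(Q[next_state].values())
        Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])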
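The dual-layer combination can be read in a similar spirit: keep the relative wall pattern as the fundamental state and, where a GPS fix is available, tag it with a coarse absolute cell so that identical-looking corridors at different maze locations no longer share one table entry. The sketch below is only an illustrative reading of the DLQHRL idea under that assumption, not the thesis's exact formulation; coarse_cell and its grid resolution are invented for illustration.

    def coarse_cell(x, y, cell_size=5.0):
        """Quantise an absolute (GPS) position into a coarse grid cell (hypothetical resolution)."""
        return (int(x // cell_size), int(y // cell_size))

    def dual_layer_state(rel_state, gps_fix=None):
        """Relative state alone in GPS-denied areas; (coarse cell, relative state) when GPS is available."""
        if gps_fix is None:
            return rel_state
        return (coarse_cell(*gps_fix), rel_state)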
Contributor(s):
Lee Meng Kuan - Author
Primary Item Type:
Final Year Project
Identifiers:
Accession Number: 875007945
Language:
English
Subject Keywords:
navigation; rescue; 2D
First presented to the public:
10/1/2020
Original Publication Date:
10/5/2020
Previously Published By:
Universiti Sains Malaysia
Place Of Publication:
School of Aerospace Engineering
Citation:
Extents:
Number of Pages - 126
License Grantor / Date Granted:
Date Deposited:
2020-10-05 15:48:30.75
Submitter:
Mohd Jasnizam Mohd Salleh

All Versions

Name: Autonomous maze navigation using dual-layer Q-learning based hierarchical reinforcement learning and relative states
Version: 1
Created Date: 2020-10-05 15:48:30.75