Adaptive critic design in missile stabilisation and tracking problem

(For USM Staff/Student Only)

Adaptive critic design in missile stabilisation and tracking problem / Lim Jun Leong

Abstract

Reinforcement learning (RL) is a branch of machine learning in which the machine learns on its own through trial and error. Among reinforcement learning methods, actor-critic is a well-known method with a simple structure and a promising training outcome. Actor-critic is a temporal difference method in which the agent consists of two parts, the actor and the critic. The actor generates an optimal policy, while the critic generates an approximate value to evaluate the action taken. The actor-critic method is popular because it is simple and requires minimal knowledge of the plant model. This research investigates the compatibility of the actor-critic method with a control problem. The actor-critic was implemented on a missile model and applied to a stabilisation problem and a tracking problem. Independent neural networks were designed and trained to approximate the actor, the critic, and the plant model. The neural networks are updated through backpropagation, with gradient descent as the underlying mathematical approach. The system model neural network is first trained offline. The actor-critic is then trained online with the pre-trained system model, while the system model is also updated online. The architecture successfully approximated the missile dynamics using the neural network. The actor-critic method also successfully solved the stabilisation problem and trimmed the missile model. However, further work is required to solve the tracking problem with this updating method. Improvements could be made by redesigning the neural network, the reward function, or the activation function of the neurons.
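The training scheme outlined in the abstract (offline model training, then online actor and critic updates by gradient descent while the model keeps adapting) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the project: a scalar unstable linear plant stands in for the missile model, single-parameter approximators stand in for the three neural networks, the quadratic cost replaces the unspecified reward function, and each online step restarts from a random state to keep the plant excited. The names `plant`, `train`, `A_TRUE`, `B_TRUE`, `GAMMA` and `RHO` are hypothetical.

```python
import numpy as np

# Hypothetical toy plant standing in for the missile model: x' = A*x + B*u.
A_TRUE, B_TRUE = 1.1, 0.5   # open loop is unstable because |A_TRUE| > 1
GAMMA, RHO = 0.8, 0.1       # assumed discount factor and control-effort weight


def plant(x, u):
    """True dynamics; the learner only sees sampled transitions."""
    return A_TRUE * x + B_TRUE * u


def train(offline_steps=2000, online_steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    a_hat = b_hat = 0.0     # model:  x' ~ a_hat*x + b_hat*u
    w = 0.0                 # critic: V(x) ~ w*x**2 (estimated cost-to-go)
    k = 0.0                 # actor:  u = -k*x

    # 1) Offline model training on randomly excited transitions.
    for _ in range(offline_steps):
        x, u = rng.uniform(-1.0, 1.0, size=2)
        err = plant(x, u) - (a_hat * x + b_hat * u)
        a_hat += 0.1 * err * x          # gradient step on squared prediction error
        b_hat += 0.1 * err * u

    # 2) Online phase: critic, actor and model are all updated at every step.
    for _ in range(online_steps):
        x = rng.uniform(-1.0, 1.0)      # random restart (assumption, see lead-in)
        u = -k * x
        x_next = plant(x, u)
        cost = x**2 + RHO * u**2

        # Critic: semi-gradient temporal-difference (TD(0)) update.
        td_error = cost + GAMMA * w * x_next**2 - w * x**2
        w += 0.5 * td_error * x**2

        # Actor: descend cost + GAMMA*V(predicted next state), with the
        # gradient propagated through the learned model, not the true plant.
        x_pred = a_hat * x + b_hat * u
        dJ_du = 2.0 * RHO * u + GAMMA * 2.0 * w * x_pred * b_hat
        k -= 0.05 * dJ_du * (-x)        # chain rule: du/dk = -x

        # Model: the same gradient step as offline, now on online data.
        err = x_next - x_pred
        a_hat += 0.1 * err * x
        b_hat += 0.1 * err * u

    return a_hat, b_hat, w, k
```

Calling `a_hat, b_hat, w, k = train()` returns the learned model parameters, critic weight and actor gain. With these settings the closed-loop pole `A_TRUE - B_TRUE * k` should end up inside the unit circle, which is the toy analogue of the stabilisation task. A tracking task would need a reference signal in the cost, which this sketch does not attempt.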

Contributor(s):

Lim Jun Leong - Author

Primary Item Type:

Final Year Project

Identifiers:

Accession Number : 875007945

Language:

English

Subject Keywords:

Reinforcement learning (RL); Actor-critic; missile

First presented to the public:

8/1/2020

Original Publication Date:

10/6/2020

Previously Published By:

Universiti Sains Malaysia

Place Of Publication:

School of Aerospace Engineering

Extents:

Number of Pages - 70

Date Deposited

2020-10-06 14:44:52.72

Submitter:

Mohd Jasnizam Mohd Salleh

All Versions

Thumbnail	Name	Version	Created Date
	Adaptive critic design in missile stabilisation and tracking problem	1	2020-10-06 14:44:52.72

Engineering Library USM @ 2017
