(For USM Staff/Student Only)

Integration of unsupervised clustering algorithms and supervised classifiers for pattern recognition / Leong Shi Xiang

Pattern recognition problems arise in many forms in the real world and are critical to most human decision-making tasks. In a pattern recognition system, achieving high classification accuracy is crucial. There are two general paradigms for pattern classification: supervised and unsupervised learning. The problem with unsupervised learning (clustering) is that it has no "teacher" during the classification process and must learn independently, which may lead to poor classification. Supervised learning, by contrast, requires a teacher in the form of prior data (a large set of labelled training data) to achieve high accuracy, and in practice the cost of obtaining sufficient labelled training data is high: labelling is time-consuming and usually done manually.

To address these problems, the integration of unsupervised clustering algorithms with a supervised classifier is proposed. The objective of this research is to study the capability of the proposed integration system in pattern classification. To achieve this objective, the research is divided into two phases. Phase 1 evaluates the performance of the clustering algorithms (K-Means and FCM). Phase 2 studies the performance of the proposed integration system, in which the data clustered in Phase 1 are used as training data for a Naïve Bayes classifier.

With the proposed integration system, the limitation of unsupervised clustering can be overcome, and for the supervised classifier the labelling time is reduced and more training examples can be labelled. As a result, pattern classification accuracy also increases. For example, compared with using the unsupervised clustering algorithm alone, the proposed integration system increased classification accuracy on the Fisher’s Iris, Wine and Bacteria18Class data sets from 88.67% to 96.00%, from 78.33% to 83.45% and from 93.33% to 94.67%, respectively. These results show that the proposed integration system can be applied to improve classification performance. However, further study is needed on feature extraction and on the clustering algorithms, as pattern classification performance still depends on the accuracy of the input data.
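The two-phase idea in the abstract (cluster unlabelled data, then use the cluster assignments as training labels for a Naïve Bayes classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the thesis's implementation: the toy 2-D data, the deterministic farthest-first seeding, the Gaussian form of Naïve Bayes, and all function names are hypothetical. FCM is not shown.

```python
import math

def farthest_first(points, k):
    # Deterministic seeding (an assumption made for reproducibility):
    # start from the first point, then repeatedly add the point that is
    # farthest from its nearest already-chosen centroid.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=100):
    """Lloyd's k-means; returns (centroids, cluster index for each point)."""
    centroids = farthest_first(points, k)
    assign = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign

def fit_gnb(points, labels, eps=1e-6):
    """Gaussian Naive Bayes: per-class log prior, feature means and variances."""
    model = {}
    for c in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == c]
        means = [sum(col) / len(members) for col in zip(*members)]
        # eps keeps the variance strictly positive for constant features.
        variances = [sum((x - m) ** 2 for x in col) / len(members) + eps
                     for col, m in zip(zip(*members), means)]
        model[c] = (math.log(len(members) / len(points)), means, variances)
    return model

def predict_gnb(model, p):
    """Return the class with the highest log posterior for point p."""
    def log_post(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(p, means, variances))
    return max(model, key=log_post)

# Toy unlabelled data: two well-separated groups (hypothetical values).
points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.8, 0.9), (0.3, 0.1),
          (10.0, 10.0), (10.4, 9.7), (9.6, 10.3), (10.2, 10.5), (9.8, 9.9)]

# Phase 1: cluster; the cluster indices act as pseudo-labels.
centroids, pseudo_labels = kmeans(points, k=2)

# Phase 2: train the supervised classifier on the pseudo-labelled data.
model = fit_gnb(points, pseudo_labels)
pred_a = predict_gnb(model, (0.4, 0.4))
pred_b = predict_gnb(model, (9.9, 10.1))
print(pseudo_labels, pred_a, pred_b)
```

Cluster indices are arbitrary, so measuring accuracy against true classes (as reported for Iris, Wine and Bacteria18Class) would also need a mapping from cluster index to class name, which this sketch leaves out.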
Contributor(s):
Leong, Shi Xiang - Author
Primary Item Type:
Thesis
Language:
English
Subject Keywords:
Conference on computer vision and pattern recognition (CVPR) ; optical character readers (OCRs) ; biometrics ; pattern recognition system (PRS)
Sponsor - Description:
School of Electrical & Electronic Engineering (Pusat Pengajian Kejuruteraan Elektrik & Elektronik)
First presented to the public:
8/1/2017
Original Publication Date:
4/23/2018
Previously Published By:
Universiti Sains Malaysia
Place Of Publication:
School of Electrical & Electronic Engineering
Citation:
Extents:
Number of Pages - 65
Date Deposited
2018-04-23 13:04:40.332
Date Last Updated
2020-05-29 17:30:17.266
Submitter:
Mohd Fadli Abd. Rahman

All Versions

Name:
Integration of unsupervised clustering algorithms and supervised classifiers for pattern recognition / Leong Shi Xiang
Version:
1
Created Date:
2018-04-23 13:04:40.332