Currently, commercial speech recognition products rely on cloud-based speech recognition. When the device is activated and speech is detected, the spoken utterance is captured, digitized, and compressed into an audio file, then sent over the internet to a remote server, where it is processed by speech recognition software, converted to text, and sent back to the device. Accessibility is the main issue, as the quality of the end result depends on the quality of the signal received. In this project, the hardware implementation of a speech recognition algorithm is investigated. Human voice is recorded using a microphone connected to the Altera Development and Education Board 2 field programmable gate array (FPGA). The Wolfson WM8731 CODEC on the Development and Education Board 2 converts the received speech into digital form, producing digital data that represent the signal level at each discrete time step. In this study, the input signals are spoken digits from 0 to 9. The digitized speech samples are processed using Mel Frequency Cepstral Coefficients (MFCC) to extract the features of the voice input. These voice features are then used to train an Artificial Neural Network (ANN) model in MATLAB. After a training accuracy of 93% was obtained in MATLAB, the developed ANN model was implemented on the NIOS II processor and tested on recognizing the spoken digits 0 to 9. However, due to some limitations, only 40% of the output data were correctly presented in text form corresponding to the input data.
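As a rough illustration of how the deployed recognition step might look on the NIOS II side, the C sketch below performs feed-forward inference for a small fully connected network that maps one MFCC feature vector to one of the ten digit classes. The layer sizes, sigmoid activation, and placeholder weights are illustrative assumptions rather than the trained model described above; in practice, the weights exported from the MATLAB training step would be substituted.

#include <math.h>
#include <stdio.h>

/* Illustrative layer sizes: 13 MFCC coefficients in, 16 hidden units,
 * 10 outputs (digits 0-9). The actual sizes and weights would come
 * from the MATLAB training step. */
#define N_IN  13
#define N_HID 16
#define N_OUT 10

static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

/* Feed-forward pass: returns the index (0-9) of the most active output. */
int ann_classify(const double in[N_IN],
                 double w1[N_HID][N_IN], const double b1[N_HID],
                 double w2[N_OUT][N_HID], const double b2[N_OUT])
{
    double hid[N_HID], out[N_OUT];
    int i, j, best = 0;

    for (i = 0; i < N_HID; i++) {              /* input -> hidden layer */
        double s = b1[i];
        for (j = 0; j < N_IN; j++)
            s += w1[i][j] * in[j];
        hid[i] = sigmoid(s);
    }
    for (i = 0; i < N_OUT; i++) {              /* hidden -> output layer */
        double s = b2[i];
        for (j = 0; j < N_HID; j++)
            s += w2[i][j] * hid[j];
        out[i] = sigmoid(s);
        if (out[i] > out[best])
            best = i;
    }
    return best;                               /* recognized digit */
}

int main(void)
{
    /* Placeholder (zero) weights and a dummy MFCC frame, only to show the call. */
    static double w1[N_HID][N_IN], b1[N_HID];
    static double w2[N_OUT][N_HID], b2[N_OUT];
    double mfcc_frame[N_IN] = {0};

    printf("recognized digit: %d\n",
           ann_classify(mfcc_frame, w1, b1, w2, b2));
    return 0;
}

On a NIOS II core without a floating-point unit, the same computation would typically be carried out in fixed-point arithmetic, but the structure of the loops stays the same.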