Search Results

Found 3 documents matching the query

Bayu Satria Persada

Perbandingan Depthwise Separable Convolutional Neural Network dan Convolutional Neural Network Sebagai Multiclass Keyword Spotting Pada Edge Device = Comparison of Depthwise Separable Convolutional Neural Network and Convolutional Neural Network As Multiclass Keyword Spotting On Edge Devices

"Artificial Intelligence (AI) has developed rapidly. Of its three main directions of development (computer vision, speech processing, and natural language processing), speech processing has trended the least. Even so, speech processing work such as speech recognition and keyword spotting has been widely implemented, for example keyword spotting models using a Convolutional Neural Network (CNN) on microcontrollers, mobile devices, and other devices. Because a CNN alone does not necessarily yield high accuracy, this study tries a Depthwise Separable Convolutional Neural Network (DSCNN) to obtain higher accuracy. Keyword spotting models have not yet been widely implemented on other edge devices; an edge device here means a simple user-side device with limited computing capability. The DSCNN is compared with the CNN model in terms of F1 score. The DSCNN achieved its best F1 score with 4 depthwise separable convolution layers, 256 convolution filters, 512 depthwise convolution filters, the RMSprop optimizer, and a batch size of 126. The test results show that, in general, the DSCNN produces a better F1 score than the CNN: 31.8% versus 28.35%. However, the DSCNN uses more resources and has a longer response time."
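As a rough illustration of what "depthwise separable" means in the abstract above, the sketch below counts the weights in one standard convolution layer and in one depthwise separable layer of the same shape. The kernel size and channel counts are assumptions chosen only for illustration; they are not taken from the thesis, and bias terms are ignored. A per-layer count like this does not by itself predict on-device resource use, which the abstract reports was higher for the DSCNN.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration (not from the thesis).
k, c_in, c_out = 3, 256, 256
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print("standard:", standard)
print("depthwise separable:", separable)
print("ratio:", round(standard / separable, 2))
```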

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Gita Ayu Salsabila

Perbandingan Convolutional Neural Network dan Convolutional Recurrent Neural Network sebagai Model Multiclass Keyword Spotting pada Edge Device = Convolutional Neural Network and Convolutional Recurrent Neural Network Comparison as Multiclass Keyword Spotting Model on Edge Device

"During the COVID-19 pandemic, voice interfaces using keyword spotting (KWS) have been used increasingly in various electronic systems because they require minimal physical contact. One system that can use KWS is an elevator navigation system, in which KWS recognizes keywords for the floor the user wants to reach. In this study, KWS models for an elevator navigation system were built using a CNN (Convolutional Neural Network) and a CRNN (Convolutional Recurrent Neural Network) to recognize six specific keywords. During development, CRNN hyperparameters relating to the GRU implementation, batch normalization, dropout layer, optimizer, kernel size, and batch size were varied to test their effect on CRNN performance. These tests found that the CRNN performed best with a bidirectional GRU of two layers and 64 hidden units, a 3x3 kernel size, the Adam optimizer, a batch size of 163, and a batch normalization layer applied before the dropout layer. The CRNN model obtained from the best hyperparameter combination was then compared with the CNN model to evaluate classification performance when run on a Raspberry Pi 4B. Based on accuracy, percentage of RAM usage, and latency, the CNN model performed better than the CRNN."

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Anandwi Ghurran Muhajjalin Arreto

Perbandingan convolutional neural network dan multihead attention dengan recurrent neural network sebagai multiclass keyword spotting pada edge device = Comparison of convolutional neural network and multihead attention with recurrent neural network as multiclass keyword spotting on edge devices.

"Artificial Intelligence (AI) has developed so rapidly that it is now commonly seen and used by the general public. One frequently used type of AI is speech recognition, particularly keyword spotting, driven by the COVID-19 pandemic. Keyword spotting can be applied to elevators as a navigation system so that users do not need to touch buttons and can instead move the elevator simply by saying the destination floor. Keyword spotting for an elevator system can be implemented in many ways; in this thesis, the methods tested are a CNN (Convolutional Neural Network) and an MHAtt RNN (Multihead Attention Recurrent Neural Network). The study is scoped so that each method classifies six keywords and is evaluated across various scenarios that can occur in an elevator. When building the MHAtt RNN model, the best performance was obtained with 8 attention heads and an LSTM with 32 units. The model was trained using the Adam optimizer with a learning rate of 0.001 and a decay of 0.005. After testing across various scenarios that can occur inside an elevator, the CNN model was found to perform better overall than the MHAtt RNN model because it has a higher F1-score and precision."
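The comparison above rests on F1-score and precision for a multiclass classifier. The sketch below shows one common way to compute these from a confusion matrix, using macro averaging (the unweighted mean over classes). The abstract does not state which averaging the thesis used, so macro averaging is an assumption here, and the small three-class matrix is made-up data for illustration only.

```python
def per_class_scores(confusion):
    """Return a list of (precision, recall, f1) tuples, one per class.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(confusion[c]) - tp                       # true c, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


def macro_average(scores, index):
    """Unweighted mean of one metric (0 = precision, 1 = recall, 2 = f1) over classes."""
    return sum(s[index] for s in scores) / len(scores)


# Made-up three-class confusion matrix, for illustration only.
example = [
    [8, 2, 0],
    [1, 7, 2],
    [1, 1, 8],
]
example_scores = per_class_scores(example)
print("macro precision:", macro_average(example_scores, 0))
print("macro F1:", macro_average(example_scores, 2))
```

Macro averaging weights every keyword equally, which matters when some keywords have fewer test samples than others.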

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
