Search Results

Found 26 documents matching the query
Mary, Leena
Abstract:
This updated book expands upon prosody for recognition applications of speech processing. It covers the importance of prosody for speech processing applications, explains why prosody needs to be incorporated in such applications, and presents methods for extraction and representation of prosody for applications such as speaker recognition, language recognition and speech recognition. The updated book also includes information on the significance of prosody for emotion recognition and various prosody-based approaches for automatic emotion recognition from speech.
Switzerland: Springer Cham, 2019
e20502221
eBooks  Universitas Indonesia Library
Martin Hizkia Parasi
Abstract:
Speech processing technology has advanced rapidly in recent years. However, research focused on the Indonesian language remains scarce, even though such work offers many benefits. That gap motivates this study. Transfer learning models (Inception and ResNet) and a CNN are used to predict emotion from human speech in Indonesian. The dataset was collected manually from various Indonesian-language films. The films were cut into smaller clips, and two feature extraction methods were applied to the audio clips: Mel-spectrogram and Mel-frequency cepstral coefficients (MFCC). The data obtained from both feature extractions, in the form of Mel-spectrogram and MFCC images, were used to train the three models (Inception, ResNet, and CNN), followed by testing. The experiments show that the ResNet model performs better than Inception and the simple CNN, with an average accuracy of 49%. The models were trained with a batch size of 16 and a dropout of 0.2 for Mel-spectrogram and 0.4 for MFCC to obtain the best performance.

Depok: Fakultas Teknik Universitas Indonesia, 2022
S-Pdf
UI - Skripsi Membership  Universitas Indonesia Library
Benesty, Jacob
Abstract:
This work addresses the noise reduction problem in the short-time Fourier transform (STFT) domain. We divide the general problem into five basic categories depending on the number of microphones being used and whether the interframe or interband correlation is considered. The first category deals with the single-channel problem where STFT coefficients at different frames and frequency bands are assumed to be independent. The second category also concerns the single-channel problem. The difference is that now the interframe correlation is taken into account and a filter is applied in each subband instead of just a gain. The third and fourth classes discuss the problem of multichannel noise reduction in the STFT domain with and without interframe correlation, respectively. In the last category, we consider the interband correlation in the design of the noise reduction filters. We illustrate the basic principle for the single-channel case as an example, while this concept can be generalized to other scenarios. In all categories, we propose different optimization cost functions from which we derive the optimal filters, and we also define the performance measures that help analyze them.
Heidelberg: Springer, 2012
e20418134
eBooks  Universitas Indonesia Library
Oliver Lemon, editor
Abstract:
Data-driven methods have long been used in Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) synthesis and have more recently been introduced for dialogue management, spoken language understanding, and Natural Language Generation. Machine learning is now present “end-to-end” in Spoken Dialogue Systems (SDS). However, these techniques require data collection and annotation campaigns, which can be time-consuming and expensive, as well as dataset expansion by simulation. In this book, we provide an overview of the current state of the field and of recent advances, with a specific focus on adaptivity.
New York: Springer-Science, 2012
e20407915
eBooks  Universitas Indonesia Library
Zhuang, Jiao
Abstract:
Distributed-order differential equations, a generalization of fractional calculus, are of increasing importance in many fields of science and engineering, from the behaviour of complex dielectric media to the modelling of nonlinear systems. This Brief will broaden the toolbox available to researchers interested in modeling, analysis, control and filtering. It contains contextual material outlining the progression from integer-order, through fractional-order to distributed-order systems. Stability issues are addressed with graphical and numerical results highlighting the fundamental differences between constant-, integer-, and distributed-order treatments. The power of the distributed-order model is demonstrated with work on the stability of noncommensurate-order linear time-invariant systems. Generic applications of the distributed-order operator follow: signal processing and viscoelastic damping of a mass-spring setup. A new general approach to discretization of distributed-order derivatives and integrals is described. The Brief is rounded out with a consideration of likely future research and applications and with a number of MATLAB® codes to reduce repetitive coding tasks and encourage new workers in distributed-order systems.
London: Springer, 2012
e20410774
eBooks  Universitas Indonesia Library
Stuber, Gordon L.
Abstract:
Principles of Mobile Communication is an authoritative treatment of the fundamentals of mobile communications. This book stresses the "fundamentals" of physical-layer wireless and mobile communications engineering that are important for the design of "any" wireless system. This book differs from others in the field by stressing mathematical modeling and analysis. It includes many detailed derivations from first principles and extensive literature references, and it provides a level of depth that is necessary for graduate students wishing to pursue research on this topic. The book's focus will benefit students taking formal instruction and practicing engineers who are likely to already have familiarity with the standards and are seeking to increase their knowledge of this important subject.
New York: Springer, 2011
e20421086
eBooks  Universitas Indonesia Library
Abstract:
This book presents fascinating, state-of-the-art research findings in the field of signal and image processing. It includes conference papers covering a wide range of signal processing applications involving filtering, encoding, classification, segmentation, clustering, feature extraction, denoising, watermarking, object recognition, reconstruction and fractal analysis. It addresses various types of signals, such as image, video, speech, non-speech audio, handwritten text, geometric diagram, ECG and EMG signals; MRI, PET and CT scan images; THz signals; solar wind speed signals (SWS); and photoplethysmogram (PPG) signals, and demonstrates how new paradigms of intelligent computing, like quantum computing, can be applied to process and analyze signals precisely and effectively. The book also discusses applications of hybrid methods, algorithms and image filters, which are proving to be better than the individual techniques or algorithms.
Singapore: Springer Nature, 2019
e20509893
eBooks  Universitas Indonesia Library
Dib, Mohammed
Abstract:
This book presents a contrastive linguistics study of Arabic and English for the dual purposes of improved language teaching and speech processing of Arabic via spectral analysis and neural networks. Contrastive linguistics is a field of linguistics which aims to compare the linguistic systems of two or more languages in order to ease the tasks of teaching, learning, and translation. The main focus of the present study is to treat the Arabic minimal syllable automatically to facilitate automatic speech processing in Arabic. It represents important reading for language learners and for linguists with an interest in Arabic and computational approaches.
Switzerland: Springer Nature, 2019
e20506958
eBooks  Universitas Indonesia Library
Mary, Leena
Abstract:
This book presents techniques for audio search, which aim to retrieve information from massive speech databases by using audio query words. The authors examine different features, techniques and evaluation measures attempted by researchers around the world. The topics covered also include available databases, software and tools, patents and copyrights, and different platforms for benchmarking. The content is relevant for developers, academics, and students.
Switzerland: Springer Cham, 2019
e20502755
eBooks  Universitas Indonesia Library
Jain, Kavindra R.
Abstract:
This book provides an overview of computational approaches to medical image examination and analysis in oral radiology, using dental radiographs to detect and diagnose dental caries in decayed teeth. Coverage includes basic image processing techniques; approaches for Region of Interest extraction and analysis; and the role of computational clustering techniques for segmentation of teeth and dental caries. The book also presents a novel multiphase level set method for automatic segmentation of dental radiographs.
Switzerland: Springer Nature, 2019
e20507675
eBooks  Universitas Indonesia Library