Search Results

Found 67794 documents matching the query

Rilo Chandra Pradana

Deep Embedded Clustering untuk Pendeteksian Topik Tweet Berita Berbahasa Indonesia = Deep Embedded Clustering for Topic Detection on Indonesian News Tweet

Topic detection is a technique for extracting the topics contained in textual data. One approach to topic detection is clustering. In general, however, clustering does not produce effective clusters on high-dimensional data, so the dimensionality must be reduced first and the clustering carried out in the lower-dimensional feature space. This research uses Deep Embedded Clustering (DEC) for topic detection. DEC optimizes the feature space and the clusters simultaneously, and it consists of two stages. The first stage trains an autoencoder to obtain the encoder weights used for dimensionality reduction, then runs k-means clustering to obtain the initial centroids. The second stage computes the soft assignments, determines the auxiliary distribution that represents the clusters in the data space, and uses backpropagation to update the encoder weights and the centroids. Two DEC models are built: standard DEC and DEC without backpropagation, which omits the backpropagation step in the second stage. Each model produces a set of topics, which are evaluated with a coherence measure. The results show that DEC without backpropagation is better than standard DEC in computation time, with only a slight difference in coherence between the two.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library
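
The second-stage quantities named in the abstract above can be sketched as follows. This is a minimal illustration, assuming the formulas of the original DEC formulation (a Student's t kernel with alpha = 1 for the soft assignment, and squared, frequency-normalized assignments for the auxiliary distribution); the thesis may use different settings. The embedded points, centroids, and function names are hypothetical.

```python
import math


def soft_assignment(points, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedded point i to centroid j."""
    q = []
    for z in points:
        row = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Auxiliary distribution p: square q, divide by the soft cluster frequency, renormalize."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q), the clustering loss that the backpropagation step would minimize."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )


# Toy embedded points (assumed encoder output) and initial centroids (assumed k-means output).
embedded = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

q = soft_assignment(embedded, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

In this sketch `p` is sharper than `q`, so minimizing the loss pulls each point toward its most likely centroid. A variant without backpropagation, as described in the abstract, would compute these quantities without the gradient update.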

Naufal Farhan

Metode Improved Deep Embedded Clustering untuk Pendeteksian Topik = Improved Deep Embedded Clustering Method for Topic Detection

Topic detection is a process that analyzes the words in a collection of textual data to determine the topics present in that collection. One standard method for topic detection is clustering. Deep Embedded Clustering (DEC) is a clustering algorithm based on deep learning that combines feature learning and clustering in a single framework to achieve better performance. However, DEC has a weakness: the embedded space becomes distorted during learning because the decoder is discarded. Improved Deep Embedded Clustering (IDEC) addresses this weakness by retaining the decoder, a step referred to as local structure preservation. In this research, IDEC is adapted to topic detection on Indonesian-language textual data, and its performance is compared with other research that uses DEC for topic detection by comparing coherence values. The coherence values obtained show that DEC is more suitable than IDEC for topic detection. This happens because the decoder in IDEC is updated, so its parameters are no longer suited to mapping the data back to its original dimension, whereas in DEC the decoder is discarded and its parameters are not updated.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-Pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library

Muktiari

Kernelisasi fuzzy c-means berbasis ruang eigen untuk pendeteksian topik pada berita online berbahasa Indonesia = Kernelization of eigenspace-based fuzzy c-means for topic detection on Indonesian online news

Topic detection is a practical method for finding topics in a collection of documents. One family of methods is clustering-based, with centroids interpreted as topics; an example is eigenspace-based fuzzy c-means (EFCM). The clustering process in EFCM is performed in a lower-dimensional eigenspace, which may reduce the accuracy of the clustering. In this thesis, the author uses the kernel method so that the clustering is performed in a higher-dimensional space without explicitly transforming the data into that space. The author's simulations show that this kernelization improves the accuracy of EFCM in terms of interpretability scores on Indonesian online news.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2018

T50790

UI - Master's Thesis Membership Universitas Indonesia Library

Robertus Agung Pradana

Fuzzy C-Means Clustering dengan Reduksi Dimensi Convolutional Autoencoder pada Pendeteksian Topik = Fuzzy C-Means Clustering with Convolutional Autoencoder Dimensional Reduction for Topic Detection

Topic detection is a process that analyzes the words in a collection of textual data to determine the topics in the collection, how they relate to each other, and how they change over time. Fuzzy C-Means (FCM) is a clustering method that is often used for topic detection. FCM groups low-dimensional datasets into clusters well but fails on high-dimensional datasets. To overcome this problem, the dimensionality of the dataset is reduced before topic detection. This study uses a Convolutional Autoencoder (CAE) for the dimensionality reduction, so the topic detection method is called Convolutional-based Fuzzy C-Means (CFCM). The data are tweets from national news accounts on Twitter. CFCM consists of two stages: reducing the dataset to a lower dimension with the CAE, then clustering the reduced dataset with FCM to obtain topics. The topics are then evaluated by computing their coherence. The study compares the coherence of topics from CFCM with one convolutional layer (CFCM-1CL) and CFCM with three convolutional layers (CFCM-3CL). The results indicate that CFCM-1CL achieves a higher coherence value than CFCM-3CL.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library

Jauzak Hussaini Windiatmaja

Text Classification untuk Verifikasi Fakta pada Kanal Berita Bahasa Indonesia menggunakan Deep Learning dengan Ensemble Technique = Text Classification for Fact Verification for Indonesian News Channel Using Deep Learning with Ensemble Technique

Information sources on online news networks are instruments that allow individuals to read news, publish news, and communicate. This has become a trend in a highly mobile society, so verifying the facts in news reports is very important. With this in mind, a web service-based tool for fact verification was built using deep learning with an ensemble technique. An ensemble technique strategically combines several machine learning models so that a problem is solved with more than one model. To train the models, a dataset of claim and label pairs was built, with the claims obtained by crawling Indonesian-language news channels. Three deep learning models with different network architectures and hyperparameters were built and trained on this dataset, then aggregated to form a new model. To confirm that the aggregated model performs better than a single model, the ensemble model was compared with the single models. The ensemble model reaches an accuracy of 85.18%, while the single models reach 83.9%, 83.19%, and 81.94%, respectively. These results indicate that the ensemble improves on the fact-verification performance of the three single models. The results also show that, on the same dataset, deep learning methods outperform other machine learning methods such as naive Bayes and random forest. To validate the performance of the tool, the response time of the web service was measured; the average response time is 6447.9 milliseconds.

Depok: Fakultas Teknik Universitas Indonesia, 2021

T-Pdf

UI - Master's Thesis Membership Universitas Indonesia Library

Azmi Jundan Taqiy

Optimasi Rute Pelayaran Perintis di Wilayah NTT-Maluku Barat Daya Menggunakan Density-Based Spatial Clustering of Applications with Noise dan Travelling Salesman Problem = Optimization of Pelayaran Perintis in The NTT-Southwest Maluku Region Using Density-Based Spatial Clustering of Applications with Noise and Travelling Salesman Problem

As an archipelagic country, Indonesia has more than 17,000 islands. This creates particular challenges for inter-island connectivity, especially in remote and underdeveloped areas. Pelayaran Perintis is a shipping program subsidized by the Indonesian government with the main goal of improving the economy in remote and underdeveloped areas. However, its performance is currently not optimal for achieving this goal, as indicated by round voyages that can take up to 14 days on a route and by the low achievement of voyage targets. The routes therefore need to be evaluated and made more efficient, and one way to do so is to re-route them. This study re-routes Pelayaran Perintis in the NTT-Southwest Maluku region by first clustering with DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and then optimizing with a TSP (Travelling Salesman Problem) approach. The result is a 55% reduction in the average route distance (from 1276 NM to 569.3 NM) and a 74% reduction in the average round voyage duration (from 13.3 days to 3.5 days). In addition, the disparity between routes decreases, as seen from the range of the number of ports, the distance traveled, and the round voyage duration across the routes in the region.

Depok: Fakultas Teknik Universitas Indonesia, 2022

S-pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library
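
The cluster-then-route idea in the abstract above can be sketched as follows. This is an illustration only: the abstract does not state the DBSCAN parameters or which TSP solver was used, so the `eps` and `min_pts` values, the brute-force tour search, the planar coordinates, and all function names are assumptions made for the example.

```python
import math
from itertools import permutations


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # only core points expand the cluster
                queue.extend(nbrs)
    return labels


def shortest_tour(stops):
    """Exact closed tour by brute force; only feasible for a handful of stops."""
    if len(stops) < 2:
        return list(stops), 0.0
    start, rest = stops[0], stops[1:]
    best_order, best_length = None, math.inf
    for perm in permutations(rest):
        order = [start, *perm]
        length = sum(
            math.dist(order[k], order[(k + 1) % len(order)])
            for k in range(len(order))
        )
        if length < best_length:
            best_order, best_length = order, length
    return best_order, best_length


# Hypothetical port coordinates on a flat plane (not real port locations).
ports = [(0, 0), (0, 1), (1, 1), (1, 0), (10, 10), (10, 11), (11, 11), (11, 10), (50, 50)]
labels = dbscan(ports, eps=1.5, min_pts=3)

routes = {}
for label in sorted(set(labels) - {-1}):
    members = [p for p, l in zip(ports, labels) if l == label]
    routes[label] = shortest_tour(members)
```

Each cluster becomes one route, and isolated ports are left as noise for separate handling. A real study would use sea distances rather than straight-line distances, and a heuristic or solver instead of brute force once a cluster has more than about ten ports.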

Web news documents clustering in Indonesian language using singular value decomposition-principal component analysis and ant algorithms

Ant-based document clustering is a clustering method that measures the similarity of text documents based on the shortest path between nodes (trial phase) and determines the optimal number of clusters from the sequence of document similarities (dividing phase). The trial phase of the Ant algorithm takes a long time to process document vectors because of the high dimensionality of the Document-Term Matrix (DTM), which stems from its sparseness. This paper proposes a document clustering method that optimizes dimension reduction using Singular Value Decomposition-Principal Component Analysis (SVDPCA) and the Ant algorithm. SVDPCA reduces the dimensions of the DTM by converting the freq-term form of the conventional DTM into the score-pc form of a Document-PC Matrix (DPCM). The Ant algorithm then clusters the documents using the vector space model built from the dimension-reduced DPCM. Experiments on 506 news documents in the Indonesian language show that the proposed method reduces dimensions by up to 99.7%, which speeds up the execution time of the trial phase while maintaining an F-measure of up to 0.88 (88%).

Surabaya: Institut Teknologi Sepuluh Nopember, Faculty of Information Technology, Department of Informatics Engineering, 2016

AJ-Pdf

Journal Article Universitas Indonesia Library

Latifah Al Haura

Deteksi Malicious Account pada Akun Twitter Indonesia Berbasis Tweet = Malicious Accounts Detection on Indonesian Twitter Accounts using Tweets

Fraud and even information theft now occur frequently on social media through irresponsible users' posts in the form of statuses, tweets, or spam messages containing dangerous links. This is closely tied to the existence of malicious accounts, which have become very disruptive to the security and comfort of social media users. This study therefore aims to use features of tweets (text) to detect malicious accounts among Indonesian Twitter users. Two text feature extraction methods are used and compared: Word2Vec and FastText. The study also compares Machine Learning and Deep Learning methods for classifying users or accounts based on these tweet features. The Machine Learning algorithms used are Logistic Regression, Decision Tree, and Random Forest, while the Deep Learning algorithm is Long Short-Term Memory (LSTM). Across all test scenarios, Word2Vec feature extraction performs better on average than FastText, with an F1-Score of 74%, and Random Forest outperforms the other three classification methods, with an F1-Score of 82%. The best-performing combination of feature extraction and classification methods is pre-trained Word2Vec with LSTM, with an F1-Score of 84%.

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2021

T-pdf

UI - Master's Thesis Membership Universitas Indonesia Library

Natasha Rosaline

Fuzzy C-Means Clustering dengan Reduksi Dimensi Deep Autoencoders untuk Pendeteksian Topik pada Data Tekstual Twitter = Fuzzy C-Means Clustering with Deep Autoencoders Dimensional Reduction for Topic Detection on Textual Data from Twitter

Topic detection is a technique for obtaining information by extracting topics from very large collections of data. One method used for topic detection is clustering, in particular Fuzzy C-Means (FCM). However, FCM performs poorly when clustering high-dimensional data. This weakness can be addressed with dimensionality reduction. This research uses a deep learning method, Deep Autoencoders (DAE), to reduce the dimensions of the data. FCM clustering with DAE dimensionality reduction is called Deep Autoencoders-Based Fuzzy C-Means (DFCM). DFCM has two stages: reducing the dimensions of the high-dimensional data with Deep Autoencoders, then performing FCM clustering on the reduced data. The output of DFCM is a set of topics, which are evaluated using coherence values. Two DFCM variants are built: FCM based on a DAE with a single hidden layer (DFCM-single hidden layer) and FCM based on a DAE with multiple hidden layers (DFCM-multi hidden layers). The results show that the topics from DFCM-single hidden layer have higher coherence values than those from DFCM-multi hidden layers.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library
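
The FCM clustering stage described in the abstract above can be sketched as follows, using the standard fuzzy c-means update rules. The autoencoder stage is not shown: the `reduced` points stand in for already-reduced data, and the fuzzifier `m = 2`, the fixed starting centroids, the iteration count, and the function names are assumptions for illustration rather than the thesis settings.

```python
import math


def memberships(points, centroids, m=2.0, eps=1e-12):
    """u[i][j] = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in points:
        # Clamp distances so a point lying exactly on a centroid does not divide by zero.
        d = [max(math.dist(x, c), eps) for c in centroids]
        u.append([1.0 / sum((dj / dk) ** power for dk in d) for dj in d])
    return u


def update_centroids(points, u, m=2.0):
    """Each centroid is the mean of all points weighted by u[i][j] ** m."""
    dim = len(points[0])
    centroids = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centroids.append([
            sum(w * x[k] for w, x in zip(weights, points)) / total
            for k in range(dim)
        ])
    return centroids


def fuzzy_c_means(points, centroids, m=2.0, n_iter=50):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = memberships(points, centroids, m)
        centroids = update_centroids(points, u, m)
    return centroids, memberships(points, centroids, m)


# Toy stand-in for data already reduced by the autoencoder.
reduced = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, u = fuzzy_c_means(reduced, centroids=[[0.0, 0.0], [10.0, 10.0]])
```

In a topic detection setting, each resulting centroid would be mapped back to the vocabulary and its highest-weighted words read as a topic; that step depends on the dimensionality reduction used and is left out of this sketch.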

Malvin Edward

Rancang bangun sistem embedded pendeteksi kantuk untuk pengemudi berbasis openCV = Development of an OpenCV-based embedded drowsiness detection system for drivers

A survey by the National Highway Traffic Safety Administration (NHTSA) estimated 5,895,000 accidents related to drowsiness or sleeping while driving on U.S. roads in 2005-2009. Of these, 83,000 cases each year were fatal accidents, and in 2014, 846 people died in drowsiness-related driving accidents. A drowsiness detection system was developed to address this problem. The system is built with the OpenCV library and combines several algorithms: the Haar Cascade Classifier, the Blur function, the Canny function, and the Contour function. The Haar Cascade Classifier detects the driver's face and eye areas, while a combination of the color thresholding function and the contour function detects the eye object and determines whether the eye is open or closed.
The performance of the system was tested against four variables: different processing machines, threshold values, lighting conditions, and different eye characteristics. Based on the test results, the best Vlo and VHi threshold values are Vlo = 10 or 20 with a VHi difference of 10-20. The speed of each process was also found to depend on the processing machine: the better the machine, the faster the processing time. Differences in lighting conditions (morning, midday, and night) affect the performance of the system, with an error rate of 20% under nighttime conditions. Eye characteristics, with and without glasses, also affect performance, with a detection success rate of 100% when the eyes are closed in people wearing glasses.

Depok: Fakultas Teknik Universitas Indonesia, 2017

S68630

UI - Undergraduate Thesis Membership Universitas Indonesia Library
