Search Results

Found 154778 documents matching the query

Muhammad Zaki Nur Said Hanan

Penerapan Knowledge Distillation Berbasis DeepSeek untuk Klasifikasi Website Ilegal: Studi Komparatif SVM, Random Forest, dan Naive Bayes = Application of DeepSeek-Based Knowledge Distillation for Illegal Website Classification: A Comparative Study of SVM, Random Forest, and Naive Bayes

"Pertumbuhan pesat penggunaan internet telah menyebabkan meningkatnya jumlah situs web ilegal yang menyebarkan konten berbahaya, penipuan, atau tanpa izin. Penelitian ini mengembangkan sistem klasifikasi situs web ilegal dengan memanfaatkan teknik knowledge distillation berbasis DeepSeek, di mana model besar (teacher) digunakan untuk auto-labeling pada dataset yang dibangun sendiri. Data berlabel hasil distilasi digunakan untuk melatih model student yang lebih ringan. Tiga algoritma—Support Vector Machine (SVM), Random Forest (RF), dan Naive Bayes (NB)—dibandingkan performanya menggunakan proses tokenisasi, penghapusan stopword, vektorisasi TF-IDF, dan oversampling untuk mengatasi ketidakseimbangan kelas. Hasil eksperimen menunjukkan bahwa Random Forest memberikan akurasi tertinggi, yakni 97,2%, dengan F1-score makro rata-rata 0,97, sementara Naive Bayes lebih unggul dalam kecepatan pemrosesan meski presisinya lebih rendah. Studi ini menegaskan efektivitas kombinasi knowledge distillation dan algoritma pembelajaran mesin konvensional untuk klasifikasi situs web ilegal serta dapat menjadi dasar pengembangan sistem penyaringan konten web di masa mendatang.

The rapid growth of internet usage has led to an increasing number of illegal websites spreading harmful, fraudulent, or unauthorized content. This study develops an illegal website classification system using a knowledge distillation technique based on the DeepSeek model, where a large teacher model is utilized for autolabeling on a custom-built dataset. The labeled data from distillation is then used to train a lighter student model. The performance of three algorithms—Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB)—is compared, employing tokenization, stopword removal, TF-IDF vectorization, and oversampling to address class imbalance. Experimental results show that Random Forest achieves the highest accuracy at 97.2% with a macro-average F1-score of 0.97, while Naive Bayes excels in processing speed despite lower precision. This study demonstrates the effectiveness of combining knowledge distillation with conventional machine learning algorithms for illegal website classification and provides a solid foundation for future web content filtering systems."

Depok: Fakultas Teknik Universitas Indonesia, 2025

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
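The preprocessing chain named in the abstract above (tokenization, stopword removal, TF-IDF vectorization) can be illustrated with a minimal standard-library sketch. The stopword list, the sample page texts, and the helper names `preprocess` and `tfidf` are all hypothetical, and the weighting used here (term frequency times log(N/df)) is only one common TF-IDF variant; the thesis may use a different one.

```python
import math
from collections import Counter

# Hypothetical stopword list; the thesis does not state which list it uses.
STOPWORDS = {"the", "for", "is", "and"}

def preprocess(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Toy page texts, invented for illustration only.
pages = [
    "Free casino bonus for the winner",
    "The weather forecast is free",
    "Casino jackpot and casino games",
]
vectors = tfidf([preprocess(p) for p in pages])
print(vectors[2])
```

Terms that occur in only one page (such as "jackpot") receive a higher weight than terms shared across pages (such as "free"), which is the property that makes these vectors useful as input to classifiers like SVM, Random Forest, or Naive Bayes.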

Rofif Zainul Muttaqin

Rancang Bangun Realtime Web Application Firewall berbasis Deep Learning untuk Peningkatan Keamanan Aplikasi Web-Studi Kasus Badan Meteorologi, Klimatologi dan Geofisika = Development of Realtime Web Application Firewall based on Deep Learning to Improve Web Application Security-Case Study of the Meteorology, Climatology and Geophysics Agency

"Perkembangan teknologi yang terus berkembang mendorong penggunaan aplikasi web di berbagai layanan, namun terdapat berbagai kerentanan pada aplikasi web yang setiap saat dapat dimanfaatkan penyerang untuk melakukan serangan. Untuk menanggulangi hal ini, salah satu upaya yang dapat dilakukan ialah menerapkan Web Application Firewall (WAF) yang dapat melindungi aplikasi web. WAF umumnya bekerja berdasarkan aturan yang ditetapkan sebelumnya. Namun kelemahan sistem ini ialah serangan yang terus berkembang, serta dalam mengkonfigurasi aturan pada WAF, diperlukan pengetahuan mendalam terkait aplikasi yang ada. Teknologi kecerdasan buatan, baik machine learning (ML) atau deep learning (DL) memperlihatkan potensi yang baik dalam mengenali jenis serangan. Di dalam penelitian ini dibangun sebuah Real-time DL-based WAF untuk meningkatkan keamanan pada aplikasi web. Berbagai model ML dan DL diujicoba untuk melakukan tugas deteksi serangan web, mulai dari Support Vector Machine (SVM), Random Forest (RF), Convolutional Neural Network (CNN), dan Long Short-Term Memory (LSTM). Berdasarkan hasil pengujian, model CNN-LSTM meraih performa tertinggi yakni akurasi sebesar 98.61 %, presisi sebesar 99%, recall sebesar 98.08% dan f1-score sebesar 98.54%.. Dari hasil pengujian dengan web vulnerability scanner, performa DL-based WAF tidak kalah dengan ModSecurity WAF yang dijadikan sebagai pembanding. Dari hasil analisis, dapat disimpulkan bahwa penerapan DL-based WAF mampu meningkatkan keamanan pada aplikasi web.

The continuous development of technology drives the use of web applications across many services, but web applications contain vulnerabilities that attackers can exploit at any time. One countermeasure is to deploy a Web Application Firewall (WAF) to protect web applications. A WAF generally works from pre-established rules. The weaknesses of this approach are that attacks keep evolving and that configuring WAF rules requires in-depth knowledge of the applications being protected. Artificial intelligence, both machine learning (ML) and deep learning (DL), shows good potential for recognizing types of attacks. In this research, a real-time DL-based WAF was built to improve web application security. Various ML and DL models were tested on the web attack detection task, including Support Vector Machine (SVM), Random Forest (RF), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). In testing, the CNN-LSTM model achieved the highest performance: an accuracy of 98.61%, precision of 99%, recall of 98.08%, and F1-score of 98.54%. In tests with a web vulnerability scanner, the DL-based WAF performed no worse than the ModSecurity WAF used as a baseline. The analysis concludes that deploying a DL-based WAF can improve the security of web applications."

Jakarta: Fakultas Teknik Universitas Indonesia, 2024

T-pdf

UI - Tesis Membership Universitas Indonesia Library
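As a quick consistency check on the figures quoted in the abstract above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision (99%) and recall (98.08%); the helper name `f1_score` is illustrative and not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract for the CNN-LSTM model.
f1 = f1_score(0.99, 0.9808)
print(f"{f1:.4f}")
```

The result rounds to about 0.9854, in line with the 98.54% F1-score stated in the abstract.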

Bijak Rabbani

Residual neural network dan persistent homology untuk klasifikasi diabetik retinopati = Residual neural network and persistent homology for the classification of diabetic retinopathy

"Diabetik retinopati adalah komplikasi dari penyakit diabetes yang dapat mengakibatkan gangguan penglihatan bahkan kebutaan. Penyakit ini menjadi tidak dapat disembuhkan jika telah melewati fase tertentu, sehingga diagnosa sedini mungkin menjadi sangat penting. Namun, diagnosa oleh dokter mata memerlukan biaya dan waktu yang cukup besar. Oleh karena itu, telah dilakukan upaya untuk meningkatkan efisiensi kerja dokter mata dengan bantuan komputer. Deep learning merupakan sebuah metode yang banyak digunakan untuk menyelesaikan masalah ini. Salah satu arsitektur deep learning yang memiliki performa terbaik adalah residual network. Metode ini memiliki kelebihan dalam menghindari masalah degradasi akurasi, sehingga memungkinkan penggunaan jaringan yang dalam. Di sisi lain, metode persistent homology juga telah banyak berkembang dan diaplikasikan pada berbagai masalah. Metode ini berfokus pada informasi topologi yang terdapat pada data. Informasi topologi ini berbeda dengan representasi data yang didapatkan dari model residual network. Penelitian ini melakukan analisis terhadap penerapan persistent homology pada kerangka kerja residual network dalam permasalahan klasifikasi diabetik retinopati. Dalam studi ini, dilakukan eksperimen berkaitan dengan informasi topologi dan proses pengolahannya. Informasi topologi ini direpresentasikan dengan betti curve atau persistence image. Sementara itu, pada proses pengolahannya dilakukan ujicoba pada kanal citra, metode normalisasi, dan layer tambahan. Hasil eksperimen yang telah dilakukan adalah penerapan persistent homology pada kerangka kerja residual network dapat meningkatkan hasil klasifikasi penyakit diabetik retinopati. Selain itu, penggunaan betti curve dari kanal merah sebuah citra sebagai representasi informasi topologi memberikan hasil terbaik dengan skor kappa 0.829 pada data test.

Diabetic retinopathy is a complication of diabetes that can cause visual impairment and even blindness. The disease becomes incurable once it passes a certain phase, so diagnosis as early as possible is very important. However, diagnosis by an ophthalmologist requires considerable time and cost. Efforts have therefore been made to improve ophthalmologists' efficiency with computer assistance. Deep learning is widely used for this problem, and the residual network is one of the best-performing deep learning architectures. Its main advantage is that it avoids accuracy degradation, which makes deeper networks practical. Persistent homology, meanwhile, has also developed rapidly and been applied to many problems. This method focuses on the topological information in the data, which differs from the data representation extracted by a residual network model. This study analyzes the incorporation of persistent homology into the residual network framework for diabetic retinopathy classification. Experiments were carried out on the topological information and how it is processed. The topological information is represented as a Betti curve or a persistence image, and the processing experiments cover image channel, normalization method, and additional layers. The experiments show that applying persistent homology within the residual network framework can improve diabetic retinopathy classification. Moreover, using the Betti curve from the red channel of an image as the topological representation gives the best result, with a kappa score of 0.829 on the test data."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2021

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Shafira Nur Amalia

Imputasi Missing Values menggunakan algoritma Hybrid Fuzzy C-Means dan Majority Vote untuk Klasifikasi Data Penyakit Paru Obstruktif Kronik (PPOK) = Missing Values Imputation using Hybrid Fuzzy C-Means and Majority Vote Algorithm for Chronic Obstructive Pulmonary Disease (COPD) Data Classification

"Dalam suatu penelitian, dibutuhkan data yang dikumpulkan dan diolah untuk memecahkan permasalahan dan membuktikan hipotesis dalam penelitian. Namun, seringkali data yang diperoleh tidak menyimpan nilai untuk suatu variabel pada observasi yang diharapkan. Data yang tidak tersimpan menyebabkan data penelitian kosong dan berdampak pada penelitian. Jika peristiwa ini terjadi, maka penelitian terindikasi memiliki missing data atau missing values. Salah satu cara untuk mengatasi missing values yaitu dengan imputasi. Imputasi bekerja dengan mengisi nilai pada missing values dengan suatu nilai estimasi yang telah dianalisis dan diputuskan untuk membuat suatu dataset lengkap. Dalam proses imputasi, seringkali ditemukan bahwa data yang digunakan untuk imputasi terkadang memiliki karakteristik yang tidak jelas atau tidak konsisten, maka salah satu solusinya adalah dengan menggunakan metode Fuzzy C-Means (FCM). Estimasi nilai-nilai missing values menggunakan model FCM menghasilkan model prediksi dengan variasi parameter yang beragam sehingga dibutuhkan pendekatan lain untuk menghasilkan model terbaik dengan parameter yang optimal. Hal inilah yang mendasari diperlukannya suatu pendekatan hybrid, yaitu dengan menggabungkan beberapa model machine learning untuk memperoleh hasil estimasi missing values terbaik. Pada penelitian ini, dilakukan implementasi Hybrid Fuzzy C-Means dan Majority Vote (Hybrid FCMMV) pada data Penyakit Paru Obstruktif Kronik (PPOK) tahun 2012-2017 yang diperoleh dari Rumah Sakit Cipto Mangunkusumo (RSCM) untuk memberikan performa imputasi yang lebih baik berdasarkan akurasi, presisi, recall, dan F-Score melalui klasifikasi metode ensemble Random Forest.

In a research study, data must be collected and processed to solve problems and test hypotheses. However, the data obtained often lack a value for some variable in an expected observation. These unrecorded values leave gaps in the research data and affect the research itself. When this happens, the study is said to have missing data or missing values. One way to handle missing values is imputation, which fills in each missing value with an estimate that has been analyzed and selected so as to produce a complete dataset. The data used for imputation often have unclear or inconsistent characteristics, and one solution is the Fuzzy C-Means (FCM) method. Estimating missing values with FCM yields predictive models with widely varying parameters, so another approach is needed to obtain the best model with optimal parameters. This motivates a hybrid approach that combines several machine learning models to obtain the best estimates of the missing values. In this study, Hybrid Fuzzy C-Means and Majority Vote (Hybrid FCMMV) was implemented on Chronic Obstructive Pulmonary Disease (COPD) data from 2012-2017 obtained from Cipto Mangunkusumo Hospital (RSCM), with the aim of providing better imputation performance as measured by accuracy, precision, recall, and F-score through classification with the Random Forest ensemble method."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Cressia Nauli Agustin

Pengembangan Model Deep Learning dengan Metode Ensemble untuk Klasifikasi Objek Sampah = Development of Deep Learning Model with Ensemble Method for Classification of Waste Objects

"Permasalahan penumpukan sampah menjadi isu global yang mendesak, memerlukan solusi inovatif untuk deteksi dan klasifikasi yang efisien. Dalam konteks ini, deteksi objek sampah menggunakan deep learning menawarkan potensi besar. Namun, pengembangan model neural network tunggal yang kompleks seringkali menghadapi tantangan keterbatasan kinerja, terutama ketika dihadapkan pada dataset yang terbatas. Penelitian ini bertujuan untuk mengembangkan model deep learning yang robust untuk deteksi objek sampah pada dataset terbatas (TrashNet) dengan memanfaatkan metode ensemble. Pendekatan ensemble, khususnya strategi weighted average, diimplementasikan untuk mengkombinasikan prediksi dari beberapa arsitektur Convolutional Neural Network (CNN) yang berbeda, seperti Xception, ResNet, dan VGG. Model-model dasar ini dilatih secara independen dan bobot optimal untuk setiap model ditentukan melalui proses validasi silang untuk memaksimalkan akurasi. Hasil eksperimen menunjukkan bahwa model ensemble dengan weighted average secara signifikan meningkatkan performa deteksi objek sampah dibandingkan dengan model tunggal. Peningkatan ini ditunjukkan melalui metrik evaluasi seperti akurasi, presisi, recall, dan F1-score yang lebih tinggi. Analisis mendalam mengungkapkan bahwa metode ensemble efektif dalam mengatasi bias dan variasi yang mungkin ada pada model individual, menghasilkan prediksi yang lebih stabil dan akurat pada dataset terbatas. Studi ini menunjukkan bahwa pendekatan ensemble meningkatkan akurasi klasifikasi menjadi 83.27%, atau meningkat ³ 3.35%.

The escalating problem of waste accumulation is a pressing global issue that demands innovative solutions for efficient detection and classification. In this context, waste object detection using deep learning offers significant potential. However, complex single neural network models often face performance limitations, especially on limited datasets. This research aims to develop a robust deep learning model for waste object detection on a limited dataset (TrashNet) by leveraging an ensemble method. The ensemble approach, specifically the weighted average strategy, is implemented to combine predictions from several different Convolutional Neural Network (CNN) architectures, such as Xception, ResNet, and VGG. These base models are trained independently, and the optimal weight for each model is determined through cross-validation to maximize accuracy. Experimental results demonstrate that the ensemble model with weighted averaging significantly improves waste object detection performance compared to single models. This improvement is shown through higher evaluation metrics such as accuracy, precision, recall, and F1-score. In-depth analysis reveals that the ensemble method is effective in mitigating the biases and variations that may exist in individual models, leading to more stable and accurate predictions on limited datasets. This study demonstrates that the ensemble approach improves classification accuracy to 83.27%, an increase of ≥ 3.35%."

Depok: Fakultas Teknik Universitas Indonesia, 2025

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
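The weighted-average strategy described in the abstract above can be sketched in a few lines: each base model outputs class probabilities, and the ensemble averages them using per-model weights. The probabilities and weights below are invented for illustration; the thesis determines its weights by cross-validation, which this sketch does not reproduce.

```python
from typing import List, Sequence

def weighted_average_ensemble(
    model_probs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model class probabilities into one weighted-average vector."""
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_classes = len(model_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]

def predict(model_probs, weights) -> int:
    """Return the index of the class with the highest combined probability."""
    combined = weighted_average_ensemble(model_probs, weights)
    return max(range(len(combined)), key=combined.__getitem__)

# Hypothetical outputs of three base models for one image over three classes.
model_probs = [
    [0.6, 0.3, 0.1],  # stand-in for model A
    [0.2, 0.7, 0.1],  # stand-in for model B
    [0.3, 0.5, 0.2],  # stand-in for model C
]
weights = [0.5, 0.3, 0.2]  # hypothetical weights, not the thesis's values

combined = weighted_average_ensemble(model_probs, weights)
label = predict(model_probs, weights)
print(combined, label)
```

In this toy case the highest-weighted model prefers class 0, yet the ensemble selects class 1 because the other two models agree on it with enough combined weight.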

Muhammad Sabila Haqqi

Pengembangan Metode Long Short Term Memory berbasis Offline dan Online Learning untuk Sistem Identifikasi dan Kendali Quadcopter = Development of Offline and Online Learning-based Long Short Term Memory Methods for Quadcopter Identification and Control Systems

"Banyak sekali variabel nonlinear didalam sistem kendali untuk quadcopter sehingga cukup rumit untuk mengendalikan dinamika penerbangan dari wahana ini. Salah satu metode yang digunakan untuk membangun model dinamik quadcopter adalah Deep Learning berbasis Long Short-Term Memory. Metode pembelajaran yang umum digunakan dalam melatih model adalah offline learning, dimana pelatihan dilakukan secara akumulatif berdasarkan dataset yang telah dimiliki. Walaupun offline learning memungkinkan model belajar lebih cepat, metode ini menghasilkan model yang kurang baik untuk wahana yang membutuhkan feedback dengan kompleksitas tinggi. Untuk menangani masalah tersebut akan dikembangkan metode online learning, dimana data diperoleh secara sekuensial dan digunakan untuk memperbarui model di setiap timestep. Akan ditunjukkan bahwa metode online learning dapat memperbaiki model yang diperoleh dari metode offline learning berdasarkan Mean Square Error dari setiap jenis data quadcopter.

The control system of a quadcopter involves many nonlinear variables, which makes the vehicle's flight dynamics complicated to control. One method used to build a dynamic model of a quadcopter is deep learning based on Long Short-Term Memory. The learning method commonly used to train the model is offline learning, in which training is carried out cumulatively on an existing dataset. Although offline learning lets the model learn faster, it produces poor models for vehicles that require high-complexity feedback. To address this problem, an online learning method will be developed in which data are obtained sequentially and used to update the model at each time step. It will be shown that the online learning method can improve the model obtained from offline learning, as measured by the Mean Square Error for each type of quadcopter data."

Depok: Fakultas Teknik Universitas Indonesia, 2022

T-pdf

UI - Tesis Membership Universitas Indonesia Library
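The online-learning idea in the abstract above, updating the model at every time step as samples arrive, can be illustrated with a deliberately simplified stand-in. The sketch below swaps the LSTM for a one-parameter linear model trained by per-sample gradient descent on squared error; the data stream, learning rate, and function names are all hypothetical.

```python
def online_update(weight, x, y, lr=0.1):
    """One gradient step on the squared error of a single (x, y) sample."""
    error = weight * x - y
    return weight - lr * 2 * error * x  # d/dw (w*x - y)**2 = 2*(w*x - y)*x

def run_online(stream, weight=0.0, lr=0.1):
    """Process samples one at a time, updating the model after each."""
    squared_errors = []
    for x, y in stream:
        squared_errors.append((weight * x - y) ** 2)  # error before the update
        weight = online_update(weight, x, y, lr)
    return weight, squared_errors

# Toy stream generated by y = 2x; a real system would read sensor data instead.
inputs = [1.0, 0.5, 1.5, 1.0] * 10
stream = [(x, 2.0 * x) for x in inputs]

final_weight, errors = run_online(stream)
print(final_weight, errors[0], errors[-1])
```

The per-sample squared error shrinks as the stream is consumed, which mirrors in miniature the abstract's use of Mean Square Error to compare models.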

Hasnan Fiqih

Perbandingan Model Transfer Learning pada Klasifikasi Citra Ikan Menggunakan InceptionV3 dan EfficientNetV2L = Comparison of Transfer Learning Models on Fish Image Classification Using InceptionV3 and EfficientNetV2L

"Hampir separuh dunia bergantung pada makanan yang berasal dari laut sebagai sumber protein utama. Di Pasifik Barat dan Tengah 60% dari ikan tuna ditangkap secara illegal, tidak dilaporkan, dan tidak diatur dengan regulasi dapat mengancam ekosistem laut, pasokan ikan global, dan mata pencaharian lokal. Salah satu solusi yang dapat dilakukan adalah dengan menggunakan kamera keamanan untuk menangkap gambar aktivitas kapal. Pada penelitian ini akan dibuat sistem untuk mengklasifikasi jenis ikan yang ditangkap dari gambar kamera keamanan kapal tersebut. Sistem ini menggunakan model transfer learning yang sudah dilakukan fine tuning dan dilatih menggunakan dataset yang disediakan oleh The Nature Conservancy. Dari penelitian ini didapatkan performa terbaik dengan akurasi 98.19% menggunakan model EfficientNetV2L dan optimizer Stochastic Gradient Descent (SGD) dengan learning rate 1e-4, momentum 0.9, weight decay 1e-6, dan split ratio training testing 80/20. Dengan sistem ini pengolahan data untuk menghitung jumlah penangkapan ikan berdasarkan spesies akan lebih efisien.

Almost half of the world depends on food from the sea as its main source of protein. In the Western and Central Pacific, 60% of tuna is caught illegally, unreported, and unregulated, which threatens marine ecosystems, global fish supplies, and local livelihoods. One possible solution is to use security cameras to capture images of ship activity. In this study, a system is built to classify the types of fish caught from the ship's security camera images. The system uses a fine-tuned transfer learning model trained on the dataset provided by The Nature Conservancy. The best performance, an accuracy of 98.19%, was obtained using the EfficientNetV2L model and the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 1e-4, momentum of 0.9, weight decay of 1e-6, and a training/testing split ratio of 80/20. With this system, data processing to count the fish caught by species will be more efficient."

Depok: Fakultas Teknik Universitas Indonesia, 2022

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
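The optimizer settings quoted in the abstract above (learning rate 1e-4, momentum 0.9, weight decay 1e-6) can be made concrete with a single-parameter sketch of SGD with momentum. The update rule shown is one common formulation, in which weight decay is added to the gradient as an L2 term; the framework used in the thesis may differ in detail, and the toy loss f(w) = w**2 is purely illustrative.

```python
def sgd_momentum_step(w, grad, velocity, lr=1e-4, momentum=0.9, weight_decay=1e-6):
    """One SGD-with-momentum update for a scalar parameter.

    Weight decay is applied as an L2 term added to the gradient
    (an assumption; other frameworks apply it differently).
    """
    g = grad + weight_decay * w
    velocity = momentum * velocity + g
    w = w - lr * velocity
    return w, velocity

# Toy problem: minimize f(w) = w**2, whose gradient is 2*w.
w, velocity = 1.0, 0.0
history = [w]
for _ in range(3):
    w, velocity = sgd_momentum_step(w, grad=2 * w, velocity=velocity)
    history.append(w)
print(history)
```

With a learning rate this small each step moves the parameter only slightly, which is typical when fine-tuning a pretrained network rather than training from scratch.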

Hajar Indah Fitriasari

Pengembangan Classifier Berbasis Xception dan ResNet50V2 untuk Sistem Deteksi Covid-19 Menggunakan Citra X-ray Rongga Dada = Development of Classifier Based on Xception and ResNet50V2 for Covid-19 Detection System Using Chest X-ray Images

"Pencitraan 'X-ray' dapat digunakan sebagai alternatif penunjang diagnostik klinis untuk mendeteksi penyakit COVID-19 pada paru-paru pasien. 'Machine learning' atau 'Deep Learning' akan disematkan pada 'computer-aided-diagnosis' (CAD) untuk meningkatkan efisiensi dan akurasi dalam menangani permasalahan membedakan COVID-19 dengan penyakit lain yang memiliki karakteristik yang serupa. Beberapa sistem kecerdasan buatan berbasis 'Convolutional Neural Network' (CNN) pada penelitian sebelumnya, memiliki akurasi yang menjanjikan dalam mendeteksi COVID-19 menggunakan citra 'X-ray' rongga dada. Dalam penelitian ini, dikembangkan 'classifier' berbasis CNN dengan teknik 'transfer learning', yakni memanfaatkan model CNN pra-terlatih dari ImageNet bernama Xception dan ResNet50V2 yang dikombinasikan agar sistem menjadi lebih akurat dalam kemampuan ekstraksi fitur untuk mendeteksi COVID-19 melalui citra 'X-ray' rongga dada. 'Classifier' yang dikembangkan terdiri dari 2 jenis, yakni 'classifier' yang disusun secara serial dan paralel. Pengujian dilakukan dalam 2 skenario berbeda. Pada skenario 1, digunakan 'dataset' dan pengaturan parameter yang mengacu pada penelitian sebelumnya, sedangkan skenario 2 dilakukan dengan menambahkan sejumlah citra kedalam 'dataset' baru serta pengaturan parameter yang berbeda untuk memperoleh peningkatan akurasi. Dari pengujian untuk kelas COVID-19 pada skenario 1, diperoleh 'classifier' paralel berhasil menggungguli 'classifier' lain dengan mencapai akurasi rata-rata 93,412% serta memperoleh 'precision', 'recall,' dan 'f1-score' masing – masing mencapai 96.8%, 99.6% dan 98%. Pada skenario 2, 'classifier' paralel mencapai akurasi rata-rata yang lebih tinggi, yakni mencapai 96,678% serta memperoleh 'precision', 'recall,' dan 'f1-score' yang cukup tinggi pula, yakni masing – masing mencapai 98.8%, 99.8% dan 99.4% untuk kelas COVID-19. Adanya penambahan jumlah 'dataset' pada skenario 2 dapat meningkatkan akurasi dari 'classifier' yang dikembangkan. 
Secara keseluruhan, 'classifier' paralel yang dikembangkan dapat direkomendasikan menjadi alat yang dapat membantu praktisi klinis dan ahli radiologi untuk membantu mereka dalam diagnosis, kuantifikasi, dan tindak lanjut kasus COVID-19.

X-ray imaging can be used as an alternative aid to clinical diagnosis for detecting COVID-19 in a patient's lungs. Machine learning or deep learning can be embedded in computer-aided diagnosis (CAD) to increase efficiency and accuracy in distinguishing COVID-19 from other diseases with similar characteristics. Several artificial intelligence systems based on the Convolutional Neural Network (CNN) in previous studies have shown promising accuracy in detecting COVID-19 from chest X-ray images. In this study, a CNN-based classifier was developed with transfer learning, combining two CNN models pre-trained on ImageNet, Xception and ResNet50V2, so that the system extracts features more accurately when detecting COVID-19 from chest X-ray images. Two types of classifier were developed: one arranged in series and one in parallel. Testing was carried out in two scenarios. Scenario 1 used the dataset and parameter settings of previous studies, while scenario 2 added a number of images to form a new dataset and used different parameter settings to increase accuracy. For the COVID-19 class in scenario 1, the parallel classifier outperformed the other classifiers, achieving an average accuracy of 93.412% with precision, recall, and F1-score of 96.8%, 99.6%, and 98%, respectively. In scenario 2, the parallel classifier achieved a higher average accuracy of 96.678%, with precision, recall, and F1-score of 98.8%, 99.8%, and 99.4%, respectively, for the COVID-19 class. Enlarging the dataset in scenario 2 increased the accuracy of the developed classifier. 
Overall, the developed parallel classifier can be recommended as a tool that can help clinical practitioners and radiologists to aid them in diagnosis, quantification, and follow-up of COVID-19 cases."

Depok: Fakultas Teknik Universitas Indonesia, 2021

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Dwi Guna Mandhasiya

Hibrida Model BERT dan Deep Learning untuk Analisis Sentimen Bahasa Indonesia - Studi Kasus Sentimen Pengguna Media Sosial terkait Pilpres 2024 = The Hybrid of BERT and Deep Learning Models for Indonesian Sentiment Analysis - A Case Study of Social Media User Sentiments Regarding the 2024 Indonesia Presidential Election

"Ilmu Data adalah irisan dari matematika dan statistika, komputer, serta keahlian domain. Dalam beberapa tahun terakhir inovasi pada bidang ilmu data berkembang sangat pesat, seperti Artificial Intelligence (AI) yang telah banyak membantu kehidupan manusia. Deep Learning (DL) sebagai bagian dari AI merupakan pengembangan dari salah satu model machine learning yaitu neural network. Dengan banyaknya jumlah lapisan neural network, model deep learning mampu melakukan proses ekstrasi fitur dan klasifikasi dalam satu arsitektur. Model ini telah terbukti mengungguli teknik state-of-the-art machine learning di beberapa bidang seperti pengenalan pola, suara, citra, dan klasifikasi teks. Model deep learning telah melampaui pendekatan berbasis AI dalam berbagai tugas klasifikasi teks, termasuk analisis sentimen. Data teks dapat berasal dari berbagai sumber, seperti sumber dari media sosial. Analisis sentimen atau opinion mining merupakan salah satu studi komputasi yang menganalisis opini dan emosi yang diekspresikan pada teks. Pada penelitian ini analisis peforma machine learning dilakukan pada metode deep learning berbasis representasi data BERT dengan metode CNN dan LSTM serta metode hybrid deep learning CNN-LSTM dan LSTM-CNN. Implementasi model menggunakan data komentar youtube pada video politik dengan topik terkait Pilpres 2024, kemudian evaluasi peforma dilakukan menggunakan confusion metric berupa akurasi, presisi, dan recall.

Data science is the intersection of mathematics and statistics, computing, and domain expertise. In recent years, innovation in data science has developed very rapidly, for example in Artificial Intelligence (AI), which has helped human life in many ways. Deep Learning (DL), as part of AI, is a development of one machine learning model, the neural network. With a large number of neural network layers, deep learning models can perform feature extraction and classification in a single architecture. These models have been shown to outperform state-of-the-art machine learning techniques in areas such as pattern recognition, speech, imagery, and text classification. Deep learning models have gone beyond AI-based approaches in a variety of text classification tasks, including sentiment analysis. Text data can come from various sources, such as social media. Sentiment analysis, or opinion mining, is a computational study that analyzes the opinions and emotions expressed in text. In this research, performance is analyzed for deep learning methods based on BERT data representation, namely CNN and LSTM, and for the hybrid deep learning methods CNN-LSTM and LSTM-CNN. The models are applied to YouTube comment data from political videos related to the 2024 Indonesia presidential election, and performance is then evaluated using confusion matrix metrics: accuracy, precision, and recall."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Adawiyah Ulfa

Evaluasi Klasifikasi Hubungan Kuantitatif Struktur Aktivitas Molekul dengan Model Hybrid Deep Learning dan Pemilihan Fitur Recursive Feature Elimination pada Inhibitor Dipeptidyl Peptidase-4 = Evaluation of the Classification in Quantitative Structures Activity Relationships of Molecular with Hybrid Deep Learning Models and Selection Features of Recursive Feature Elimination in Dipeptidyl Peptidase-4 Inhibitors

"Pengembangan inhibitor Dipeptidyl Peptidae-4 (DPP-4) sangat diperlukan dalam pengobatan Diabetes Mellitus tipe 2 dengan efek samping yang rendah. Pemodelan hubungan kuantitatif struktur aktivitas (QSAR) merupakan pendekatan analisis hubungan struktur kimia dengan aktivitasnya yang banyak digunakan dalam desain obat penyakit Diabetes. Pada tesis ini, model QSAR klasifikasi dibangun untuk memprediksi struktur aktivitas senyawa pada inhibitor DPP-4 yang dapat memblokir kerja enzim DPP-4. Dalam representasi molekul digunakan circular fingerprint ECFP dan FCFP yang menyajikan notasi SMILES dalam format vektor biner. Fingerprint ECFP dan FCFP yang berdiameter 4 dan 6 sebagai input data dalam membangun model QSAR klasifikasi. Pada QSAR klasifikasi dengan pendekatan deep learning memberikan waktu yang cepat dalam proses virtual screening senyawa aktif atau tidak aktif dalam inhibitor DPP-4. Penelitian ini menggunakan model Hybrid Deep Learning 1D CNN-LSTM untuk memprediksi aktivitas senyawa inhibitor dalam kelas aktif atau tidak aktif berdasarkan nilai aktivitas biologis dengan proporsi data latih dan data uji yang berbeda. Dalam arsitektur 1D CNN-LSTM terdiri dari model 1D CNN sebagai tahap ektraksi fitur dan output dari lapisan konvolusi 1D CNN digunakan dalam lapisan LSTM. Selain itu, pemilihan fitur dengan metode Random Forest-Recursive Feature Elimination (RF-RFE) digunakan untuk memperoleh fitur yang optimal dari dataset ECFP dan FCFP. Selanjutnya, penelitian ini membandingkan performa model dengan menerapkan pemilihan fitur RF-RFE dan tanpa pemilihan fitur RF-RFE. Hasil penelitian ini menunjukkan bahwa model QSAR klasifikasi menggunakan Hybrid Deep Learning yaitu 1D CNN-LSTM dengan pemilihan fitur RF-RFE memperoleh performa model yang lebih baik dibandingkan model tanpa pemilihan fitur optimal. 
Performa model 1D CNN-LSTM dengan pemilihan fitur RF-RFE menggunakan data ECFP_4 dengan proporsi data latih 80% memiliki akurasi sebesar 0.9075, sensitivitas 0.9008, spesifisitas 0.9142, dan nilai MCC 0.8151.

The development of Dipeptidyl Peptidase-4 (DPP-4) inhibitors is urgently needed for the treatment of Type 2 Diabetes Mellitus with low side effects. Quantitative structure-activity relationship (QSAR) modeling is an approach for analyzing the relationship between chemical structure and activity that is widely used in diabetes drug design. In this thesis, a classification QSAR model was built to predict the structure-activity of DPP-4 inhibitor compounds that can block the action of the DPP-4 enzyme. Molecules are represented with ECFP and FCFP circular fingerprints, which encode SMILES notation as binary vectors. ECFP and FCFP fingerprints with diameters of 4 and 6 serve as input data for building the classification QSAR model. Classification QSAR with a deep learning approach speeds up the virtual screening of compounds as active or inactive DPP-4 inhibitors. This study uses the Hybrid Deep Learning 1D CNN-LSTM model to predict the activity of inhibitor compounds as active or inactive classes based on their biological activity values, with different proportions of training and test data. In the 1D CNN-LSTM architecture, the 1D CNN serves as the feature extraction stage, and the output of its convolution layer is fed to the LSTM layer. In addition, feature selection using the Random Forest-Recursive Feature Elimination (RF-RFE) method was used to obtain optimal features from the ECFP and FCFP datasets. The study then compares model performance with and without RF-RFE feature selection. The results indicate that the classification QSAR model using Hybrid Deep Learning, namely 1D CNN-LSTM with RF-RFE feature selection, performs better than the model without optimal feature selection. 
The 1D CNN-LSTM model with RF-RFE feature selection, using ECFP_4 data with 80% of the data for training, has an accuracy of 0.9075, sensitivity of 0.9008, specificity of 0.9142, and an MCC value of 0.8151."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2021

T-pdf

UI - Tesis Membership Universitas Indonesia Library
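The four metrics quoted at the end of the abstract above (accuracy, sensitivity, specificity, MCC) all derive from a binary confusion matrix. The sketch below computes them from confusion counts. The counts themselves are hypothetical: they assume 10,000 active and 10,000 inactive test compounds, chosen only so that sensitivity and specificity match the reported values. The abstract does not state the test set size or class balance.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, and MCC from confusion counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts under an assumed balanced test set (not from the thesis).
metrics = binary_metrics(tp=9008, fn=992, tn=9142, fp=858)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this balanced-class assumption the computed accuracy (0.9075) and MCC (about 0.8151) agree with the reported figures, which suggests, but does not prove, that the test set was roughly balanced.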
