Search Results

Found 7 documents matching the query

Qusyairi Ridho Saeful Fitni

Evaluasi kinerja penerapan metode ensemble learning dan pemilihan fitur pada Intrusion Detection System (IDS) berbasis machine learning = Performance evaluation of machine learning-based intrusion detection system using ensemble learning method and feature selection

"Dalam beberapa tahun terakhir, keamanan data pada sistem informasi organisasi telah menjadi perhatian serius. Banyak serangan menjadi kurang terdeteksi oleh firewall dan perangkat lunak antivirus. Untuk meningkatkan keamanan, intrusion detection systems (IDS) digunakan untuk mendeteksi serangan dalam lalu lintas jaringan. Saat ini, teknologi IDS memiliki masalah kinerja mengenai akurasi deteksi, waktu deteksi, pemberitahuan alarm palsu, dan deteksi jenis serangan baru atau belum diketahui. Beberapa studi telah menerapkan pendekatan pembelajaran mesin (machine learning) sebagai solusi, dan mendapat beberapa peningkatan. Penelitian ini menggunakan pendekatan pembelajaran ensemble (ensemble learning) yang dapat mengintegrasikan manfaat dari setiap algoritma pengklasifikasi tunggal. Pada penelitian ini, dibandingkan tujuh pengklasifikasi tunggal untuk mengidentifikasi pengklasifikasi dasar yang digunakan untuk model ensemble learning. Kemudian dataset IDS terbaru dari Canadian Institute for Cybersecurity yaitu CSE-CIC-IDS2018 digunakan untuk mengevaluasi model ensemble learning. Hasil percobaan menujukan bahwa implementasi metode ensemble learning khususnya majority voting dengan tiga algoritma dasar (gradient boosting, decision tree dan logistic regression) dapat meningkatkan nilai akurasi lebih baik dibandingkan implementasi algoritma klasifikasi tunggal, yaitu 0,988. Selanjutnya, implementasi teknik pemilihan fitur spearman-rank order correlation pada dataset CSE-CIC-IDS2018 menghasilkan 23 dari 80 fitur, dan dapat meningkatkan waktu pelatihan model, yaitu menjadi 11 menit 4 detik dibanding sebelumnya 34 menit 2 detik.

In recent years, data security in organizational information systems has become a serious concern. Many attacks are becoming harder for firewalls and antivirus software to detect. To improve security, intrusion detection systems (IDSs) are used to detect attacks in network traffic. Current IDS technology has performance problems in detection accuracy, detection time, false alarms, and the detection of new or unknown attack types. Several studies have applied machine learning approaches as a solution and obtained some improvement. This study uses an ensemble learning approach, which can combine the strengths of the individual classifier algorithms. Seven single classifiers were compared to identify the base classifiers for the ensemble learning model. The latest IDS dataset from the Canadian Institute for Cybersecurity, CSE-CIC-IDS2018, was then used to evaluate the ensemble learning model. The experimental results show that ensemble learning, specifically majority voting over three base algorithms (gradient boosting, decision tree, and logistic regression), achieves an accuracy of 0.988, better than the single classification algorithms. Furthermore, applying Spearman's rank-order correlation feature selection to the CSE-CIC-IDS2018 dataset retained 23 of the 80 features and reduced the model training time to 11 minutes 4 seconds, from 34 minutes 2 seconds."

Depok: Fakultas Teknik Universitas Indonesia, 2020

T-pdf

UI - Tesis Membership Universitas Indonesia Library
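
The majority-voting step named in the abstract above can be illustrated with a minimal sketch. It assumes hard voting over class labels; the function name `majority_vote` and the sample predictions are hypothetical and are not taken from the thesis, which does not describe its implementation here.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard voting.

    predictions_per_model: list of equally long lists, one per base classifier.
    """
    if not predictions_per_model:
        raise ValueError("need at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*...) groups the votes that all models cast for the same sample
    for votes in zip(*predictions_per_model):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three base classifiers (1 = attack, 0 = benign)
gradient_boosting = [1, 0, 1, 1, 0]
decision_tree = [1, 0, 0, 1, 0]
logistic_regression = [0, 0, 1, 1, 1]

ensemble = majority_vote([gradient_boosting, decision_tree, logistic_regression])
print(ensemble)  # [1, 0, 1, 1, 0]
```

With three binary classifiers a tie cannot occur, which is one practical reason to use an odd number of base models.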

Arvan Aulia Rachman

Klasifikasi data kanker menggunakan fuzzy c-means dengan pemilihan fitur menggunakan fisher's ratio = Classification of cancer data using fuzzy c means with feature selection using fisher's ratio

"Klasifikasi data kanker dilakukan untuk menemukan terapi yang tepat yaitu memaksimalkan efektivitas dan meminimalkan toksisitas. Pada umumnya, data kanker terdiri dari banyak fitur. Namun, tidak semua fitur tersebut informatif. Oleh karena itu, fitur-fitur tersebut akan diseleksi menggunakan metode Fisher's Ratio untuk memilih fitur-fitur yang paling informatif. Fitur-fitur terbaik akan dibentuk data baru. Data, sebelum dan setelah dilakukan pemilihan fitur, diklasifikasi menggunakan metode Fuzzy C-Means. Akurasi dari proses klasifikasinya akan dibandingkan. Hasilnya, tanpa melakukan pemilihan fitur, diperoleh rata-rata akurasi sebesar 82.92%. Setelah dilakukan pemilihan fitur, diperoleh akurasi terbaik dengan menggunakan 150 fitur dengan rata-rata akurasi sebesar 89.68%.

Cancer data are classified to find the right therapy, one that maximizes efficacy and minimizes toxicity. Cancer data generally consist of many features, but not all of them are informative. Therefore, Fisher's Ratio is used to select the most informative features, and the best features are assembled into a new dataset. The data, before and after feature selection, are classified using Fuzzy C-Means, and the resulting accuracies are compared. Without feature selection, the average accuracy is 82.92%. With feature selection, the best result is obtained using 150 features, with an average accuracy of 89.68%."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2016

S64140

UI - Skripsi Membership Universitas Indonesia Library
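
The feature ranking described in the abstract above can be sketched as follows. This assumes one common two-class form of Fisher's ratio, (difference of class means)^2 / (sum of class variances); the abstract does not state the exact formula used, and the function names and toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(values_a, values_b):
    """Two-class Fisher's ratio of one feature (assumed form, see lead-in)."""
    gap = mean(values_a) - mean(values_b)
    spread = pvariance(values_a) + pvariance(values_b)
    if spread == 0:
        # Both classes are constant: perfectly separated or identical
        return 0.0 if gap == 0 else float("inf")
    return gap * gap / spread


def top_k_features(samples, labels, k):
    """Return the indices of the k features with the highest Fisher's ratio."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        a = [row[j] for row, y in zip(samples, labels) if y == classes[0]]
        b = [row[j] for row, y in zip(samples, labels) if y == classes[1]]
        scores.append((fisher_ratio(a, b), j))
    scores.sort(reverse=True)  # highest ratio first
    return [j for _score, j in scores[:k]]


# Toy data: 4 samples, 3 features; feature 0 separates the classes best
samples = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.1],
    [3.0, 4.0, 0.3],
    [3.2, 6.0, 0.2],
]
labels = [0, 0, 1, 1]

print(top_k_features(samples, labels, 2))  # [0, 2]
```

The selected column indices would then be used to build the reduced dataset before classification.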

Devina Itsnia Rizka

Klasifikasi data kanker serviks menggunakan metode naïve bayes dengan pemilihan fitur artificial bee colony = Cervical cancer classification using naïve bayes method with artificial bee colony as feature selection

"ABSTRAK

Kanker serviks merupakan salah satu jenis kanker yang berbahaya. Berdasarkan data dari Departemen Kesehatan Republik Indonesia (Depkes RI), kanker serviks merupakan salah satu penyakit kanker dengan prevalensi tertinggi sebesar 0.8 di Indonesia. Maka dari itu diperlukan tindakan pendeteksian dini dengan menggunakan microarray dataset. Microarray dataset mempunyai jumlah fitur yang banyak tetapi tidak semua fitur yang ada relevan dengan data yang digunakan. Oleh karena itu, perlu dilakukan pemilihan fitur untuk meningkatkan akurasi. Pemilihan fitur yang digunakan adalah Artificial Bee Colony (ABC). Setelah dilakukan pemilihan fitur, akan dilakukan klasifikasi menggunakan metode klasifikasi Naïve Bayes. Hasilnya, didapatkan akurasi terbaik klasifikasi Naïve Bayes tanpa pemilihan fitur adalah 60% pada saat data training 90%, dan untuk klasifikasi Naïve Bayes dengan menggunakan pemilihan fitur Artificial Bee Colony didapatkan akurasi tertinggi adalah 93.33333% dengan fitur sebanyak 50 dan data training 90%.

ABSTRACT

Cervical cancer is one of the most dangerous cancers. According to data from Departemen Kesehatan Republik Indonesia (Depkes RI), cervical cancer is one of the cancers with the highest prevalence in Indonesia, at 0.8. Therefore, early detection using microarray datasets is needed. Microarray datasets have a large number of features, but not all of them are relevant. Feature selection is therefore needed to improve accuracy. The feature selection method used is Artificial Bee Colony (ABC). After feature selection, classification is performed using the Naïve Bayes method. The best accuracy of Naïve Bayes classification without feature selection is 60% with 90% training data, and the highest accuracy of Naïve Bayes classification with Artificial Bee Colony feature selection is 93.33333% with 50 features and 90% training data."

2017

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Nurrimah

Aplikasi fuzzy support vector regression dengan pemilihan fitur fisher score dalam memprediksi harga saham = Application of fuzzy support vector regression with fisher score feature selection in predicting stock price

"Globalisasi membawa dampak besar bagi pertumbuhan ekonomi Indonesia. Sejak tahun 1961, secara umum pertumbuhan ekonomi Indonesia selalu mengalami kenaikan. Banyak faktor yang menyebabkan meningkatnya pertumbuhan ekonomi nasional. Salah satunya adalah investasi. Terdapat berbagai macam instrumen investasi. Sekarang ini yang paling banyak diminati oleh masyarakat umum adalah investasi saham. Bursa Efek Indonesia (BEI) mencatat bahwa per Juni 2018 banyaknya investor pasar modal mencapai 1,12 juta Single Investor Identification (SID) dengan 710.000 Single Investor Identification (SID) merupakan total investor saham ritel. Saham menjadi salah satu usaha dalam pemenuhan kebutuhan hidup di masa depan. Daya tarik utamanya adalah karena saham memberikan potensi keuntungan yang tinggi dalam jangka panjang. Namun, dengan potensi keuntungan yang tinggi tersebut, saham juga memiliki potensi kerugian yang tinggi. Salah satu usaha untuk meminimalkan potensi kerugian saham adalah dengan melakukan prediksi harga saham menggunakan machine learning. Harga saham akan diprediksi menggunakan metode penyelesaian masalah regresi, yaitu Fuzzy Support Vector Regression (FSVR). Fungsi pemetaan dalam fungsi keanggotaan fuzzy digunakan untuk menghasilkan fluktuasi harga saham yang tepat. Untuk memastikan keefektifan dan keefisienan penggunaan fitur, Fisher Score digunakan untuk memilih fitur yang paling berpengaruh dan informatif dalam model prediksi sehingga kesalahan hasil prediksi dapat diminimalkan. Fitur-fitur terpilih tersebut akan dijadikan sebagai variabel input dalam model prediksi. Evaluasi hasil prediksi dari data dengan dan tanpa dilakukan pemilihan fitur selanjutnya akan dianalisis menggunakan Normalized Mean Square Error (NMSE) dan dibandingkan sebagai bagian dari evaluasi performa model prediksi. 
Dari hasil prediksi pada salah satu data yang digunakan, tanpa pemilihan fitur, diperoleh model terbaik dengan nilai NMSE terendah sebesar 0,179 dan persentase data training 80%, sedangkan dengan pemilihan fitur Fisher Score, diperoleh model terbaik menggunakan sembilan fitur dengan nilai NMSE terendah sebesar 0,011 dan persentase data training 90%.

Globalization has had a major impact on Indonesia's economic growth. In general, Indonesia's economic growth has risen steadily since 1961. Many factors contribute to national economic growth, one of which is investment. There are many investment instruments; the one currently most popular with the general public is stock investment. The Indonesia Stock Exchange (IDX) recorded that, as of June 2018, the number of capital market investors had reached 1.12 million Single Investor Identifications (SID), of which 710,000 were retail stock investors. Stocks have become one way to provide for future needs. Their main attraction is the potential for high returns in the long run. However, along with that potential for high returns, stocks also carry a high potential for loss. One way to minimize the potential loss is to predict stock prices using machine learning. The stock prices are predicted using a regression method, namely Fuzzy Support Vector Regression (FSVR). The mapping function in the fuzzy membership function is used to produce the right stock price fluctuations. To ensure that features are used effectively and efficiently, Fisher Score is used to select the most influential and informative features for the prediction model so that prediction error is minimized. The selected features are used as input variables of the prediction model. The prediction results with and without feature selection are evaluated using Normalized Mean Square Error (NMSE) and compared as part of the performance evaluation of the prediction model.
On one of the datasets used, without feature selection, the best model has the lowest NMSE of 0.179 with 80% training data, while with Fisher Score feature selection, the best model uses nine features and has the lowest NMSE of 0.011 with 90% training data."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2019

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library
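
The NMSE figure reported in the abstract above can be illustrated with a short sketch. NMSE has several definitions in the literature; this one assumes the squared error divided by the total variation of the actual values around their mean, which may differ from the thesis. The function name `nmse` and the price values are hypothetical.

```python
def nmse(actual, predicted):
    """Normalized mean square error (assumed form, see lead-in).

    Computed as sum((a - p)^2) / sum((a - mean(actual))^2); lower is better,
    and 1.0 matches a model that always predicts the mean.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total_variation = sum((a - mean_actual) ** 2 for a in actual)
    if total_variation == 0:
        raise ValueError("actual values are constant; NMSE is undefined")
    return squared_error / total_variation


# Hypothetical closing prices and model predictions
actual_prices = [10.0, 12.0, 14.0, 16.0]
predicted_prices = [11.0, 12.0, 13.0, 16.0]

score = nmse(actual_prices, predicted_prices)
print(round(score, 3))  # 0.1
```

Because the error is normalized by the spread of the target, scores from datasets with different price scales can be compared directly.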

Tya Nadira

Klasifikasi data kanker menggunakan metode support vector machines dengan pemilihan fitur berdasarkan artificial bee colony dan global artificial bee colony = Classification of cancer data using support vector machines method with features selection based on artificial bee colony and global artificial bee colony

"ABSTRAK

Kanker merupakan penyebab utama kematian kedua di seluruh dunia sehingga mengakibatkan kanker menjadi salah satu prioritas masalah dalam kesehatan. Di Indonesia, tercatat bahwa kanker payudara dan kanker paru-paru memiliki angka kejadian dan kematian tertinggi bagi wanita dan pria (WHO, 2014). Untuk menangani hal tersebut, dalam tugas akhir ini diusulkan suatu metode untuk mengklasifikasikan data kanker menggunakan Support Vector Machines (SVM) dengan pemilihan fitur berdasarkan Artificial Bee Colony (ABC) dan Global Artificial Bee Colony (GABC) pada data kanker payudara dan paru-paru berbasis microarray. Hasil yang diperoleh menunjukkan bahwa metode pemilihan fitur ABC dan GABC memberikan hasil rata-rata akurasi yang lebih tinggi dibandingkan tanpa dilakukan pemilihan fitur dalam klasifikasi data kanker. Untuk pemilihan fitur, metode GABC memberikan hasil yang lebih unggul yaitu dengan akurasi tertinggi 99,99% dengan 10 fitur untuk data kanker paru-paru dan 96,4286% dengan 10 fitur untuk data kanker payudara selama 3 kali running sedangkan metode ABC memberikan rata-rata akurasi tertinggi 99,99% dengan 20 fitur untuk data kanker paru-paru dan 96,4286% dengan 10 fitur untuk data kanker payudara selama 5 kali running.

ABSTRACT

Cancer is the second leading cause of death worldwide, which makes it a priority health problem. In Indonesia, breast cancer and lung cancer have the highest incidence and death rates for women and men (WHO, 2014). To address this, this study proposes a method for classifying cancer data using Support Vector Machines (SVM) with feature selection based on Artificial Bee Colony (ABC) and Global Artificial Bee Colony (GABC) on microarray-based breast and lung cancer data. The results show that feature selection with ABC and GABC yields higher average classification accuracy than no feature selection. Of the two, GABC performs better, with a highest accuracy of 99.99% using 10 features for the lung cancer data and 96.4286% using 10 features for the breast cancer data over 3 runs, while ABC gives a highest average accuracy of 99.99% using 20 features for the lung cancer data and 96.4286% using 10 features for the breast cancer data over 5 runs."

2017

S69844

UI - Skripsi Membership Universitas Indonesia Library

Haris Hamzah

Prediksi hubungan struktur molekul dan aktivitas biologi inhibitor dipeptidyl peptidase-4 menggunakan metode deep neural network dengan metode pemilihan fitur catboost = Prediction of molecular structure and biological activity relationship of dipeptidyl peptidase-4 inhibitors using deep neural networks with catboost as feature selection method

"Diabetes mellitus tipe-2 (T2DM) merupakan penyakit metabolisme kronis yang sering diderita oleh orang dewasa. T2DM ditandai dengan menurunnya insulin dalam tubuh. Enzim dipeptidil peptidase-4 (DPP-4) dapat mengkatalisasi penurunan hormon peptida inkretin, terutama peptide-1 seperti hormon gastric inhibitory peptide (GIP) dan glucagon-like peptide-1 (GLP-1), yang mengakibatkan penurunan sintesis insulin. Inhibitor DPP-4 adalah target obat yang menjanjikan untuk T2DM, karena dapat memblokir kerja enzim DPP-4 dengan menghambat kerja hormon GLP-1 dan GIP. Penelitian ini menggunakan data inhibitor DPP-4 yang akan diekstraksi ciri menggunakan metode Extended-Connectivity Fingerprint (ECFP) dan Functional-Class Fingerprints (FCFP). Hasil ekstraksi ciri tersebut digunakan sebagai vektor masukan untuk metode deep neural network (DNN) untuk memprediksi inhibitor DPP-4 ke dalam senyawa aktif dan tidak aktif. Selain itu, metode CatBoost diusulkan sebagai metode pemilihan fitur terhadap hasil ekstraksi ciri metode ECFP dan FCFP. Dalam penelitian ini akan membandingkan performa metode DNN dengan menggunakan pemilihan fitur metode CatBoost dan tanpa menggunakan pemilihan fitur metode CatBoost. Hasil dari penelitian ini menunjukkan bahwa metode DNN menggunakan ekstraksi ciri ECFP_6 dengan proporsi pemilihan fitur sebesar 90% memiliki nilai sensitivitas, spesifisitas, akurasi, dan MCC berturut-turut adalah 0.927,0.881,0.906, dan 0.810.

Diabetes mellitus type-2 (T2DM) is a chronic metabolic disease that often affects adults. T2DM is characterized by a decrease of insulin in the body. The dipeptidyl peptidase-4 (DPP-4) enzyme can catalyze a decrease of incretin peptide hormones, especially peptide-1, such as gastric inhibitory peptide (GIP) hormone and glucagon-like peptide-1 (GLP-1), which results in decreased insulin synthesis. DPP-4 inhibitors are a promising drug target for T2DM because they block the action of the DPP-4 enzyme by inhibiting the activity of the GLP-1 and GIP hormones. This study uses DPP-4 inhibitor data, from which features are extracted using the Extended-Connectivity Fingerprint (ECFP) and Functional-Class Fingerprint (FCFP) methods. The extracted features are used as input vectors for a deep neural network (DNN) that classifies DPP-4 inhibitors as active or inactive compounds. In addition, CatBoost is proposed as a feature selection method applied to the features extracted by ECFP and FCFP. The study compares the performance of the DNN with and without CatBoost feature selection. The results show that the DNN using ECFP_6 feature extraction with a feature selection proportion of 90% achieves sensitivity, specificity, accuracy, and MCC of 0.927, 0.881, 0.906, and 0.810, respectively."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

T-pdf

UI - Tesis Membership Universitas Indonesia Library
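
The four metrics reported in the abstract above (sensitivity, specificity, accuracy, and MCC) all derive from a binary confusion matrix. The sketch below uses their standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not the thesis's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the data")
    sensitivity = tp / (tp + fn)  # true positive rate (active compounds found)
    specificity = tn / (tn + fp)  # true negative rate (inactive compounds found)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 active and 100 inactive compounds
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it is less flattering than accuracy when the classes are imbalanced.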

Haris Hamzah

Prediksi hubungan struktur molekul dan aktivitas biologi inhibitor dipeptidil peptidase-4 menggunakan metode deep neural network dengan metode pemilihan fitur catboost = Prediction of molecular structure and biological activity relationship of dipeptidyl peptidase-4 inhibitors using deep neural networks with catboost as feature selection method

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

T-pdf

UI - Tesis Membership Universitas Indonesia Library
