UI - Tesis Membership

Transfer Learning pada Sistem Closed Domain Question Answering Metadata Statistik = Transfer Learning in Closed Domain Question Answering System of Statistical Metadata

Nur Rachmawati; Evi Yulianti, supervisor; Fariz Darari, examiner; Erdefi Rakun, examiner; Adila Alfa Krisnadhi, examiner (Fakultas Ilmu Komputer Universitas Indonesia, 2024)

 Abstract

Statistical metadata plays a very important role in society. With statistical metadata, we can find information about all the statistical activities that have been carried out. In this research we build a Closed Domain Question Answering (CDQA) system for statistical metadata (CDQA-Metadata Statistik). The system is built using transfer learning on human question and automatic question data. Transfer learning is used because no large benchmark for statistical metadata exists yet. We use a retriever (BM25)-reader (IndoBERT) architecture based on transfer learning and conduct three main experiments. The first experiment shows that, on human question data, the two-stage fine-tuning (human) model, which uses transfer learning, outperforms the non-transfer learning model with very high statistical significance, improving exact match 53-fold and F1-score 9-fold. On automatic question data, the two-stage fine-tuning (automatic) model, which also uses transfer learning, statistically significantly outperforms the non-transfer learning model, improving exact match 80-fold and F1-score 13-fold. The second experiment shows that the transfer learning-based CDQA-Metadata Statistik system performs statistically significantly better on automatic question data than on human question data. This may be because automatic question data have more term overlap than human question data. The third experiment shows that, for human question data, adding automatic question data during fine-tuning does not improve the performance of CDQA-Metadata Statistik. Likewise, for automatic question data, adding human question data during fine-tuning does not improve its performance.
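The abstract reports its results as exact match and F1-score. As a rough illustration of how these two metrics are commonly computed for extractive question answering, here is a minimal sketch that assumes the widely used SQuAD-style definitions (case and punctuation normalization, token-level overlap). The record does not state which normalization the thesis applies, so the `normalize` helper and the sample strings below are hypothetical.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical answer strings, not taken from the thesis data.
    print(exact_match("Badan Pusat Statistik", "badan pusat statistik."))
    print(f1_score("survei sosial ekonomi nasional", "survei sosial ekonomi"))
```

Exact match rewards only fully correct spans, while F1 gives partial credit for overlapping tokens, which is one plausible reason the reported exact match gains (53-fold, 80-fold) are larger than the F1 gains (9-fold, 13-fold).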

 Digital File: 1

 T-Nur Rachmawati.pdf

 Metadata

Collection Type : UI - Tesis Membership
Call No. : T-pdf
Main Entry-Personal Name :
Added Entry-Personal Name :
Added Entry-Corporate Name :
Study Program :
Subject :
Publication : Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2024
Language : ind
Cataloging Source : LibUI ind rda
Content Type : text
Media Type : computer
Carrier Type : online resource
Physical Description : xiv, 142 pages : illustration + appendix
Abridged Manuscript :
Owning Institution : Universitas Indonesia
Location : Perpustakaan UI
Call No.    Barcode No.       Availability
T-pdf       15-24-75783690    AVAILABLE