Search Results

Found 2 documents matching the query

Putri Maharani Ardiasih

Konstruksi model Zero-Inflated Ordered Probit untuk prediksi tingkat keparahan cedera akibat kecelakaan lalu lintas = Construction of Zero-Inflated Ordered Probit model for predicting injury severity due to traffic accident

Regression analysis is a statistical method for determining the relationship between a response variable and one or more predictor variables. The most commonly used regression model is the linear regression model, which assumes a normally distributed response variable. The Generalized Linear Model (GLM) is an extension of the linear regression model. One component of a GLM is the link function, which connects the response variable to the linear predictor; the choice of link function depends on the type of response variable. For a categorical response variable with no ordering (nominal), one GLM that can be used is a model with a probit link function. This model, however, cannot be used for a categorical response variable with ordering (ordinal), so the ordered probit model was developed to analyze ordinal response variables with a probit link function. When category 0 has more observations than the other categories, the response data are zero-inflated. This can cause overdispersion, which leads to misinterpretation of the model and to incorrect conclusions. An extension of the ordered probit model, the zero-inflated ordered probit model, is therefore needed. The parameters of the zero-inflated ordered probit model are estimated by maximum likelihood. The model is applied to predict the severity of injuries resulting from traffic accidents, based on traffic accident data. The ordinal response variable is injury severity, which has three categories: slight injury (no serious injury), serious injury, and fatal injury. In these data, the slight-injury category (0) has more observations than the other categories, so zero-inflation occurs. The predictor variables are the number of vehicles involved and the number of fatalities. The study concludes that the best model is the zero-inflated ordered probit model with the number of vehicles involved as the inflation variable and the number of fatalities as the predictor variable. This model gives predictions that match the actual values with an accuracy of 85.36%, and its AIC is 2614.282.
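
The abstract names the model and its estimation method but not its formulas. The sketch below shows one common way to write the category probabilities of a zero-inflated ordered probit model, together with the log-likelihood that maximum likelihood would maximize and the AIC computed from it. It assumes the textbook form with independent errors in the inflation (participation) and ordered probit parts; the function names, the parameter values, and the four data rows are hypothetical and are not taken from the thesis.

```python
import math


def norm_cdf(t):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def ziop_probs(z_gamma, x_beta, cutpoints):
    """Category probabilities for one observation (assumed independent errors).

    z_gamma   : linear predictor of the inflation (participation) equation
    x_beta    : linear predictor of the ordered probit equation
    cutpoints : increasing thresholds; J cutpoints give J + 1 categories
    """
    p_participate = norm_cdf(z_gamma)
    # Cumulative ordered probit probabilities at each threshold, ending at 1.
    cumulative = [norm_cdf(c - x_beta) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(p_participate * (value - previous))
        previous = value
    # Category 0 also collects the non-participants (the "excess" zeros).
    probs[0] += 1.0 - p_participate
    return probs


def log_likelihood(rows, gamma, beta, cutpoints):
    """Sum of log P(y_i) over rows of (vehicles, fatalities, severity)."""
    total = 0.0
    for vehicles, fatalities, severity in rows:
        probs = ziop_probs(gamma[0] + gamma[1] * vehicles,
                           beta * fatalities,
                           cutpoints)
        total += math.log(probs[severity])
    return total


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_lik


# Made-up rows and parameter values, for illustration only.
rows = [(1, 1, 0), (2, 1, 0), (2, 3, 1), (3, 4, 2)]
ll = log_likelihood(rows, gamma=(-0.5, 0.4), beta=0.3, cutpoints=(0.8, 1.6))
print(ziop_probs(0.0, 0.0, (0.0, 1.0)))
print(ll, aic(ll, n_params=5))
```

In practice the log-likelihood would be passed to a numerical optimizer to obtain the maximum likelihood estimates; the five parameters counted here are the two inflation coefficients, the single slope, and the two cutpoints.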

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Ivana Ratanaputri

Model Regresi Bayesian Zero-Inflated Bernoulli = Bayesian Zero-Inflated Bernoulli Regression Model

Binary data have exactly two possible values, such as success and failure or yes and no, usually coded as 0 and 1. Binary data are common in everyday life, and it is not unusual for them to be zero-inflated. Zero-inflation refers to data with two distinct sources of zeros, known as structural zeros and sampling zeros. The Zero-Inflated Bernoulli regression model was therefore developed as an alternative for modeling zero-inflated binary data. Statistical inference generally follows one of two approaches, frequentist or Bayesian. This thesis constructs a Zero-Inflated Bernoulli regression model using the Bayesian approach, which is considered superior here because, for zero-inflated data, the frequentist approach cannot distinguish structural zeros from sampling zeros. The resulting model is called the Bayesian Zero-Inflated Bernoulli regression model. A key step in the Bayesian approach is obtaining the posterior distribution. The parameters of the posterior distribution are often difficult to find analytically because the posterior has no closed form. In this thesis, parameter estimation and posterior sampling are therefore carried out computationally with the No-U-Turn Sampler (NUTS) algorithm. The model is then applied to a classification problem on sickness presenteeism data. Two Bayesian Zero-Inflated Bernoulli regression models are built: one without covariates and one with covariates. From the model without covariates, the estimated parameters of the response variable distribution are p1 = 0.38 and p2 = 0.75, and the estimated probabilities are close to their empirical values. The model with covariates uses two covariates for two different parts: general state of health (gh) for the whole sample, and the frequency of fearing being replaced when absent from work (remplz) for the at-risk sample. The estimated regression parameters give a regression equation that can be used to classify the response variable, attending work while sick (sp recod). For the models with covariate gh and covariate remplz, respectively, accuracy is 72.44% and 69.58%, sensitivity is 14.65% and 100.00%, and specificity is 94.35% and 0.00%.
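
As a rough illustration of the quantities mentioned in this abstract, the sketch below writes a zero-inflated Bernoulli response as a two-part mixture and computes accuracy, sensitivity, and specificity from predicted labels. It assumes the usual mixture form, in which an observation is at risk with some probability and, if at risk, equals 1 with a second probability. The abstract does not say how its p1 and p2 map onto these two probabilities, so the names below are hypothetical, and the NUTS sampling step is not reproduced.

```python
import math


def zib_prob_one(p_at_risk, p_success):
    """P(Y = 1) under the assumed mixture: at risk AND success."""
    return p_at_risk * p_success


def zib_log_likelihood(ys, p_at_risk, p_success):
    """Log-likelihood of 0/1 responses for a model without covariates.

    A zero is either structural (not at risk) or a sampling zero
    (at risk but no success); both are folded into 1 - P(Y = 1).
    """
    prob_one = zib_prob_one(p_at_risk, p_success)
    return sum(math.log(prob_one) if y == 1 else math.log(1.0 - prob_one)
               for y in ys)


def classification_rates(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Made-up labels: a rule that predicts 1 for every case.
print(classification_rates([1, 1, 0], [1, 1, 1]))
```

A rule that labels every case as 1 has 100% sensitivity and 0% specificity, and its accuracy equals the share of true 1s in the sample. The 100.00% and 0.00% pair reported for the remplz model is consistent with that pattern.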

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
