Latar Belakang: Seiring berjalannya pandemi COVID-19, diperlukan tes diagnostik yang lebih baik, cepat, andal, mudah dan tersedia secara luas. Foto rontgen dada digunakan sebagai pemeriksaan awal untuk menegakkan diagnosis kerja. Kecanggihan Artificial Intelligence (AI) diketahui dapat meningkatkan presisi diagnosis pneumonia pada foto rontgen dada. Salah satu program AI yang sedang marak digunakan adalah CAD4COVID-Xray. Tujuan: Penelitian ini bertujuan untuk mengetahui dan melihat perbedaan performa skoring AI dibanding skoring Brixia pada foto rontgen dada untuk mendiagnosis dan menentukan derajat keparahan pneumonia COVID-19. Metode: Penelitian ini menggunakan desain potong-lintang pada 300 pasien terduga dan terkonfirmasi pneumonia COVID-19. Rontgen dada dinilai secara kuantitatif menggunakan program CAD4COVID dan semi-kuantitatif menggunakan sistem skoring Brixia. Analisis performa diagnostik dinilai menggunakan estimasi AUC dan perbandingannya, serta perbandingan nilai sensitivitas, spesifisitas, nilai prediksi positif, nilai prediksi negatif dan akurasi. Hasil: AI probability score (AUC 0,542, IK95% 0,471-0,613), AI ALA score (AUC 0,442, IK95% 0,375-0,510) dan overall CXR score (AUC 0,461, IK95% 0,393-0,528) tidak memiliki kemampuan diskriminasi hasil RT-PCR SARS-CoV-2 pada subjek terduga COVID-19. AI probability score (AUC 0,888, IK95% 0,820-0,956), AI ALA score (AUC 0,875, IK95% 0,789-0,953) dan overall CXR score (AUC 0,878, IK95% 0,808-0,948) memiliki kemampuan diskriminasi sangat baik untuk menentukan derajat keparahan penyakit subjek terkonfirmasi COVID-19. AI probability score (Sn 87,2%, Acc 85,6%) dan AI ALA score (Sn 82,6%, Acc 80,4%) lebih sensitif dan akurat dibandingkan overall CXR score (Sn 75,6%, Acc 78,4%) untuk mendiskriminasi derajat keparahan penyakit pneumonia COVID-19. Simpulan: AI probability score, AI ALA score dan overall CXR score tidak memiliki kemampuan membedakan hasil RT-PCR SARS-CoV-2 pada subjek terduga COVID-19. AI probability score, AI ALA score dan overall CXR score memiliki kemampuan yang sangat baik untuk membedakan derajat keparahan penyakit subjek terkonfirmasi COVID-19. AI probability score dan AI ALA score lebih sensitif dan akurat dibandingkan overall CXR score untuk membedakan derajat keparahan penyakit pneumonia COVID-19.
Background: As the COVID-19 pandemic progresses, better, faster, more reliable, easier and more widely available diagnostic tests are needed. Chest X-rays are used as an initial examination to establish a working diagnosis. Advances in artificial intelligence (AI) are known to improve the precision of pneumonia diagnosis on chest X-rays. One AI program in wide use during the COVID-19 pandemic is CAD4COVID-Xray. Objective: This study aims to assess the performance of an AI scoring system using a colour heat map and compare it with the Brixia scoring system on chest X-rays for diagnosing COVID-19 pneumonia and determining its severity. Methods: This cross-sectional study involved 300 patients with suspected or confirmed COVID-19 pneumonia. Chest X-rays were assessed quantitatively using the CAD4COVID program and semi-quantitatively using the Brixia scoring system. Diagnostic performance was assessed by estimating and comparing AUCs, and by comparing sensitivity, specificity, positive predictive value, negative predictive value and accuracy. Results: The AI probability score (AUC 0.542, 95% CI 0.471-0.613), AI ALA score (AUC 0.442, 95% CI 0.375-0.510) and overall CXR score (AUC 0.461, 95% CI 0.393-0.528) could not discriminate SARS-CoV-2 RT-PCR results in subjects with suspected COVID-19. The AI probability score (AUC 0.888, 95% CI 0.820-0.956), AI ALA score (AUC 0.875, 95% CI 0.789-0.953) and overall CXR score (AUC 0.878, 95% CI 0.808-0.948) showed excellent discriminative ability for disease severity in subjects with confirmed COVID-19. The AI probability score (Sn 87.2%, Acc 85.6%) and AI ALA score (Sn 82.6%, Acc 80.4%) were more sensitive and accurate than the overall CXR score (Sn 75.6%, Acc 78.4%) in discriminating the severity of COVID-19 pneumonia. Conclusions: The AI probability score, AI ALA score and overall CXR score could not discriminate SARS-CoV-2 RT-PCR results in subjects with suspected COVID-19. All three scores showed excellent ability to discriminate disease severity in subjects with confirmed COVID-19. The AI probability score and AI ALA score were more sensitive and accurate than the overall CXR score in discriminating the severity of COVID-19 pneumonia.
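
The performance measures named in the Methods (AUC, sensitivity, specificity, positive and negative predictive value, accuracy) can be illustrated with a minimal Python sketch. It assumes binary reference labels (1 = positive) and a hypothetical cut-off of 0.5 on the score; the function names and the toy data are invented for illustration and are not taken from the study, whose software and cut-offs are not stated here.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV and accuracy from binary labels.

    Assumes both classes occur in y_true and y_pred, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, NOT study data): reference labels and a score in [0, 1].
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed cut-off of 0.5

toy_metrics = diagnostic_metrics(toy_labels, toy_pred)
toy_auc = auc(toy_labels, toy_scores)
print(toy_metrics, toy_auc)
```

An AUC near 0.5, as reported for the RT-PCR comparison, means a score ranks a positive case above a negative one about as often as chance would; values approaching 1, as reported for severity, mean it almost always does.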