Search Results

Found 8179 documents matching the query

Information retrieval of text document with weighting tf-idf and lcs

"Text document retrieval requires a method that can return documents with high relevance to the user's request. One important stage in text representation is the weighting process. Using LCS to adjust Tf-Idf weights takes into account word sequences that appear in the same order in both the query and the document text. However, very long but irrelevant documents can produce weights that fail to represent document relevance. This research proposes an LCS-based method that weights word order while considering each document's length relative to the average document length in the corpus. The method is able to retrieve text documents effectively. Adding the word-order feature, normalized by the ratio of document length to the documents in the corpus as a whole, yields precision and recall values as good as those of the method of Tasi et al."

Surabaya: Institut Teknologi Sepuluh Nopember Surabaya, Faculty of Information Technology, Department of Informatics Engineering, 2013

AJ-Pdf

Journal Article Universitas Indonesia Library

Manning, Christopher D.

Introduction to information retrieval

Cambridge, UK: Cambridge University Press, 2008

025.04 MAN i

Textbook SO Universitas Indonesia Library

Khadijah Fahmi Hayati Holle

Preference based term weighting for arabic fiqh document ranking

"In document retrieval, besides how well the search results match the query, a subjective user assessment is also expected to be a deciding factor in document ranking. This preference aspect is evident in searches for fiqh documents: people tend to prefer a certain fiqh methodology without rejecting the others. Preference therefore needs to be considered alongside relevance in document ranking. This research proposes a preference-based term weighting method that ranks documents according to user preference. The proposed method is combined with term weighting based on document and book indexes so that it takes into account both relevance and preference. The proposed weighting is called Inverse Preference Frequency with α value (IPFα). In this method, the preference value of each term is calculated using IPF term weighting; the preference values of document terms that match query terms are then multiplied by α as a booster. Combining IPFα with the existing weighting methods gives TF.IDF.IBF.IPFα. The method was tested on a dataset of several Arabic fiqh documents and evaluated using recall, precision, and F-measure. The results show that TF.IDF.IBF.IPFα weighting ranks documents in the right order according to user preference, with maximum values of 75% recall, 100% precision, and 85.7% F-measure."

Surabaya: Institut Teknologi Sepuluh Nopember Surabaya, Faculty of Information Technology, Department of Informatics Engineering, 2015

AJ-Pdf

Journal Article Universitas Indonesia Library

Ika Mailani

Perancangan sistem informasi pengelolaan dokumen elektronik: studi kasus Sekretariat Wakil Presiden = Electronic document management information system design: a case study of the Vice President Secretariat of Republic Indonesia

"The main duty of the Vice President Secretariat of the Republic of Indonesia is to provide technical, administrative, and analytical support to the Vice President, especially in formulating policy recommendations in economic, social, and governance affairs. To produce sharp, high-quality policy analysis, the organization needs to process all the documents required, especially analyses prepared previously. In practice, however, the Secretariat's employees who carry out policy analysis have difficulty searching for these documents. This condition received attention at the 2016 Year-End Evaluation Meeting, which identified it as one of the obstacles the organization faces in serving the Vice President.

The recommendation put forward was to provide a database containing data from all substantive deputy offices. The database serves as a source of data and documentation for those offices in carrying out their main task of providing policy analysis support. This need is to be accommodated by the Electronic Document Management Information System. The system is developed using a modified waterfall methodology whose stages are planning, analysis, and design, using Unified Modeling Language (UML).

This research produces a prototype of the Electronic Document Management Information System at the Vice President Secretariat with the following features: search; document management (download, add, edit, delete, and metadata display); document submission with a history of tiered report preparation, from analyst through head of subdivision (kasubbid), head of subsection (kasubbag), and assistant deputy (asdep) up to deputy; an administrator feature for managing users and dashboard data; and a dashboard showing a recap of documents along with their status."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2018

TA-Pdf

UI - Final Project Universitas Indonesia Library

Mohammad Aufar Sadikin

Akses informasi publik di Kementerian Pertanian Republik Indonesia = Public information access in the Ministry of Agriculture of the Republic of Indonesia

"This study discusses access to public information at the Ministry of Agriculture of the Republic of Indonesia. It aims to describe the existing information access, from how access is made available to the quality of that access. The research takes a qualitative approach using the case study method. The findings indicate that public information access at the Ministry is not yet optimal, because the available means of access lack clear connections to one another."

Depok: Fakultas Ilmu Pengetahuan Budaya Universitas Indonesia, 2016

S-Pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library

Rifqi Wazirsyah

Klasifikasi Performa Mahasiswa Berdasarkan Data Teks Forum Diskusi Online Menggunakan Multinomial Naive Bayes dengan Vektorisasi Pembobotan TF-IDF = Classification of Student Performance based on Text Data of Online Discussion Forums using Multinomial Naive Bayes with TF-IDF Weighting Vectorization

"E-Learning Management System (EMAS) is an application created by Universitas Indonesia with various features, one of which is an online discussion forum. In the forum, students can write text posts in order to hold discussions. These text posts play an important role in improving student performance, particularly with regard to graduation. In this final project, Multinomial Naive Bayes (MNB) is used to classify student performance based on text posts in the online discussion forum. Before classification, the posts are preprocessed and their words are weighted using TF-IDF. The TF-IDF results are expressed as vectors, a process called vectorization. The TF-IDF vectorized data comprise 228 documents, of which 219 belong to graduating students and 9 to non-graduating students. The data are dominated by graduating students and are therefore imbalanced, so SMOTE is applied to balance them. The MNB model is then implemented on three training/testing splits, 70%:30%, 80%:20%, and 90%:10%, by training the model on the training data and testing it on the testing data to obtain the performance classification. The implementation was repeated over five trials. The MNB model classifies student performance well, with the best results on the 30% testing data: an average accuracy of 0.956, an average recall of 0.979, and an average F1-score of 0.977. The best average precision, however, was obtained on the 20% testing data, at 0.977."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2022

S-pdf

UI - Undergraduate Thesis Membership Universitas Indonesia Library

Camera-based document analysis and recognition : 4th International Workshop, CBDAR 2011, Beijing, China, September 22, 2011, revised selected papers

"This book constitutes the thoroughly refereed post-workshop proceedings of the 4th International Workshop on Camera-Based Document Analysis and Recognition, CBDAR 2011, held in Beijing, China, in September 2011. The 13 revised full papers presented were carefully selected during a second round of reviewing and improvement from numerous original submissions. Intended to give a snapshot of the state-of-the-art research in the field of camera-based document analysis and recognition, the papers are organized in topical sections on text detection and recognition in scene images, camera-based systems, and datasets and evaluation."

Berlin: Springer-Verlag, 2012

e20406345

eBooks Universitas Indonesia Library

Liliana Calderon-Benavides, editor

String processing and information retrieval: 19th International Symposium, SPIRE 2012, Cartagena de Indias, Colombia, October 21-25, 2012 : proceedings

"This book constitutes the refereed proceedings of the 19th International Symposium on String Processing and Information Retrieval, SPIRE 2012, held in Cartagena de Indias, Colombia, in October 2012. The 26 full papers, 13 short papers, and 3 keynote speeches were carefully reviewed and selected from 81 submissions. The following topics are covered: fundamental algorithms in string processing and information retrieval, and SP and IR techniques as applied to areas such as computational biology, DNA sequencing, and Web mining."

Berlin: Springer, 2012

e20407281

eBooks Universitas Indonesia Library

Tucker, Allen B.

Text processing: algorithms, languages, and applications

New York: Academic Press, 1979

410 TUC t

Textbook SO Universitas Indonesia Library

Information Retrieval

"Machine Learning (ML) algorithms have opened up new possibilities for the acquisition and processing of documents in Information Retrieval (IR) systems. Indeed, it is now possible to automate several labor-intensive tasks related to documents such as categorization and entity extraction. Consequently, the application of machine learning techniques for various large-scale IR tasks has gathered significant research interest in both the ML and IR communities. This tutorial provides a reference summary of our research in applying machine learning techniques to diverse tasks in Digital Libraries (DL). Digital library portals are specialized IR systems that work on collections of documents related to particular domains. We focus on open-access, scientific digital libraries such as CiteSeerX, which involve several crawling, ranking, content analysis, and metadata extraction tasks. We elaborate on the challenges involved in these tasks and highlight how machine learning methods can successfully address these challenges."

Switzerland: Springer International Publishing, 2015

e20528522

eBooks Universitas Indonesia Library
