Search Results

Found 160407 documents matching the query

Hanif Arkan Audah

Perbandingan Metode Pemeriksa Ejaan antara SymSpell dan Kombinasi Damerau-Levenshtein Distance dengan Struktur Data Trie = A Spell Checker Method Comparison Between SymSpell and a Combination of Damerau-Levenshtein Distance With the Trie Data Structure

"Non-word error merupakan kesalahan ejaan yang menghasilkan kata yang tidak ada dalam kamus. Tujuan dari penelitian ini adalah membandingkan dua metode pemeriksa ejaan non-word error, yaitu SymSpell dan kombinasi Damerau-Levenshtein distance dengan struktur data trie. Kedua metode tersebut melakukan isolated-word error correction terhadap non-word error. Dalam implementasi, SymSpell dibedakan menjadi dua, yaitu weighted dan unweighted. Proses perbandingan metode dimulai dengan penyusunan kamus menggunakan entri kata dari KBBI V yang diperkaya dengan kata-kata tambahan dari Wiktionary. Kamus yang dihasilkan memuat 91.557 kata. Selanjutnya, disusun dataset uji yang dibuat secara sintetis dengan memanfaatkan modifikasi dari candidate generation Peter Norvig. Dataset uji sintetis yang dihasilkan memuat 58.532 kata salah eja. Dilakukan perbandingan antara Weighted SymSpell, Unweighted SymSpell, dan kombinasi Damerau-Levenshtein distance dengan struktur data trie menggunakan dataset uji sintetis tersebut. Perbandingan tersebut mengukur best match accuracy, candidate accuracy, dan run time. Hasil perbandingan menyimpulkan bahwa SymSpell memiliki performa yang lebih baik dibandingkan dengan metode kombinasi Damerau-Levenshtein distance dan struktur data trie karena unggul dari aspek best match accuracy dan run time serta memperoleh candidate accuracy yang setara dengan metode-metode lain. Implementasi SymSpell yang unggul, yaitu Weighted SymSpell memperoleh best match accuracy 66,79%, candidate accuracy 99,33%, dan run time 0,39 ms per kata.

Non-word errors are errors during writing where the resulting word does not exist in the dictionary. The objective is to compare non-word error spell checker methods, which are SymSpell and a combination of Damerau-Levenshtein distance with the trie data structure. Both methods handle non-word errors using isolated-word error correction.
During implementation, SymSpell is divided into two types: weighted and unweighted.
The comparison process starts by compiling a dictionary from word entries in KBBI V and Wiktionary. The resulting dictionary contains 91,557 words. The next step
is to synthetically generate a test dataset using a modified version of Peter Norvig’s candidate generation method. The resulting test dataset contains 58,532 misspellings.
A comparison is made between Weighted SymSpell, Unweighted SymSpell, and a
combination of Damerau-Levenshtein distance with the trie data structure using the synthetic test dataset that was generated. The comparison measures the best match accuracy, candidate accuracy, and run time. The results found that SymSpell performed better than the method that used a combination of Damerau-Levenshtein distance with the trie data structure because it obtained a higher best match accuracy, lower run time, and
an equivalent candidate accuracy compared to the other methods. The best performing
SymSpell implementation is Weighted SymSpell which obtained a best match accuracy of 66.79%, candidate accuracy of 99.33%, and a run time of 0.39 ms per word."
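As an illustration of the distance measure named in the abstract above, here is a minimal Python sketch of the restricted Damerau-Levenshtein distance (the optimal string alignment variant). It is not the thesis's implementation: the function names `osa_distance` and `best_matches`, the `max_distance` parameter, and the sample words are assumptions chosen for the example, and the brute-force scan over the dictionary stands in for the trie or SymSpell lookup the thesis compares.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def best_matches(word, dictionary, max_distance=2):
    """Hypothetical helper: rank dictionary words by distance to `word`."""
    scored = [(osa_distance(word, w), w) for w in dictionary]
    return sorted((dist, w) for dist, w in scored if dist <= max_distance)
```

For example, `best_matches("mkaan", ["makan", "minum"])` keeps only `"makan"`, because a single adjacent transposition separates the two words.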

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Muhamad Sean

Analisis peformansi fungsi pengoreksi kesalahan penulisan kata pada kalimat dengan menggunakan Algoritma Jaro Winkler distance pada Simple-O = Performance analysis of spelling error correction function using Jaro Winkler distance algorithm in Simple-O

"Pada skripsi ini akan dibahas bagaimana pengujian dan analisis performansi dari program pengoreksi kesalahan penulisan kata pada kalimat dengan menggunakan algoritma jaro winkler distance pada SIMPLE-O. Di dalam penggunaan word processor sering ditemukan kesalahan pengetikan suatu kata. Kesalahan yang terjadi adalah ketika kata yang diketik memiliki struktur huruf yang salah sehingga mengakibatkan kata yang diketik tidak mengandung arti yang sebenarnya. Kesalahan penulisan tersebut dapat berakibat berkurangnya nilai akhir yang didapatkan jika dilakukan penilaian dengan menggunakan program penilai otomatis SIMPLE-O. Algoritma Jaro Winkler Distance merupakan sebuah variant algoritma Jaro Distance metric yang bisa digunakan untuk menganalisa kesamaan antara dua string. Dengan menggunakan algoritma tersebut, sebuah string dapat dicari kemiripan struktur hurufnya dengan membandingkannya dengan string yang lain. Algoritma Jaro Winkler Distance melakukan beberapa tahapan proses dalam mencari kesamaan antara dua string yaitu menghitung panjang string, mencari jumlah huruf yang sama dan menentukan jumlah transposisi. Presentase keefektifan program pengoreksi kesalahan penulisan kata dalam mengoreksi kata yang salah pada jawaban user adalah sebesar 83,63 %.

This thesis will mainly discuss about testing and analyzing the performance of word error correction system in sentences by using jaro winkler distance algorithm in SIMPLE-O. In applying word processor, we often find an error of typing. The error occurred when the word which is typed has no meaning. This typing error could decrease the obtained final result if it is done by using SIMPLE-O automatic grader system. The algorithm of Jaro Winkler distance is a variant of Jaro distance algorithm metrics which can be used to analyze the similarity between two strings. By using the algorithm, a string structural similarity can be found by comparing it with another string. Jaro Winkler distance algorithm performs several stages in the process of finding the similarities between two strings those are calculate the string length, search the same number of letters and determine the amount of transposition. The effectiveness precentage of typo correction program in correction the user’s answer mistakes is 83,63%."
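The three stages listed in the abstract above (string lengths, matching letters, transpositions) can be sketched in Python as follows. This follows the commonly published definition of Jaro-Winkler similarity, not the SIMPLE-O code, which is not shown in this record; the function names and the defaults `p=0.1` and `max_prefix=4` are assumptions taken from the usual formulation.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)  # stage 1: string lengths
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    # Stage 2: count letters that match within the search window.
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Stage 3: matched letters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1, s2):
        if c1 != c2 or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)
```

A corrector built on this would score a misspelled word against each dictionary entry and suggest the entry with the highest similarity.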

Depok: Fakultas Teknik Universitas Indonesia, 2015

S60072

UI - Skripsi Membership Universitas Indonesia Library

Poynter, Dan

Word processors and information processing : a basic manual on what they are and how to buy / Dan Poynter

Santa Barbara: Para Publishing, 1982

652 POY w

Buku Teks SO Universitas Indonesia Library

Mukhlizar Nirwan Samsuri

Perbandingan Penggunaan Kamus Terdistribusi, Partition Around Medoids (PAM) dan Struktur Data Trie dalam Perbaikan Ejaan Otomatis Pada Teks Formal Bahasa Indonesia = A Comparison of Distributed, PAM, and Trie Data Structure Dictionaries in Automatic Spelling Correction for Indonesian Formal Text

"Kesalahan ejaan dapat dibagi menjadi dua jenis, non-word errors dan real-word errors. Non-word errors adalah kesalahan eja yang tidak terdapat dalam kamus, sedangkan real-word errors adalah kata yang terdapat pada kamus tetapi berada pada tempat yang tidak tepat pada kalimat. penelitian ini berfokus pada koreksi ejaan untuk non-word errors pada teks formal Bahasa Indonesia. Tujuan dari penelitian ini adalah untuk membandingkan efektivitas tiga jenis struktur kamus untuk koreksi ejaan, antara lain kamus terdistribusi, kamus PAM (Partition Around Medoids), dan kamus menggunakan struktur data trie. Ketiga jenis kamus juga akan dibandingkan dengan kamus sederhana yang dijadikan sebagai baseline. Tahap pengurutan kandidat (ranking correction candidates) dilakukan dengan menggunakan dua variasi dari edit distance, yaitu Levenshtein dan Damerau-Levenshtein dan n-gram. Guna mendukung penelitian ini, dibangun dataset gold standard dari 200 kalimat yang terdiri dari 4.323 token dengan 288 di antaranya adalah non-word errors. Berdasarkan kombinasi tipe kamus dan edit distance, didapatkan hasil bahwa struktur data trie dengan Damerau-Levenshtein distance memperoleh accuracy terbaik untuk menghasilkan kandidat koreksi, yaitu 95,89% dalam 45,31 detik. Selanjutnya, kombinasi struktur data trie dengan Damerau-Levenshtein distance juga mendapatkan accuracy terbaik dalam memilih kandidat terbaik, yaitu 73,15%.

Spelling errors can be divided into two groups: non-word and real-word. A non-word error is a spelling error that does not exist in the dictionary, while a real-word error is a real word but not on the right place. In this work, we address the non-word errors in spelling correction for Indonesian formal text. The objective of our work is to compare the effectiveness of three kinds of dictionary structure for spelling correction, distributed dictionary, PAM (Partition Around Medoids) dictionary, and dictionary using trie data structure, with the baseline of a simple flat dictionary. We conducted experiments with two variations of edit distances, i.e. Levenshtein and Damerau-Levenshtein, and utilized n-grams for ranking correction candidates. We also build a gold standard of 200 sentences that consists of 4,323 tokens with 288 of them are non-word errors. Among the various combinations of dictionary type and edit distance, the trie data structure with Damerau-Levenshtein distance gets the best accuracy to produce candidate correction, i.e. 95.89% in 45.31 seconds. Furthermore, the combination of trie data structure with Damerau-Levenshtein distance also gets the best accuracy in choosing the best candidate, i.e. 73.15%."
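To show how a trie dictionary can generate correction candidates, the following Python sketch walks the trie while carrying one row of the edit-distance table per node, so that shared prefixes are computed only once. For brevity it uses plain Levenshtein distance rather than the Damerau-Levenshtein variant the study found best, and the names `TrieNode`, `build_trie`, and `search` are hypothetical, not taken from the thesis.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # set when a dictionary word ends at this node


def build_trie(words):
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def search(root, query, max_distance):
    """Return sorted (distance, word) pairs within max_distance of query."""
    results = []
    first_row = list(range(len(query) + 1))
    for ch, child in root.children.items():
        _walk(child, ch, query, first_row, max_distance, results)
    return sorted(results)


def _walk(node, ch, query, prev_row, max_distance, results):
    # Extend the Levenshtein table by one row for the trie character `ch`.
    row = [prev_row[0] + 1]
    for j in range(1, len(query) + 1):
        insert_cost = row[j - 1] + 1
        delete_cost = prev_row[j] + 1
        replace_cost = prev_row[j - 1] + (query[j - 1] != ch)
        row.append(min(insert_cost, delete_cost, replace_cost))
    if node.word is not None and row[-1] <= max_distance:
        results.append((row[-1], node.word))
    # Prune: if every cell exceeds the limit, no descendant can qualify.
    if min(row) <= max_distance:
        for next_ch, child in node.children.items():
            _walk(child, next_ch, query, row, max_distance, results)
```

The pruning step is what lets a trie skip whole branches that a flat dictionary scan would have to check word by word.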

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

D. Edi Subroto

Transposisi dari adjektive menjadi verba dan sebaliknya dalam bahasa Jawa

"ABSTRACT

This thesis tries to describe the transposition from adjectives to verbs and vice versa in standard Javanese. In this case, each of the major word-classes (substantive, verb, adjective) is determined primarily according to a set of its morphological features, which differ in all respects from those of the others. In addition to those features, a set of its syntactical valences is also identified. Adjectives and verbs in Javanese are two different word-classes. Each of them is a word-class system which covers a set of morphological categories, i.e. a series of words with identical formal features corresponding to identical semantic features, which differ in all respects from each other.

The verbal system is divided into two classes (class I and class II). Morphologically, class I is characterized by the di-D category (passive), which contrasts with the N-D category (active-transitive), whereas class II is not, although it has the N-D category (intransitive). Structurally, there are some important differences between the two classes caused by this principal difference. Each of the classes is also separated into two parts (part A and part B). Morphologically, part B is characterized by two specific categories: maq-D 'to do D suddenly' and patin-D 'plural subject involved in doing something that varies in rhythm and intensity', and semantically it is characterized by "emotive-expressive" or "onomatopoeic" semantic values, whereas part A is not. The object studied in this thesis is the verbal morphological procédés, whether productive or improductive, which transpose adjectives in the monomorphemic category into verbs (which may be called "deadjectival verbal categories"), and the adjectival morphological procédés which transpose verbs into adjectives (which may be called "deverbal adjectival categories"). Based on the data, we know that the greater part of the monomorphemic adjectives can be transposed into verbs of class II A (none into class II B) and only some of them can be transposed into class I A (none into class I B). Most of the transpositional categories in verb II A are productive; their formal forms are: N-D-i, N-D-ake; ke-D-an; di-D-i, di-D-ake; ka-D-an, ka-D-ake; -in-D-an, -in-D-ake; taq-D-i, taq-D-ake; taq-D-ane, taq-D-ne; koq-D-i, koq-D-ake; D-ana, D-na; D-I?, _D-ke?, D-in-D-an, D-in-D-ake; D-D--an; but there are some other categories which are improductive. On the other hand, all of the transpositional categories in verb I A are improductive. There are only three adjectival procédés (-an, ke-en, -em-//-um-) which transpose verbs into adjectives. All of the transpositional categories of the adjective are improductive.
In this thesis, we also know that a certain word-class system is not totally transposed into the other."

1985

D123

UI - Disertasi Membership Universitas Indonesia Library

Geraard Jonathan Raf

Implementasi dan evaluasi aplikasi human activity recognition berbasis smartphone dengan menggunakan algoritma LSTM = Implementing and evaluating human activity recognition application smartphone-based using LSTM algorithm

"ABSTRAK

Human Activity Recognition merupakan sebuah teknologi yang penting karena dapat diimplementasikan dalam berbagai kebutuhan manusia sehari-hari, seperti mengenai kesehatan manusia. Tujuan dari Human Activity Recognition adalah untuk mengidentifikasi aktivitas manusia yang umum, dimana data yang diterima dapat diteliti lebih lanjut. Seiring perkembangan teknologi, keberadaan komputer dan smartphone sudah tidak dapat dipisahkan lagi dalam kehidupan dan aktivitas manusia. Perkembangan teknologi ini membuat sebuah smartphone dapat memiliki berbagai jenis sensor. Sensor-sensor yang terdapat pada smartphone dapat digunakan untuk melakukan Human Activity Recognition dengan mudah. Contoh sensor pada smartphone yang dapat digunakan untuk melakukan Human Activity Recognition adalah sensor accelerometer untuk mengukur perpindahan. Penelitian ini membuat sebuah aplikasi berbasis Android untuk membaca input dari sensor, diolah dengan library neural network Long Short-Term Memory, lalu menghasilkan output yang sesuai. Hasil output yang dimaksud adalah kondisi dari aktivitas manusia yang diteliti, yaitu kondisi berdiri, berjalan, berlari, duduk, menaiki tangga, dan menuruni tangga.

ABSTRACT

Human Activity Recognition is an important technology because it can be implemented to many human problems, such as healthcare. The main purpose for Human Activity Recognition is to recognize common, simple human activities, where the data received can be researched further. With the development of technology these days, the presence of computer and smartphone cant be removed from daily human activities. This technology development made a smartphone that has been integrated with all kind of sensors. An example of sensor that can be used to do a Human Activity Recognition are accelerometer to measure movement. This research made an Android-based application that will read input from these sensors, processed by neural network Long Short-Term Memor y library, and finally produced the intended output. The outputs are the current activity of user thats been researched on, such as standing, walking, running, sitting, walking upstair, or walking downstair."

2019

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Hult, Christine

A writer's introduction to word processing / Jeanette Harris

California: Wadsworth, 1987

652.5 HUL w

Buku Teks SO Universitas Indonesia Library

Andi Yusuf

Perancangan dan implementasi Algoritma Dynamic Time Warp pada field Programmable Gate Array sebagai modul Feature Matching

"Pengenalan ucapan atau disebut juga speech recognition adalah suatu pengembangan teknik dan sistem yang memungkinkan perangkat system untuk menerima masukan berupa kata yang diucapkan. Teknologi ini memungkinkan suatu perangkat untuk mengenali kata yang diucapkan dengan cara merubah kata tersebut menjadi sinyal digital dan mencocokkan dengan suatu pola tertentu yang tersimpan dalam suatu perangkat. Pola tertentu yang tersimpan pada suatu perangkat sebenarnya sampel kata yang diucapkan pengguna. Salah satu algoritma yang digunakan sebagai pemodelan dasar untuk pengenalan ucapan adalah Dynamic Time Warping (DTW). DTW digunakan sebagai algoritma untuk mencocokkan pola yang dimaksud dengan mengukur dua buah sekuensial pola dalam waktu yang berbeda[7].

Dalam penelitian ini akan dibahas mengenai perancangan IC pattern matching menggunakan algoritma DTW dan diimplementasikan pada sebuah Field Programmable Gate Array (FPGA). Algoritma DTW yang digunakan merupakan pengembangan dari algoritma standar yaitu FastDTW[13]. Perancangan difokuskan pada pembuatan layout Complementary Metal Oxide Silicon (CMOS) pada skala 0,18μm dengan metode semi custom. Layout ang terbentuk baik layout untuk IC DTW maupun layout - layout gerbang logika dasar penyusun IC tersebut, dapat dilihat behavior-nya. Dengan menggunakan Computer Aided Design (CAD) Electric behavior dapat diterjemahkan dalam bahasa hardware yang dikenal dengan Very High Speed Integrated Circuit Hardware Description Language (VHSIC HDL atau VHDL). Proses verifikasi dilakukan dengan membuat prototype perangkat keras menggunakan rangkaian ADC dan FPGA Spartan-IIELC yang telah diimplementasikan VHDL dari IC DTW.

Speech recognition is also called a development of techniques and systems that enable the device system to receive input of the spoken word. This technology allowsa device to recognize words spoken in a way to change the word into a digital signal and the match with a particular pattern stored in a device. Certain patterns that are stored on a device is a spoken word sample of users. One algorithm used as a basis for modeling of speech recognition is the Dynamic Time Warping (DTW). DTW is used as an algorithm to match the pattern in question by measuring two sequential patterns in different time [7].
In this research will be discussed regarding the design of the IC pattern matching using DTW algorithm and implemented on a Field Programmable Gate Array (FPGA). DTW algorithm used is the development of a standard algorithm that is FastDTW [13]. The design focused on making the layout of Complementary Metal Oxide Silicon (CMOS) on a scale of 0.18 μm with a method of semi-custom. Formed a good layout for IC DTW and layout of the basic logic gate, we can see his behavior. By using Computer Aided Design (CAD) Electric, behavior can be translated in hardware language, known as Very High Speed Integrated Circuit Hardware Description Language (VHSIC HDL or VHDL). The verification process is done by making a prototype hardware uses a circuit of ADC and the FPGA Spartan-IIELC that have been implemented VHDL from IC DTW."
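The pattern matching that DTW performs can be illustrated with a short Python sketch of the classic dynamic-programming formulation. This is the standard quadratic-time algorithm, not the FastDTW approximation or the VHDL design described in the abstract; the function name `dtw_distance` and the use of absolute difference as the local cost are assumptions made for the example.

```python
def dtw_distance(x, y):
    """Classic DTW cost between two numeric sequences."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because a sample may be aligned with several samples of the other sequence, the same word spoken at a different speed can still yield a low cost, for instance `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero.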

Depok: Fakultas Teknik Universitas Indonesia, 2011

T29927

UI - Tesis Open Universitas Indonesia Library

Pembentukan kata dan pemilihan kata dalam bahasa jawa

Jakarta: Pusat Bahasa, Departemen Pendidikan Nasional, 2004

499.222 IND p

Buku Teks SO Universitas Indonesia Library

Holzner, Steven

Google docs: membagi dan mengubah arsip kerja anda secara online

Jakarta: Ufuk Publishing Group, 2009

005.5 HOL gt

Buku Teks SO Universitas Indonesia Library
