Search Results

Found 6 documents matching the query
Kirk, David B., 1960-
Abstract:
This best-selling guide to CUDA and GPU parallel programming has been revised with more parallel programming examples, commonly-used libraries, and explanations of the latest tools. With these improvements, the book retains its concise, intuitive, practical approach based on years of road-testing in the authors' own parallel computing courses. "Programming Massively Parallel Processors: A Hands-on Approach" shows both student and professional alike the basic concepts of parallel programming and GPU architecture. Various techniques for constructing parallel programs are explored in detail. Case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. Updates in this edition include: new coverage of CUDA 4.0, improved performance, enhanced development tools, increased hardware support, and more; increased coverage of related technology OpenCL and new material on algorithm patterns, GPU clusters, host programming, and data parallelism; and two new case studies explore the latest applications of CUDA and GPUs for scientific research and high-performance computing.
Waltham, MA: Morgan Kaufmann, 2013
004.35 KIR p
Textbook  Universitas Indonesia Library
Heru Suhartanto
Abstract:
Molecular dynamics (MD) is one of the processes that require a high-performance computing environment to complete. In preparing virtual screening experiments, MD is an important process, particularly for tropical countries searching for anti-malaria drugs. Such searches have been conducted before, for example by the WISDOM project, which used 1,700 CPUs. Computing infrastructure on that scale is a limitation for a country like Indonesia, which also needs in silico searches for anti-malaria compounds from its medicinal plants. Finding a suitable and affordable computing environment is therefore very important. Our previous work showed that our dedicated cluster with 16 cores performed better than configurations with fewer cores, but that GTX-family GPUs performed much better still. In this study, we extend that experiment to better-specified hardware: a non-dedicated cluster and Tesla GPUs. We used two computing environments. The first is the Barrine HPC Cluster of The University of Queensland, which has 384 compute nodes with 3,144 computing cores. The second is the Delta Future Grid GPU Cluster, which has 16 compute nodes with 192 computing cores, each node equipped with 2 NVIDIA Tesla C2070 GPUs (448 cores). The results show that running the experiment on dedicated computing resources is much better than on non-dedicated ones, and that GPU performance is still much better than that of the cluster.
2015
MK-Pdf
Journal Article  Universitas Indonesia Library
Wiwien Widyastuti
Abstract:
This research trained deep convolutional networks (ConvNets) to classify the handwritten Pallava alphabet. The architecture consists of two convolutional layers, each followed by a max-pooling layer, and two fully connected layers, for a total of 442,602 parameters. The model classified 660 images of handwritten Pallava characters into 33 different classes. To speed up training, a GPU implementation with 384 CUDA cores was used. Two optimization techniques were applied, Stochastic Gradient Descent (SGD) and Adaptive Gradient, each trained for 10, 20, 30, and 40 epochs. The best accuracy, 67.5%, was achieved by the model trained with SGD for 30 epochs.
Yogyakarta: Media Teknika, 2017
620 MT 12:2 (2017)
Journal Article  Universitas Indonesia Library
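The abstract above describes a ConvNet with two convolution and max-pooling stages followed by two fully connected layers, 33 output classes, and a stated parameter total. As a rough illustration of how such a total arises, the sketch below counts trainable parameters layer by layer. The input size, filter counts, kernel size, and hidden width are hypothetical values chosen for the example; the abstract does not give them, so the result is not expected to match the 442,602 figure reported there.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights of a square-kernel convolution plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    """Weights of a fully connected layer plus one bias per output unit."""
    return in_features * out_features + out_features


def conv_output_size(size, kernel_size):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel_size + 1


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


def count_parameters(image_size=32, channels=1, filters=(16, 32),
                     kernel_size=5, hidden_units=128, num_classes=33):
    """Total parameters of conv-pool, conv-pool, dense, dense.

    All default sizes except num_classes=33 are assumptions for illustration.
    """
    total = 0
    size, in_ch = image_size, channels
    for out_ch in filters:
        total += conv2d_params(in_ch, out_ch, kernel_size)
        size = pool_output_size(conv_output_size(size, kernel_size))
        in_ch = out_ch
    flat_features = size * size * in_ch  # flattened input to the first dense layer
    total += dense_params(flat_features, hidden_units)
    total += dense_params(hidden_units, num_classes)
    return total


if __name__ == "__main__":
    print(count_parameters())
```

With these assumed sizes, most parameters sit in the first fully connected layer, which is typical for small networks of this shape.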
Heru Suhartanto
Abstract:
The invention of graphics processing units (GPUs) has significantly improved the speed of the long-running processes used in molecular dynamics (MD) to search for drug candidates to treat diseases such as malaria. Previous work using a single GTX GPU showed considerable improvement compared to GPUs run in a cluster environment. In the current work, AMBER and dual GTX 780 and 970 GPUs were used to run an MD simulation on the Plasmodium falciparum enoyl-acyl carrier protein reductase enzyme; the results showed that performance was improved, particularly for molecules with a large number of atoms using a single GPU.
Depok: Faculty of Engineering, Universitas Indonesia, 2018
UI-IJTECH 9:1 (2018)
Journal Article  Universitas Indonesia Library
Alan Novaldi
Abstract:
An intelligent traffic light system adaptively controls the traffic lights at a road intersection based on traffic density. One way to obtain traffic density is to count vehicles in video from CCTV cameras installed at the intersection, using deep learning. This study parallelized the vehicle counting program with the Multiprocessing module in Python to obtain vehicle counts for each road at the intersection. GPU utilization was then applied to obtain the data in real time from a heavy video-processing computation. CUDA was used as the platform connecting the program to the GPU at a low level, and TensorFlow, which is integrated with CUDA, managed GPU utilization at a high level. Experiments were run to find the best program execution runtime. Parallel computation ran 1.6 times faster than sequential computation. With more optimal GPU utilization, computation ran up to 2 times faster than normal computation. GPU utilization also improved program runtime because the main video-processing computation no longer ran on the CPU. The results were used to build a visualization of the vehicle counting data so that the data can be processed further by the traffic light control system. At the end of the study, GPU performance was profiled using Nvprof and NVIDIA Visual Profiler, tools provided with CUDA. The profiling results indicate that the GPU was not yet fully utilized for computation, as shown by the low compute utilization, average throughput, and kernel concurrency during program execution. The vehicle counting program therefore needs further optimization to make better use of the GPU.
Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2020
S-pdf
UI - Undergraduate Thesis (Membership)  Universitas Indonesia Library
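The thesis abstract above mentions using Python's Multiprocessing module to count vehicles for each road of an intersection in parallel. The sketch below shows one way that pattern could look, with one worker task per road. The per-frame lists of tracked vehicle IDs, the function names, and the sample data are all hypothetical stand-ins for the deep-learning detector output; this is not the thesis's actual code.

```python
from multiprocessing import Pool


def count_vehicles(road_frames):
    """Count distinct vehicles seen on one road.

    road_frames is a (road_name, frames) pair, where each frame is a list of
    tracked vehicle IDs (a hypothetical stand-in for detector output).
    """
    road_name, frames = road_frames
    seen = set()
    for frame in frames:
        seen.update(frame)
    return road_name, len(seen)


def count_sequential(roads):
    """Process every road one after another in the current process."""
    return dict(count_vehicles(item) for item in roads.items())


def count_parallel(roads, processes=None):
    """Process the roads concurrently, one task per road, in a worker pool."""
    with Pool(processes=processes) as pool:
        return dict(pool.map(count_vehicles, list(roads.items())))


# Synthetic sample: four approaches to an intersection (assumed data).
SAMPLE_ROADS = {
    "north": [[1, 2], [2, 3], [3]],
    "east": [[10], [10, 11]],
    "south": [[], [20, 21, 22]],
    "west": [[30], [30], [30]],
}

if __name__ == "__main__":
    # The guard is needed so worker processes do not re-run this block.
    print(count_parallel(SAMPLE_ROADS))
```

Splitting the work by road keeps the tasks independent, so no shared state is needed. With a workload this small, process startup costs would outweigh any gain; a speedup would only be expected when each road's task is heavy, as with real video processing.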
Rama Widragama Putra
Abstract:
People with hearing impairments communicate using Indonesia's official sign language, SIBI (Sistem Isyarat Bahasa Indonesia). A sign-language-to-text translator application would help communication between people with and without hearing impairments. The pre-trained CPM model (EdvardHua, 2018) provides skeleton points such as those of the hands, shoulders, and elbows, and this skeleton information is used to predict words. However, the process needs to run in real time: when users open the camera, they should get a response immediately. Achieving this requires a mobile deep learning framework so that inference can be accelerated with a GPU runtime. This research focuses on running inference with mobile deep learning frameworks to implement a real-time skeleton extraction module on Android. It uses TensorFlow Mobile (CPU runtime only), MACE, and SNPE. Latency, energy usage, memory usage, power usage, and temperature change were measured. The measurements show that MACE and SNPE with the GPU runtime achieve lower latency than with the CPU. Using the CPU causes thermal throttling, which degrades performance. The GPU runtime uses less energy, memory, and power than the CPU, and the temperature increase is smaller with the GPU runtime than with the CPU.
Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2020
S-pdf
UI - Undergraduate Thesis (Membership)  Universitas Indonesia Library