Search Results

Found 2 documents matching the query

Miftahur Roziqiin

Simulasi Sistem Pengendalian Ketinggian dan Temperatur Air pada Proses Water Thermal Mixing Menggunakan Reinforcement Learning dengan Algoritma Soft Actor-Critic = Simulation of Water Level and Temperature Control System in Water Thermal Mixing Process Using Reinforcement Learning with Soft Actor-Critic Algorithm

"Control systems are found in many kinds of processes across many fields, especially industry. A control process commonly found in industry is thermal mixing. A fairly simple example is the mixing of hot and cold water, or water thermal mixing, where the goal is to reach a desired mixture temperature while keeping the water level from exceeding the capacity of the container. The target temperature is reached by adjusting the flow rate of the water entering the mixing container. In this study, a control system using Reinforcement Learning with the Soft Actor-Critic (SAC) algorithm was implemented on a simulation of water level and temperature control in the water thermal mixing process, built with Simulink in MATLAB. The agent is trained to control the system quickly and precisely by choosing actions, in the form of valve settings, that produce the required water flow rate. The results show that the SAC algorithm can control the system well: the largest overshoot was 1.33% for water level control, the largest steady-state error was 0.33℃ for mixture temperature control, and the largest settling time was 160 seconds, which occurred when the water level set point changed from 2.5 dm to 5 dm. The controller was also able to restore system stability within 93 seconds after a disturbance."
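The mixing process described in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis's Simulink model: it assumes a perfectly mixed tank with constant cross-section, equal density and specific heat for both streams, and no heat loss. The function names `mixed_temperature` and `tank_step`, and all parameters, are hypothetical and chosen only for illustration.

```python
def mixed_temperature(q_hot, t_hot, q_cold, t_cold):
    """Flow-weighted temperature of two merged streams.

    Assumes both streams have the same density and specific heat,
    so the energy balance reduces to a flow-weighted average.
    """
    total_flow = q_hot + q_cold
    if total_flow <= 0:
        raise ValueError("total inflow must be positive")
    return (q_hot * t_hot + q_cold * t_cold) / total_flow


def tank_step(level, temp, q_hot, t_hot, q_cold, t_cold, q_out, area, dt):
    """Advance tank level and temperature by one explicit Euler step.

    Hypothetical units: level in dm, area in dm^2, flows in dm^3/s,
    temperatures in degrees Celsius, dt in seconds.
    """
    volume = area * level
    # Mass balance: volume changes by net inflow over the step.
    new_volume = volume + (q_hot + q_cold - q_out) * dt
    if new_volume <= 0:
        raise ValueError("tank would run empty during this step")
    # Energy balance, tracked as volume times temperature.
    heat = volume * temp
    new_heat = heat + (q_hot * t_hot + q_cold * t_cold - q_out * temp) * dt
    return new_volume / area, new_heat / new_volume
```

In this simplified picture, the two inflow rates are the quantities a controller would adjust through the valves: their ratio sets the mixture temperature, while their sum relative to the outflow sets how the level changes.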

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2022

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Filipus Heryanto

Implementasi metode Actor Critic using Kronecker-Factored Trust Region (ACKTR) pada perdagangan sekuritas = Implementation of Actor Critic using Kronecker-Factored Trust Region (ACKTR) in securities trading / Filipus Heryanto

"ABSTRACT

In securities trading, portfolio management involves decision problems. These decisions can be made with reinforcement learning. Reinforcement learning aims to optimize the cumulative reward by means of a policy that chooses the actions yielding a better reward. The cumulative reward uses a discount rate, which determines how much weight future rewards receive. In this thesis, Actor Critic using Kronecker-Factored Trust Region (ACKTR) is applied to the decision problem. This algorithm combines the Actor-Critic model, natural gradient descent, and trust region optimization. The Actor-Critic model consists of an Actor and a Critic, where the Critic evaluates the cumulative reward and the Actor takes actions to obtain rewards. Natural gradient descent is a refinement of gradient descent that represents the steepest-descent direction, and it is used to improve sample efficiency. ACKTR uses Kronecker-Factored Approximated Curvature (K-FAC) to approximate the natural gradient, and a trust region to keep the update at each backpropagation step minimal. In reinforcement learning, the agent interacts with the environment according to a Markov Decision Process (MDP), which describes the problem. In this thesis, the agent aims to optimize the reward in a personal retirement portfolio MDP under different discount rates, and the learning results obtained with ACKTR are analyzed."
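The role of the discount rate mentioned in the abstract can be shown with a short Python sketch. It uses the standard definition of the discounted cumulative reward, the sum of `discount_rate ** t * reward_t` over time steps; the function name `discounted_return` and the sample reward sequence are hypothetical and are not taken from the thesis.

```python
def discounted_return(rewards, discount_rate):
    """Discounted cumulative reward of a finite reward sequence.

    Works backwards through the sequence so that each step adds its
    own reward to the discounted value of everything that follows.
    """
    total = 0.0
    for reward in reversed(rewards):
        total = reward + discount_rate * total
    return total


# Illustrative reward sequence (hypothetical values).
rewards = [1.0, 1.0, 1.0]
near_sighted = discounted_return(rewards, 0.5)   # 1 + 0.5 + 0.25 = 1.75
far_sighted = discounted_return(rewards, 1.0)    # 1 + 1 + 1 = 3.0
```

A lower discount rate makes the agent weight immediate rewards more heavily, while a rate close to 1 makes later rewards count almost as much as early ones.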

Universitas Indonesia, 2019

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library
