RLR 13 - Policy-based Methods



This session covers policy-based methods in reinforcement learning (RL). These methods are well suited to continuous action spaces, which are hard to handle with a value-based approach because the values of infinitely many actions cannot be enumerated and compared. Here the policy is parameterized, and the objective function (typically the expected return) is a function of the policy's parameters. The goal is to find the parameter values that maximize the objective function. The gradient of the objective function with respect to the parameters gives the direction in which to change the parameters so that the objective increases.
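
As a rough illustration of the idea (not taken from the session itself), the sketch below runs gradient ascent on a one-parameter Gaussian policy over a continuous action. Everything in it is an assumption made for the example: the toy reward `-(action - target)**2`, the fixed standard deviation `sigma`, the learning rate, and the names `reward` and `train`. The gradient is estimated with the score-function (REINFORCE-style) estimator, using the batch mean reward as a baseline.

```python
import random


def reward(action, target=2.0):
    # Hypothetical toy reward: highest when the action equals `target`.
    return -(action - target) ** 2


def train(steps=1000, batch=32, lr=0.05, sigma=0.5, seed=0):
    """Gradient ascent on the mean `theta` of a Gaussian policy N(theta, sigma^2)."""
    rng = random.Random(seed)
    theta = 0.0  # policy parameter: the mean of the action distribution
    for _ in range(steps):
        # Sample continuous actions from the current policy.
        actions = [rng.gauss(theta, sigma) for _ in range(batch)]
        rewards = [reward(a) for a in actions]
        # Batch mean reward as a baseline to reduce variance.
        baseline = sum(rewards) / batch
        # Score-function estimate of dJ/dtheta, where
        # d/dtheta log N(a | theta, sigma^2) = (a - theta) / sigma^2.
        grad = sum(
            (r - baseline) * (a - theta) / sigma ** 2
            for a, r in zip(actions, rewards)
        ) / batch
        # Step in the direction that increases the objective.
        theta += lr * grad
    return theta


if __name__ == "__main__":
    print("learned policy mean:", train())
```

Under these assumptions the learned mean should drift toward the target action, since the expected reward is maximized when the policy is centred there. A value-based method would instead need some way to search over the continuous action set to pick the best action.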
