L3 Policy Gradients and Advantage Estimation (Foundations of Deep RL Series)

36,704 listens
Lecture 3 of a 6-lecture series on the Foundations of Deep RL.
Topic: Policy Gradients and Advantage Estimation
Instructor: Pieter Abbeel
Slides: https://www.dropbox.com/s/7y82w1q70ftt2fv/l3-policy-gradient-and-advantage-estimation.pdf?dl=0