Session 15: TD(0) convergence proof (contd.), Point of Convergence of TD(0) (linear function approx.)

In this video, we begin with the update rule of the TD(0) algorithm with linear function approximation. Next, we write the algorithm's SGD-style update as a martingale-sequence-based SDE. The main focus of the video is to prove that the algorithm converges to a globally asymptotically stable equilibrium. We use the result from Borkar and Bhatnagar, which establishes that the SDE iterates converge to the equilibrium of the corresponding ODE when a list of conditions holds, such as square-integrable martingale noise and Lipschitz continuity. We show that TD(0) satisfies each of these conditions and therefore converges to a fixed point. Finally, we check whether this fixed point is the one we intended to find. Materials: https://drive.google.com/drive/folders/19TeUFa1xIfy1RCAC9IIcucd1JdeNMaXQ?usp=sharing
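
To make the fixed-point claim concrete, here is a minimal numerical sketch. It is not taken from the video: the 3-state chain, rewards, discount factor, and features are all made-up assumptions, and the symbols A, b, d, and h follow the usual textbook notation rather than anything stated above. The sketch builds the expected TD(0) increment h(theta) = b - A theta, solves A theta = b for the candidate limit point, and takes Euler steps on the ODE theta_dot = h(theta) to see whether an arbitrary starting point is drawn to that same point.

```python
# Illustrative 3-state Markov reward process; every number here is made up.
P = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.2, 0.0, 0.8]]          # transition matrix (irreducible, aperiodic)
r = [1.0, 0.0, 2.0]            # expected one-step reward in each state
gamma = 0.9                    # discount factor
Phi = [[1.0, 0.0],
       [0.0, 1.0],
       [1.0, 1.0]]             # one 2-dimensional feature vector per state
n, k = 3, 2

# Stationary distribution d (d P = d), found by power iteration.
d = [1.0 / n] * n
for _ in range(500):
    d = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]

def expected_next_feature(i):
    """Expected feature vector of the successor of state i."""
    return [sum(P[i][j] * Phi[j][c] for j in range(n)) for c in range(k)]

# Expected TD(0) increment at theta: h(theta) = b - A theta, with
#   A = Phi^T D (I - gamma P) Phi,   b = Phi^T D r,   D = diag(d).
A = [[0.0] * k for _ in range(k)]
b = [0.0] * k
for i in range(n):
    nxt = expected_next_feature(i)
    for p in range(k):
        b[p] += d[i] * Phi[i][p] * r[i]
        for q in range(k):
            A[p][q] += d[i] * Phi[i][p] * (Phi[i][q] - gamma * nxt[q])

def h(theta):
    """Right-hand side of the ODE theta_dot = b - A theta."""
    return [b[p] - sum(A[p][q] * theta[q] for q in range(k)) for p in range(k)]

# Candidate limit point: solve A theta = b (2x2 case, Cramer's rule).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
theta_star = [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
              (A[0][0] * b[1] - A[1][0] * b[0]) / det]

# Euler steps on the ODE from an arbitrary starting point.
dt = 1.0
theta = [-10.0, 25.0]
for _ in range(1000):
    step = h(theta)
    theta = [theta[p] + dt * step[p] for p in range(k)]

# The symmetric part of A is positive definite when its trace and
# determinant are both positive (2x2 case); this is the property that
# makes theta_star a globally asymptotically stable equilibrium.
s01 = 0.5 * (A[0][1] + A[1][0])
sym_trace = A[0][0] + A[1][1]
sym_det = A[0][0] * A[1][1] - s01 * s01

print("theta_star:", theta_star)
print("ODE limit :", theta)
```

The sketch works with expectations only, so it illustrates the ODE side of the argument; the stochastic iterates and the martingale noise conditions are not simulated here.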