Let's start with our first Model-Based Reinforcement Learning method, Policy Iteration! Please make sure you are comfortable with the Bellman Equation first, though! Here are some of my previous videos to help:
Derive the Bellman Equation
https://youtu.be/4YXM7vEuR5c
Bellman Optimality Equations
https://youtu.be/OmdMnU4BCn0
Timestamps:
00:00:00 - Two Methods to Solve an Environment
00:03:35 - Policy Evaluation
00:04:35 - Policy Improvement
00:06:00 - How to use Gymnasium Package
00:17:35 - Implement Deterministic Policy Evaluation
00:26:17 - Implement Deterministic Policy Improvement
00:34:10 - Play Game!
00:35:00 - Convert to Stochastic Environment
00:46:05 - Play Game!
00:48:36 - Recap
Socials!
X https://twitter.com/data_adventurer
Instagram https://www.instagram.com/nixielights/
LinkedIn https://www.linkedin.com/in/priyammaz/
🚀 GitHub: https://github.com/priyammaz
🌐 Website: https://www.priyammazumdar.com/