In this episode of Google DeepMind: The Podcast, David Silver, VP of Reinforcement Learning, describes his vision for the future of AI, exploring the concept of the "era of experience" versus the current "era of human data". Using AlphaGo and AlphaZero as examples, he highlights how reinforcement learning allowed these systems to surpass human capabilities, with AlphaZero learning entirely from self-play without prior human knowledge. This approach contrasts with large language models, which depend on human data and feedback. Silver emphasizes the need to explore this path to drive AI progress and achieve artificial superintelligence.
___
Additional resources
The Era of Experience Paper (position paper): https://goo.gle/3EiRKIH
Meta-Gradient Reinforcement Learning (2018): https://arxiv.org/abs/1805.09801
Go to Zero (S1, Google DeepMind Podcast)
https://www.youtube.com/watch?v=OkAwsrHMTgM
AlphaGo the documentary:
https://youtu.be/WXuK6gekU1Y?si=NedlhUqbV8H-ZVDn
___
Timestamps
00:00 Introduction
01:50 Era of experience
03:45 AlphaZero
10:19 Move 37
15:20 Reinforcement learning and human feedback
24:30 AlphaProof
29:50 Math Olympiads
35:00 Experience-based methods
42:56 Hannah's reflections
44:00 Fan Hui joins
___
Thanks to everyone who made this possible, including but not limited to:
Presenter: Professor Hannah Fry
Series Producer: Dan Hardoon
Series Editor: Rami Tzabar
Commissioner & Producer: Emma Yousif
Music Composition: Eleni Shaw
Audio Engineer: Richard Courtice
Production Manager: Dan Lazard
Video Studio Production: Nicholas Duke
Video Director and Editor: Bernardo Resende
Video Editor: Bilal Merhi
Audio Engineer: Perry Rogantin
Camera and Lighting Operator: Robert Messere
Production Coordination: Zoey Roberts, Sarah Ellen Morton
Visual Identity and Design: Rob Ashley
Commissioned by Google DeepMind
___
Subscribe to our channel https://www.youtube.com/@googledeepmind
Find us on X https://twitter.com/GoogleDeepMind
Follow us on Instagram https://instagram.com/googledeepmind
Add us on LinkedIn https://www.linkedin.com/company/deepmind/