Explore the synergy between long-context models and Retrieval Augmented Generation (RAG) in this episode of the Release Notes podcast. Join host Logan Kilpatrick and Google DeepMind's Nikolay Savinov as they discuss scaling context windows into the millions of tokens, recent quality improvements, RAG versus long context, and what's next in the field.
Listen to this podcast:
Apple Podcasts → https://goo.gle/3Bm7QzQ
Spotify → https://goo.gle/3ZL3ADl
Chapters:
0:00 Intro
0:52 Introduction & defining tokens
5:27 Context window importance
9:53 RAG vs. Long Context
14:19 Scaling beyond 2 million tokens
18:41 Long context improvements since 1.5 Pro release
23:26 Difficulty of attending to the whole context
28:37 Evaluating long context: beyond needle-in-a-haystack
33:41 Integrating long context research
34:57 Reasoning and long outputs
40:54 Tips for using long context
48:51 The future of long context: near-perfect recall and cost reduction
54:42 The role of infrastructure
56:15 Long-context and agents
Subscribe to Google for Developers → https://goo.gle/developers
Speakers: Nikolay Savinov, Logan Kilpatrick
Products Mentioned: Google AI