Building AI Voice Agents with Scott Stephenson - 707

750 listens
Today, we're joined by Scott Stephenson, co-founder and CEO of Deepgram, to discuss voice AI agents. We explore the importance of perception, understanding, and interaction and how these key components work together in building intelligent AI voice agents. We discuss the role of multimodal LLMs as well as speech-to-text and text-to-speech models in building AI voice agents, and dig into the benefits and limitations of text-based approaches to voice interactions. We dig into what’s required to deliver real-time voice interactions and the promise of closed-loop, continuously improving, federated learning agents. Finally, Scott shares practical applications of AI voice agents at Deepgram and provides an overview of their newly released agent toolkit.

🎧 / 🎥 Listen or watch the full episode on our page: https://twimlai.com/go/707.

🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confirmation=1

🗣️ CONNECT WITH US!
===============================
Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/
Follow us on Twitter: https://twitter.com/twimlai
Follow us on LinkedIn: https://www.linkedin.com/company/twimlai/
Join our Slack Community: https://twimlai.com/community/
Subscribe to our newsletter: https://twimlai.com/newsletter/
Want to get in touch? Send us a message: https://twimlai.com/contact/

📖 CHAPTERS
===============================
00:00 - Introduction
1:52 - Audio models
4:37 - Gaps in audio models
8:51 - Adaptation and specialization of audio models
11:38 - Fine-tuning in audio
17:28 - Progress of continuous learning
19:57 - Approaches to building comprehensive AI companies
25:14 - Complexity of diarization
28:20 - Perception, understanding, and interaction
31:31 - Speech-to-text and text-to-speech models
37:04 - Interpretability and end-to-end training
44:27 - Future of voice AI in human-computer interaction
47:57 - Agents
48:53 - Deepgram voice AI API
52:12 - Use cases
54:27 - Agent framework

🔗 LINKS & RESOURCES
===============================
Deepgram's Groundbreaking Voice Agent API Brings AI to Life - https://www.globenewswire.com/news-release/2024/09/19/2949234/0/en/Deepgram-s-Groundbreaking-Voice-Agent-API-Brings-AI-to-Life.html
Deepgram's Voice Agent API - https://deepgram.com/product/voice-agent-api
https://deepgram.com/
From Particle Physics to Audio AI with Scott Stephenson - 19 - https://twimlai.com/podcast/twimlai/particle-physics-to-audio-ai-scott-stephenson/

📸 Camera: https://amzn.to/3TQ3zsg
🎙️Microphone: https://amzn.to/3t5zXeV
🚦Lights: https://amzn.to/3TQlX49
🎛️ Audio Interface: https://amzn.to/3TVFAIq
🎚️ Stream Deck: https://amzn.to/3zzm7F5