At Ray Summit 2024, Kuntai Du from the University of Chicago and Zhuohan Li from UC Berkeley present a comprehensive update on vLLM, the open-source LLM inference and serving engine. Their talk covers the significant developments in vLLM over the past year, focusing on its growing adoption, new features, and performance gains.
The speakers discuss the project's community growth and governance changes, giving insight into vLLM's evolving ecosystem. They conclude by outlining the roadmap for upcoming releases, offering attendees a glimpse of the future direction of this fast-growing LLM serving solution.
This presentation is particularly valuable for those interested in the latest advancements in efficient LLM deployment and serving technologies.
--
Interested in more?
- Watch the full Day 1 Keynote:
https://youtu.be/jwZHJthQvXo
- Watch the full Day 2 Keynote:
https://youtu.be/Lury2ad6KG8
--
🔗 Connect with us:
- Subscribe to our YouTube channel: https://www.youtube.com/@anyscale
- Twitter: https://x.com/anyscalecompute
- LinkedIn: https://linkedin.com/company/joinanyscale/
- Website: https://www.anyscale.com