vLLM Office Hours - June 20, 2024

Happy one-year anniversary, vLLM! In this session, we covered what's new in vLLM v0.5.0, including FP8 weights and activations, speculative decoding, and OpenAI Vision API support. We also dug deeper into several topics: new quantization kernels, GPU architecture compatibility, embeddings in the OpenAI API, optimization tips for GPTQ configurations, and handling concurrent requests in the API server. For more details, see the session slides: https://docs.google.com/presentation/d/1BAGbJ-aGYrAMUugReF758u5JUT9EAJLn
 
Sign up for bi-weekly vLLM office hours: https://hubs.li/Q02Y5Pbh0
