Run A Local LLM Across Multiple Computers! (vLLM Distributed Inference)

Timestamps:
00:00 - Intro
01:24 - Technical Demo
09:48 - Results
11:02 - Intermission
11:57 - Considerations
15:48 - Conclusion

In this video, we explore distributed inference using vLLM and Ray. To demonstrate this functionality, we set up two nodes: one equipped with two RTX 3090 Ti GPUs and the other with two RTX 3060 GPUs. After configuring the nodes, we test distributed inference by loading a model across both of them, enabling interaction with a fully distributed inference setup. Join us as we dive into the technical details, share results, and discuss considerations for using distributed inference in your own projects!
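
For readers who want a rough idea of what this looks like in code, below is a minimal Python sketch of multi-node inference with vLLM's offline LLM API and the Ray executor backend. It assumes a Ray cluster already spans both machines (started with "ray start --head" on the first node and "ray start --address=<head-ip>:6379" on the second) and that all four GPUs are visible to Ray. The model name, parallelism size, and prompt are placeholders for illustration, not the exact configuration used in the video.

    # Sketch: distributed inference across two nodes with vLLM + Ray.
    # Assumes the Ray cluster is already running on both machines and
    # exposes 4 GPUs in total (2x RTX 3090 Ti on node 1, 2x RTX 3060 on node 2).
    # The model name below is a placeholder, not the model used in the video.

    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-2-13b-chat-hf",  # placeholder model
        tensor_parallel_size=4,                  # shard the model across all 4 GPUs
        distributed_executor_backend="ray",      # let Ray place workers on both nodes
    )

    prompts = ["Explain distributed inference in one paragraph."]
    sampling = SamplingParams(temperature=0.7, max_tokens=256)

    for output in llm.generate(prompts, sampling):
        print(output.outputs[0].text)

With this kind of setup, the interesting part is that the model no longer has to fit on any single machine's GPUs; the trade-off, as discussed in the Considerations section, is that inter-node communication can become the bottleneck.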