Getting Started with Multi-Modal LLMs
Visual assistants will be an important theme in 2024 as multi-modal LLMs become more capable and more widely adopted. We’ve released five new templates as entry points to GPT-4V, Gemini, and open source models. In this video, we provide some background on multi-modal LLMs, show results from our internal evaluations using LangSmith, highlight the trade-offs between architectures for multi-modal RAG, and introduce how to use these templates to get started.
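As rough background, the core building block behind all of these templates is a chat model call that takes an image alongside a text prompt. Here is a minimal sketch in Python, assuming `langchain-openai` is installed and `OPENAI_API_KEY` is set; the image file name is a placeholder and the model name reflects the GPT-4V preview (swap in whichever vision-capable model you use):

```python
import base64

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# Load any local image and base64-encode it (file name is a placeholder).
with open("slide.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Vision-capable chat model; model name reflects the GPT-4V preview release.
llm = ChatOpenAI(model="gpt-4-vision-preview", max_tokens=512)

# A single message can mix text and image parts.
response = llm.invoke(
    [
        HumanMessage(
            content=[
                {"type": "text", "text": "Describe what this image shows."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ]
        )
    ]
)
print(response.content)
```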
Important Links
(1) Open source multi-modal LLMs for private visual search over your photos
https://templates.langchain.com/?integration_name=rag-multi-modal-local
https://templates.langchain.com/?integration_name=rag-multi-modal-mv-local
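For the local templates above, a rough sketch of the generation step, assuming a running Ollama server with a multi-modal model pulled (e.g. `ollama pull bakllava`); the photo path is a placeholder, and the templates additionally index your photo collection so that relevant images are retrieved before this call:

```python
import base64

from langchain_community.llms import Ollama

# Local multi-modal model served by Ollama; nothing leaves your machine.
llm = Ollama(model="bakllava")

# Placeholder photo path; in the templates, photos are retrieved from an index first.
with open("photos/example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Bind the image so it is passed along with the text prompt at generation time.
llm_with_image = llm.bind(images=[image_b64])
print(llm_with_image.invoke("What is in this photo?"))
```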
(2) GPT-4V or Gemini for visual RAG over slide decks
https://templates.langchain.com/?integration_name=rag-gemini-multi-modal
https://templates.langchain.com/?integration_name=rag-chroma-multi-modal
https://templates.langchain.com/?integration_name=rag-chroma-multi-modal-multi-vector
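The architectural trade-off here is roughly: embed slide images directly with a multi-modal embedding model and retrieve them by similarity to the question, or summarize each image with a multi-modal LLM, embed the text summaries, and fetch the raw image through a multi-vector retriever (the `-multi-vector` variant). Below is a sketch of the first option, assuming `open_clip_torch`, `chromadb`, and an OpenAI key are available; the slide image folder, model names, and question are placeholders:

```python
from pathlib import Path

from langchain_community.vectorstores import Chroma
from langchain_core.messages import HumanMessage
from langchain_experimental.open_clip import OpenCLIPEmbeddings
from langchain_openai import ChatOpenAI

# Index slide images directly with a multi-modal (CLIP-style) embedding model.
vectorstore = Chroma(
    collection_name="slide_deck",
    embedding_function=OpenCLIPEmbeddings(),  # downloads a default OpenCLIP checkpoint
)
slide_paths = [str(p) for p in Path("slides").glob("*.jpg")]  # deck exported as images
vectorstore.add_images(uris=slide_paths)

# Retrieve the slide most similar to the question; the stored document content
# is the base64-encoded image.
question = "What was the revenue growth shown in the deck?"
slide = vectorstore.similarity_search(question, k=1)[0]

# Let a vision-capable model answer directly from the retrieved slide image.
llm = ChatOpenAI(model="gpt-4-vision-preview", max_tokens=256)
answer = llm.invoke(
    [
        HumanMessage(
            content=[
                {"type": "text", "text": question},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{slide.page_content}"},
                },
            ]
        )
    ]
)
print(answer.content)
```

The multi-vector variants trade extra indexing cost (one LLM-generated summary per slide) for text-based retrieval, which can be more reliable when slides are dense with text or numbers.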
Slides
https://docs.google.com/presentation/d/19x0dvHGhbJOOUWqvPKrECPi1yI3makcoc-8tFLj9Sos/edit#slide=id.p