Multi-Modal RAG: Chat with Text and Images in Documents

17,463 listens
In this video, I'll show you how to build an end-to-end multi-modal RAG system using GPT-4 and LlamaIndex. We'll cover data collection, creating vector stores for text and images, and building a retrieval pipeline. Perfect for those interested in enhancing large language models with multi-modal data.

LINKS:
Colab: https://tinyurl.com/25sb2rtu
Architecture: https://tinyurl.com/4x9x9bsc
Multi-modal RAG - Previous Video: https://youtu.be/Rg35oYuus-w
💻 RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag

Let's Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: [email protected]
Become Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off)
Sign up for Newsletter, localgpt: https://tally.so/r/3y9bb0

TIMESTAMPS:
00:00 Introduction to Multi-Modal RAG Systems
00:23 Overview of the Architecture
02:57 Setting Up the Environment
03:54 Data Collection and Preparation
04:28 Generating Image Descriptions with GPT-4
08:10 Creating Multi-Modal Vector Stores
09:41 Implementing the Retrieval Pipeline
11:05 Generating Final Responses

All Interesting Videos:
Everything LangChain: https://www.youtube.com/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr
Everything LLM: https://youtube.com/playlist?list=PLVEEucA9MYhNF5-zeb4Iw2Nl1OKTH-Txw
Everything Midjourney: https://youtube.com/playlist?list=PLVEEucA9MYhMdrdHZtFeEebl20LPkaSmw
AI Image Generation: https://youtube.com/playlist?list=PLVEEucA9MYhPVgYazU5hx6emMXtargd4z
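
EXAMPLE SKETCH:
The pipeline named in the description and timestamps (image descriptions, separate vector stores for text and images, a retrieval step, a final response) can be illustrated with a toy, dependency-free sketch. This is not the code from the video or the Colab, and it does not use the LlamaIndex or OpenAI APIs. The bag-of-words `embed` function, the `VectorStore` class, the sample chunks, and the file paths are all hypothetical stand-ins, assumed here only to show how the pieces might fit together.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store: keeps (embedding, payload) pairs, ranks by cosine similarity."""

    def __init__(self):
        self.items = []

    def add(self, text, payload=None):
        # The payload defaults to the indexed text itself.
        self.items.append((embed(text), text if payload is None else payload))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [payload for _, payload in ranked[:k]]


# Text store: document chunks (made-up sample data).
text_store = VectorStore()
text_store.add("The transformer encoder uses self-attention layers.")
text_store.add("Training ran for ten epochs with a batch size of 32.")

# Image store: each image is indexed by a generated description
# (the video uses GPT-4 for this step; these descriptions are invented).
image_store = VectorStore()
image_store.add("Diagram of the transformer encoder and decoder stack", "figures/architecture.png")
image_store.add("Bar chart of training loss per epoch", "figures/loss.png")


def retrieve(query, k=1):
    """Query both stores and return the top-k hits from each modality."""
    return {"text": text_store.search(query, k), "images": image_store.search(query, k)}


def build_prompt(query, context):
    """Assemble the retrieved context into a prompt for a multi-modal LLM."""
    lines = ["Answer the question using the context below.", "", "Text context:"]
    lines += ["- " + chunk for chunk in context["text"]]
    lines += ["", "Relevant images:"]
    lines += ["- " + path for path in context["images"]]
    lines += ["", "Question: " + query]
    return "\n".join(lines)


context = retrieve("transformer encoder diagram")
prompt = build_prompt("transformer encoder diagram", context)
```

In a real system the prompt, together with the retrieved image files, would be sent to a multi-modal model to generate the final response. Keeping the two stores separate lets each modality be ranked on its own, so a strong text match does not crowd out the most relevant image.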