Timestamps:
00:00 - Intro
00:46 - Bug Report
01:54 - Jetson Container Setup
04:12 - RAG Container Setup
06:58 - Ollama Issue
08:30 - Ollama Fix
12:16 - RAG Container
13:27 - Compatibility Issue
14:27 - Compatibility Fix
20:35 - RAG Document Setup
22:12 - Ollama Manual Pull
23:44 - RAG Container Start
26:16 - RAG Data Tweak
29:54 - RAG Demo
34:50 - RAG Settings
36:23 - RAG vs No RAG
37:39 - Closing Thoughts
In this video, we set up a Retrieval-Augmented Generation (RAG) workflow on the NVIDIA Jetson Orin Nano Super, using Ollama, LlamaIndex, and a Streamlit web app inside a Jetson container for fully local RAG processing.
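For readers who want the core idea before watching: RAG retrieves the most relevant chunks of a document and feeds them to the model alongside the question. The sketch below shows that retrieve-then-augment flow in plain Python. It is a simplified stand-in, not the video's actual code: the real setup uses LlamaIndex for embedding-based retrieval and an Ollama-served model, while here retrieval is naive word overlap purely for illustration.

```python
# Minimal sketch of the retrieve-then-generate pattern behind RAG.
# Hypothetical stand-in: the video's setup uses LlamaIndex embeddings
# and Ollama; here retrieval is simple word overlap for clarity.

def chunk(text, size=40):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(chunks, query, k=1):
    """Rank chunks by word overlap with the query (toy scoring,
    in place of a real embedding similarity search)."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, context_chunks):
    """Prepend retrieved context so the LLM answers from the document."""
    context = "\n".join(context_chunks)
    return f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {query}"

doc = ("The Orin Nano Super runs local LLMs. Ollama "
       "serves Llama 3.2 on device.")
chunks = chunk(doc, size=8)
query = "What serves Llama 3.2?"
prompt = build_prompt(query, retrieve(chunks, query))
```

In the real pipeline, `prompt` would then be sent to the Llama 3.2 model via Ollama; the only conceptual difference is that production retrieval uses vector embeddings instead of word overlap.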
Along the way, we tackle unexpected roadblocks, including Ollama breaking inside Jetson containers and various compatibility issues that made setup more challenging than expected. Step by step, we debug and fix these issues, ensuring the system runs smoothly on the Jetson Orin Nano Super.
Once everything is configured, we demonstrate RAG in action, running a Llama 3.2 model on a proprietary document and comparing responses with and without RAG to showcase the real-world impact of document retrieval on LLM answers.
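The with/without-RAG comparison boils down to sending the same question twice: once bare, and once with retrieved document text injected ahead of it. A hedged sketch of that comparison is below. It assumes an Ollama server on its default port (localhost:11434) serving `llama3.2`; the helper name and the memo text are illustrative, not taken from the video's app.

```python
# Hypothetical sketch of the with/without-RAG comparison from the demo.
# Assumes an Ollama server at localhost:11434 with llama3.2 pulled;
# the helper and document text here are illustrative.
import json
import urllib.request

def ask(prompt, model="llama3.2", host="http://localhost:11434"):
    """Send one non-streaming request to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt,
                       "stream": False}).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

question = "What is our internal release codename?"
# Text a retriever would pull from the proprietary document (made up here):
context = "Internal memo: the Q3 release codename is Falcon."

# Without RAG: the model can only fall back on its training data.
plain_prompt = question
# With RAG: the retrieved text is prepended, so the answer is grounded.
rag_prompt = f"Context:\n{context}\n\nQuestion: {question}"

# ask(plain_prompt) vs ask(rag_prompt) shows the difference:
# only the RAG prompt gives the model access to the document's facts.
```

Because the proprietary fact exists only in the document, the bare prompt cannot be answered correctly from training data alone, which is exactly what the side-by-side demo in the video illustrates.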
If you've ever wanted to implement RAG locally on a Jetson device, this guide will walk you through every challenge, fix, and optimization along the way.