I tried to run a 70B LLM on a MacBook Pro. It didn't go well.

Today, we're trying to load and use a 70B LLM with ollama on a 14" M4 Pro MacBook Pro with 48GB RAM. Will it work?

In this video:
💻 14" MacBook Pro M4 Pro (12 cores): https://amzn.to/3ANEPwB
💨 TG Pro (CPU + GPU core temps and fan speed): https://www.tunabellysoftware.com/tgpro/index.php?fpr=d157l
🎤 Microphone: https://amzn.to/3AFgvNw
🖱️ Mouse: https://amzn.to/3Z3pal4
⌨️ Keyboard: https://amzn.to/3OdkjZv

I tested 7 small LLMs locally to find the fastest 👇
https://www.youtube.com/watch?v=CDdo29LgoRk

Models tested:
- phi3:14b - 7.9GB
- qwen2.5:14b - 9.0GB
- gemma2:27b - 16GB
- llama3.1:8b (fp16) - 16GB
- qwen2.5:32b - 20GB
- llama3.1:70b - 39GB

00:00 LLMs tested
00:47 Prompt used
01:05 phi3:14b
01:55 qwen2.5:14b
02:53 gemma2:27b
04:23 how to find alternative models on ollama.com
05:03 llama3.1:8b-instruct-fp16
05:58 qwen2.5:32b
07:15 hearing the fans
07:49 llama3.1:70b
08:15 memory pressure goes through the roof
09:10 fans and temperature are increasing
09:35 llama3.1:70b results
09:58 Analysis and speed considerations
11:27 Stats recap
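
If you want to reproduce a rough version of this speed comparison yourself, here's a minimal sketch using the official ollama Python client (pip install ollama). It assumes a local ollama server is running and each model has already been pulled; the prompt below is a stand-in, since the actual prompt is only shown in the video (00:47), not in this description. The eval_count and eval_duration fields come from ollama's generate API and give a tokens-per-second figure.

```python
# Minimal benchmark sketch, assuming the official `ollama` Python client
# and a running local ollama server with these models already pulled.
import ollama

MODELS = [
    "phi3:14b",
    "qwen2.5:14b",
    "gemma2:27b",
    "llama3.1:8b-instruct-fp16",
    "qwen2.5:32b",
    "llama3.1:70b",
]

# Stand-in prompt; substitute whatever prompt you want to test with.
PROMPT = "Explain the difference between a process and a thread."

for model in MODELS:
    resp = ollama.generate(model=model, prompt=PROMPT)
    tokens = resp["eval_count"]            # generated tokens
    seconds = resp["eval_duration"] / 1e9  # eval_duration is in nanoseconds
    print(f"{model}: {tokens / seconds:.1f} tokens/s")
```

Note that the first request for each model includes load time, so generation speed is the fairer comparison metric. The 39GB llama3.1:70b leaves very little headroom in 48GB of unified memory, which is why memory pressure spikes in the video once it loads.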