Which Quantization Method is Right for You? (GPTQ vs. GGUF vs. AWQ)
In this tutorial, we explore several methods for loading pre-quantized models, such as Zephyr 7B, covering the three most common quantization formats: GPTQ, GGUF (formerly GGML), and AWQ. A rough loading sketch for each format is shown below, after the links.

Timeline
0:00 Introduction
0:25 Loading Zephyr 7B
3:25 Quantization
7:42 Pre-quantized LLMs
8:42 GPTQ
10:29 GGUF
12:22 AWQ
14:46 Outro

📒 Google Colab notebook
https://colab.research.google.com/drive/1rt318Ew-5dDw21YZx2zK2vnxbsuDAchH?usp=sharing

🛠️ Written version of this tutorial
https://maartengrootendorst.substack.com/p/which-quantization-method-is-right

🤗 Zephyr 7B on HuggingFace
https://huggingface.co/HuggingFaceH4/zephyr-7b-beta

Support my work:
👪 Join as a Channel Member: / @maartengrootendorst
✉️ Newsletter: https://maartengrootendorst.substack.com/
📖 Join Medium to read my blogs: https://medium.com/@maartengrootendorst

I'm writing a book!
📚 Hands-On Large Language Models
https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/

#datascience #machinelearning #ai
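
The sketch below illustrates, in broad strokes, how a pre-quantized Zephyr 7B might be loaded in each of the three formats. It is not the notebook's exact code: the repo ids (TheBloke/zephyr-7B-beta-GPTQ, -GGUF, -AWQ), the GGUF file name, and the choice of backends (transformers with auto-gptq, ctransformers, and vLLM) are assumptions; see the linked Colab notebook and written tutorial for the authoritative version.

```python
# A rough sketch of loading the same model in each pre-quantized format.
# Repo ids, file names, and backend choices below are assumptions; consult
# the linked notebook for the exact code used in the video.

# --- GPTQ: loaded through transformers (requires auto-gptq / optimum) ---
from transformers import AutoModelForCausalLM, AutoTokenizer

gptq_model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-beta-GPTQ",  # assumed Hub repo id
    device_map="auto",               # place layers on the available GPU(s)
)
gptq_tokenizer = AutoTokenizer.from_pretrained("TheBloke/zephyr-7B-beta-GPTQ")

# --- GGUF: loaded through ctransformers (llama.cpp backend, CPU-friendly) ---
from ctransformers import AutoModelForCausalLM as CTAutoModel

gguf_model = CTAutoModel.from_pretrained(
    "TheBloke/zephyr-7B-beta-GGUF",           # assumed Hub repo id
    model_file="zephyr-7b-beta.Q4_K_M.gguf",  # assumed 4-bit quantized file
    model_type="mistral",
    gpu_layers=50,                            # offload some layers to GPU
)

# --- AWQ: loaded through vLLM for fast GPU inference ---
from vllm import LLM, SamplingParams

awq_model = LLM(model="TheBloke/zephyr-7B-beta-AWQ", quantization="awq")
outputs = awq_model.generate(
    ["Tell me a joke about Large Language Models."],
    SamplingParams(temperature=0.7, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```

Each backend trades off differently: GPTQ and AWQ target GPU inference, while GGUF is designed for CPU-first inference with optional GPU offloading, which is why a llama.cpp-based loader is used for it here.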