LLaDA - Large Language Diffusion Models (paper explained)

Diffusion models are catching up fast on language tasks. In particular, I came across an interesting paper called "Large Language Diffusion Models" (LLaDA for short).
While LLMs have traditionally been tackled in an autoregressive way, diffusion models flip that on its head and generate text all-in-one-go.
So, given their computational speed, are diffusion models the future of LLMs?
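
To make the autoregressive vs diffusion contrast concrete, here is a toy sketch in Python. It is not the LLaDA implementation: `toy_predict`, `TARGET`, and the confidence scores are hypothetical stand-ins for a trained model, and the "keep the most confident predictions each step" rule is an assumed, simplified unmasking schedule used only for illustration.

```python
import random

MASK = "[MASK]"
# Toy "ground truth" sentence that the stand-in model always predicts.
TARGET = ["diffusion", "models", "fill", "in", "all", "positions", "in", "parallel"]


def toy_predict(tokens, position):
    """Hypothetical stand-in for a trained model: returns (token, confidence).

    A real model would condition on `tokens`; this toy one just looks up the answer
    and attaches a deterministic pseudo-confidence.
    """
    rng = random.Random(position)
    return TARGET[position], rng.random()


def autoregressive_decode(length):
    """Left to right: one new token per step, so `length` steps in total."""
    tokens, steps = [], 0
    while len(tokens) < length:
        token, _ = toy_predict(tokens, len(tokens))
        tokens.append(token)
        steps += 1
    return tokens, steps


def masked_diffusion_decode(length, num_steps):
    """Start fully masked; each step predicts every masked slot and keeps the most confident."""
    tokens, steps = [MASK] * length, 0
    for step in range(num_steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        predictions = {i: toy_predict(tokens, i) for i in masked}
        # Unmask an even share of the remaining slots (ceiling division).
        keep = -(-len(masked) // (num_steps - step))
        best = sorted(masked, key=lambda i: predictions[i][1], reverse=True)[:keep]
        for i in best:
            tokens[i] = predictions[i][0]
        steps += 1
    return tokens, steps


if __name__ == "__main__":
    print(autoregressive_decode(len(TARGET)))
    print(masked_diffusion_decode(len(TARGET), num_steps=4))
```

In this toy setup the autoregressive loop needs one step per token, while the masked loop finishes in `num_steps` passes. Note that each diffusion step is a forward pass over the whole sequence, so fewer steps does not automatically mean less compute.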

⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️ 
0:00 - Intro 
1:23 - Motivation 
1:51 - Autoregressive VS Diffusion
4:17 - Pre-training
4:52 - Supervised Fine-tuning
5:24 - Inference
6:51 - Experiments and Results

AI BITES KEY LINKS
Website: https://www.ai-bites.net
YouTube: https://www.youtube.com/@AIBites
Twitter: https://twitter.com/ai_bites
Patreon: https://www.patreon.com/ai_bites
Github: https://github.com/ai-bites
