LLaDA - Large Language Diffusion Models (paper explained)

4,924 plays
Diffusion models are catching up big time on language tasks. In particular, I came across this interesting paper called "Large Language Diffusion Models" (LLaDA for short). While LLMs have traditionally been tackled in an autoregressive way, generating one token at a time, diffusion models flip this on its head and tackle the whole sequence all in one go. So, given their computational speed, are diffusion models the future of LLMs?

⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️
0:00 - Intro
1:23 - Motivation
1:51 - Autoregressive VS Diffusion
4:17 - Pre-training
4:52 - Supervised Fine-tuning
5:24 - Inference
6:51 - Experiments and Results

AI BITES KEY LINKS
Website: https://www.ai-bites.net
YouTube: https://www.youtube.com/@AIBites
Twitter: https://twitter.com/ai_bites
Patreon: https://www.patreon.com/ai_bites
Github: https://github.com/ai-bites