How to Evaluate (and Improve) Your LLM Apps

Get exclusive access to AI resources and project ideas: https://the-data-entrepreneurs.kit.com/shaw

Here, I discuss three types of evals (code-based, human-based, and LLM-based) and how to use them to improve LLM apps.

📰 Blog: https://medium.com/@shawhin/how-to-evaluate-and-improve-your-llm-apps-f7b08fb7493c?sk=f2fbcd3f16b958baa4734d4a39d5b237
💻 Example Code: https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/evals

References
[1] https://youtu.be/XGJNo8TpuVA
[2] arXiv:2501.12948 [cs.CL]
[3] arXiv:2402.01383 [cs.CL]
[4] https://hamel.dev/blog/posts/llm-judge/
[5] arXiv:2203.02155 [cs.CL]
[6] https://youtu.be/SnbGD677_u0

--
Intro - 0:00
Vibe Checks - 0:27
Evals - 3:26
Type 1: Code-based - 5:58
Type 2: Human-based - 9:34
Type 3: LLM-based - 13:34
Example: Improving y2b with LLM Judge - 15:28

Homepage: https://www.shawhintalebi.com
