How to align your LLM judge for better evaluations


In this webinar, we cover how to evaluate LLM applications, with a special focus on aligning your LLM judge. While creating a RAG pipeline is now straightforward, aligning an LLM judge for accurate evaluation remains a complex challenge. Using a RAG pipeline as a case study, we'll demonstrate how to build an effective evaluation system with LlamaIndex and leverage W&B Weave for systematic assessment and annotation.

Chapters:
0:00 Introduction and overview
1:36 Importance of evaluation in LLM applications
3:33 Frameworks: LlamaIndex vs. LangChain
4:14 Weights & Biases in LLM Ops
5:01 What is RAG?
7:01 Components of a RAG pipeline
16:44 Demo time
20:00 Building the retriever and query engine
27:01 Integrating Weave and viewing traces
30:53 Customized evaluation and comparison
47:04 Final thoughts and summary

Join us as we explore the entire evaluation lifecycle and uncover best practices.
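As a rough illustration of what "aligning" a judge means in practice, the sketch below (all names hypothetical, not taken from the webinar's demo code) compares an LLM judge's verdicts against human annotations and reports their agreement rate — the metric you iterate on when refining the judge's prompt or rubric. In a real setup the judge would call an LLM, and the human labels would come from an annotation tool such as W&B Weave:

```python
# Hypothetical sketch: measuring judge/human agreement to drive alignment.
# In practice, `judge` would call an LLM (e.g. via LlamaIndex) and the
# human labels would come from annotations collected in W&B Weave.

def judge(answer: str) -> str:
    """Stand-in for an LLM judge: flags empty or evasive answers as 'fail'."""
    if not answer.strip() or "i don't know" in answer.lower():
        return "fail"
    return "pass"

def agreement_rate(examples: list[dict]) -> float:
    """Fraction of examples where the judge's verdict matches the human label."""
    matches = sum(judge(ex["answer"]) == ex["human_label"] for ex in examples)
    return matches / len(examples)

examples = [
    {"answer": "Paris is the capital of France.", "human_label": "pass"},
    {"answer": "I don't know.", "human_label": "fail"},
    {"answer": "", "human_label": "fail"},
    # A factually wrong answer the naive judge cannot catch:
    {"answer": "The capital of France is Berlin.", "human_label": "fail"},
]

print(agreement_rate(examples))  # prints 0.75: judge misses 1 of 4 cases
```

Low agreement on a slice of examples (here, factual errors) tells you exactly where the judge needs alignment — a better prompt, a rubric, or reference-based grading for that failure mode.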