While creating a RAG pipeline is now straightforward, aligning your LLM judge for accurate evaluation remains a complex challenge. In this webinar, we'll dive into evaluation strategies for LLM applications, with a special focus on aligning your LLM judge. Using a RAG pipeline as a case study, we'll demonstrate how to build an effective evaluation system with LlamaIndex and leverage W&B Weave for systematic assessment and annotation.
Chapters:
0:00 Introduction and overview
1:36 Importance of evaluation in LLM Applications
3:33 Frameworks: LlamaIndex vs LangChain
4:14 Weights & Biases in LLM Ops
5:01 What is RAG?
7:01 Components of a RAG pipeline
16:44 Demo time
20:00 Building the retriever and query engine
27:01 Integrating Weave and viewing Traces
30:53 Customized evaluation and comparison
47:04 Final thoughts and summary
Join us as we explore the entire evaluation lifecycle and uncover best practices.