The challenges in using LLM-as-a-Judge - Sourabh Agrawal | Vector Space Talk #013
"Challenges associated with using LLM-as-a-Judge" with Sourabh Agrawal, CEO & Co-Founder at UpTrain AI
He spoke about LLM-based evaluations and how to use them effectively to improve RAG-based applications.
Abstract:
Using LLMs to judge the quality of LLM applications has gained a lot of interest recently, and rightly so: the approach is highly scalable and addresses the subjectivity of human evaluations. However, building production-grade evaluations is much more complicated than prompting an LLM to act as a judge and grade a given response.
In this talk, we will cover the key techniques employed in industry and academia to define LLM-based evaluations effectively, understand the associated challenges, and look at what lies beyond evaluation. We will also learn how these evaluations can be leveraged to improve your LLM applications.
Sourabh's Bio:
Sourabh is a 2X founder in the AI/ML space. He started his career at Goldman Sachs, building ML models for financial markets. After that, he joined the autonomous driving team at Bosch/Mercedes, building state-of-the-art CV modules for scene understanding. He started his entrepreneurial journey in 2020 and founded an AI-powered fitness startup that he scaled to 150K+ users. Throughout these roles, he was frequently frustrated by the lack of tools to evaluate these models, a problem even more pronounced in the case of Generative AI models.
To solve this, he is building UpTrain, an open-source LLMOps tool to evaluate, prompt-test, and monitor LLM applications. The tool gives scores and helps make your LLM applications better: it performs root-cause analysis to figure out which part of your LLM pipeline is failing, finds common patterns among failing cases, and finally gives automated suggestions on how to resolve them.
--------------------------------------------------------------------------------
Connect with Sourabh and UpTrain AI:
https://www.linkedin.com/in/sourabh-agrawal-62932b175/
https://twitter.com/SourabhAgr03
https://uptrain.ai/
Connect with Demetrios:
https://www.linkedin.com/in/dpbrinkm/
Follow Qdrant:
https://www.linkedin.com/company/qdrant/
https://twitter.com/qdrant_engine
Join the Qdrant Discord server:
https://discord.gg/W4PejyMMKu