Does Fine Tuning Embedding Models Improve RAG?

9,779 plays
Can fine-tuning embedding models improve your RAG application? Yes! And it doesn’t even have to be that complicated. In this video we show how to train a query-only linear adapter on your own RAG data to improve your document retrieval accuracy. It is a lightweight approach that can be applied to any embedding model without needing to fully fine-tune the model itself or re-embed your knowledge base.

Resources:
GitHub Repo - https://github.com/ALucek/linear-adapter-embedding
Trained Adapters - https://huggingface.co/AdamLucek/all-MiniLM-L6-v2-query-only-linear-adapter-AppleQA
Dataset - https://huggingface.co/datasets/AdamLucek/apple-environmental-report-QA-retrieval
ChromaDB Research - https://research.trychroma.com/embedding-adapters
Efficient Domain Adaptation of Sentence Embeddings Using Adapters - https://arxiv.org/pdf/2307.03104
Improving Text Embeddings with Large Language Models - https://arxiv.org/pdf/2401.00368

Chapters:
00:00 - Introduction
00:39 - What is an Embedding Adapter?
03:04 - Defining our RAG Application
04:30 - Creating a Synthetic Dataset
09:03 - Setting Up Vector Database
11:23 - Evaluating our Model Baseline
14:16 - Training: Context
14:40 - Training: Triplet Margin Loss
16:01 - Training: Random Negative Sampling
17:01 - Training: Linear Layer Explanation
18:59 - Training: Triplet Data Loader
19:44 - Training: Training Script
20:17 - Training: Execution & Hyperparameters
21:22 - Assessment: New Embedding Function
22:04 - Assessment: Evaluating the Adapter
22:40 - Assessment: Metric Interpretation
23:28 - Assessment: Visualization
24:09 - Assessment: Training Data Fitting
25:35 - Closing Thoughts

#ai #datascience #programming
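
To make the idea of a query-only linear adapter concrete, here is a minimal sketch in plain Python. It is not the code from the linked repository. It assumes a square weight matrix with no bias, initialized to the identity, trained with a Euclidean triplet margin loss and one randomly sampled negative per query. All function names (`identity`, `apply_adapter`, `train_step`, `train_adapter`) and the hyperparameter defaults are hypothetical, chosen for illustration only.

```python
import math
import random


def identity(dim):
    """Start the adapter as a no-op so training begins from baseline retrieval."""
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]


def apply_adapter(W, query):
    """Adapted query = W @ query. Document embeddings never pass through W."""
    return [sum(w * x for w, x in zip(row, query)) for row in W]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(W, query, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    anchor = apply_adapter(W, query)
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)


def train_step(W, query, positive, negative, margin=1.0, lr=0.05, eps=1e-12):
    """One SGD step on a single triplet. Updates W in place, returns the pre-step loss."""
    anchor = apply_adapter(W, query)
    d_pos = distance(anchor, positive)
    d_neg = distance(anchor, negative)
    loss = d_pos - d_neg + margin
    if loss <= 0.0:
        return 0.0  # margin already satisfied, so the gradient is zero
    # dL/d(anchor) = (anchor - positive) / d_pos - (anchor - negative) / d_neg
    grad_anchor = [
        (a - p) / (d_pos + eps) - (a - n) / (d_neg + eps)
        for a, p, n in zip(anchor, positive, negative)
    ]
    # dL/dW is the outer product of dL/d(anchor) and the raw query embedding
    for i, g in enumerate(grad_anchor):
        for j, x in enumerate(query):
            W[i][j] -= lr * g * x
    return loss


def train_adapter(queries, positive_ids, docs, epochs=20, seed=0, **step_kwargs):
    """Train W on (query, relevant document) pairs with random negative sampling."""
    rng = random.Random(seed)
    W = identity(len(docs[0]))
    for _ in range(epochs):
        for query, pos_id in zip(queries, positive_ids):
            # Random negative: any document other than the known relevant one
            neg_id = rng.choice([i for i in range(len(docs)) if i != pos_id])
            train_step(W, query, docs[pos_id], docs[neg_id], **step_kwargs)
    return W
```

At query time only the query embedding is multiplied by the trained matrix before the vector search, which is why the stored document embeddings can stay untouched. In practice one would more likely build this on a tensor library, for example PyTorch's `torch.nn.Linear` and `torch.nn.TripletMarginLoss`, rather than hand-written gradients.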