The New RAG Method in Claude 3.5: Contextual Retrieval
Improve AI accuracy with Contextual Retrieval in Claude 3.5. Learn how RAG extends a model's knowledge and how Contextual Retrieval cuts failed retrievals by up to 67%.
To make AI models work effectively in specific environments, they often need to access background knowledge.
For example, a customer support chatbot must understand the business it serves, while a legal analysis bot needs to know past cases.
Developers usually use Retrieval-Augmented Generation (RAG) to give a model this knowledge.
RAG retrieves relevant information from a knowledge base and adds it to the user’s prompt, so the model can ground its response in that material.
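The retrieve-then-augment flow described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the function names (`score`, `retrieve`, `build_prompt`), the sample knowledge base, and the prompt wording are all hypothetical, and the token-overlap score is only a toy stand-in for the embedding similarity a real RAG system would typically use.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, chunk: str) -> int:
    """Toy relevance score: count of tokens shared by query and chunk.

    Assumption: a real system would use embedding similarity or BM25 here.
    """
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with the highest score for the query."""
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Add the retrieved chunks to the user's question."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Use the following context to answer.\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base for a customer support chatbot.
knowledge_base = [
    "Refund requests are accepted within 30 days of purchase.",
    "The support line is open Monday to Friday.",
    "Shipping takes 3 to 5 business days.",
]

prompt = build_prompt("How do I request a refund for a purchase?", knowledge_base, k=1)
print(prompt)
```

The resulting `prompt` string is what would be sent to the model in place of the bare question.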
The issue is that traditional RAG solutions often lose context when encoding information, so the system fails to retrieve data that is actually relevant.
This article outlines a method called "Contextual Retrieval" that greatly improves RAG’s retrieval step. It uses two sub-techniques: contextual embeddings and contextual BM25.
This method reduces failed retrievals by 49%, and by 67% when combined with re-ranking. These are significant gains in retrieval accuracy, which translate into better performance on downstream tasks.
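As a rough illustration of the idea behind both sub-techniques: a short, chunk-specific context is prepended to each chunk before it is indexed, so that the chunk still carries information about the document it came from. The sketch below shows this for the BM25 side only, using a small hand-written BM25 scorer. Everything in it is an assumption made for illustration: the sample chunks, the `add_context` helper (which simply reuses a document title, whereas the method itself relies on a model to write the context), and the BM25 parameters. For contextual embeddings, the same contextualized text would be passed to an embedding model instead.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every doc against the query with a basic Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (log of 1 + ratio).
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            total += idf * tf[term] * (k1 + 1) / norm
        scores.append(total)
    return scores


def add_context(doc_title: str, chunk: str) -> str:
    """Hypothetical stand-in for model-generated context: reuse the doc title."""
    return f"This chunk is from the {doc_title}. {chunk}"


# Hypothetical chunks from two different reports; neither names its company.
chunks = [
    ("ACME Corp Q2 2023 report", "Revenue grew 3% over the previous quarter."),
    ("Globex Inc Q2 2023 report", "Revenue fell 1% over the previous quarter."),
]
query = "How did ACME revenue change?"

plain = bm25_scores(query, [text for _, text in chunks])
contextual = bm25_scores(query, [add_context(title, text) for title, text in chunks])

print("plain:", plain)
print("contextual:", contextual)
```

In this toy setup the plain chunks match the query only on "revenue", so BM25 cannot tell the two reports apart. Once the context is prepended, the ACME chunk also matches on the company name and ranks first.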