RAG(E) Deployments
Retrieval Augmented Generation… & Evaluation!
Discover how the RAG and RAG(E) frameworks combine retrieval and generation for dynamic, accurate AI insights. They adapt without retraining, ensuring timely, well-informed responses from LLMs.
Building RAG applications
RAG is an AI framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information and to give users insight into LLMs’ generative process.
RAG builds upon prompt engineering by supplementing prompts with information from external sources such as vector databases or APIs. This data is incorporated into the prompt before it is submitted to the LLM.
This makes RAG well suited to situations where facts evolve over time, since an LLM's parametric knowledge is frozen at training time. By retrieving fresh information at query time, RAG lets language models produce reliable, up-to-date outputs without retraining.
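The retrieve-then-prompt flow can be sketched in a few lines. This is a minimal illustration, not a production pipeline: real deployments use a vector database and embedding search rather than the naive keyword-overlap retrieval below, and the documents here are made up.

```python
# Minimal RAG sketch: retrieve relevant text, then fold it into the prompt.
# Real systems replace `retrieve` with vector search over a vector DB.

DOCS = [
    "The 2024 pricing tier for the Pro plan is $49/month.",
    "RAG grounds LLM answers in retrieved documents.",
    "Support hours are 9am-5pm UTC on weekdays.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query; return the top k."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved context so the LLM answers from fresh facts."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using ONLY this context:\n{ctx}\n\nQuestion: {query}"

context = retrieve("What is the Pro plan pricing?", DOCS)
prompt = build_prompt("What is the Pro plan pricing?", context)
```

The augmented `prompt` is what actually gets submitted to the LLM; the model's weights are never touched.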
RAG(E) – for better-quality responses
We deploy an Evaluator LLM to score the quality of the response against the retrieved context. It can also score other dimensions such as hallucination (does the generated answer use only information from the provided context?), toxicity, and so on.
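As a rough stand-in for the hallucination dimension, here is a crude groundedness proxy: the fraction of answer tokens that also appear in the context. An actual Evaluator LLM would judge this with a rubric prompt rather than token overlap; this sketch only illustrates the kind of score being produced.

```python
# Crude groundedness proxy: "does the answer use only information from the
# provided context?" A real Evaluator LLM replaces this with a judge prompt.

def groundedness(answer: str, context: str) -> float:
    """Fraction of answer tokens that appear in the context (0.0 to 1.0)."""
    ans = set(answer.lower().split())
    ctx = set(context.lower().split())
    return len(ans & ctx) / max(len(ans), 1)

context = "the pro plan costs $49 per month"
good = groundedness("the pro plan costs $49 per month", context)       # fully grounded
bad = groundedness("the enterprise plan costs $499 annually", context)  # partly invented
```

A response scoring low on this dimension would be flagged or regenerated before reaching the user.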
Open-source models perform well on simple queries whose answers can be inferred directly from the retrieved context, but they fall short on queries that involve reasoning, numbers, or code examples.
To pick the appropriate LLM for each query, we recommend training a classifier that takes the query and routes it to the best-suited LLM.
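The routing idea can be sketched with a keyword heuristic standing in for the trained classifier; the model names below are placeholders, not real endpoints.

```python
# Hypothetical query router. In production this is a trained classifier;
# here a keyword heuristic sketches the same routing decision.

def route(query: str) -> str:
    """Send reasoning/math/code queries to a stronger (costlier) model,
    and everything else to a cheaper open-source model."""
    hard_signals = ("why", "calculate", "code", "compare", "prove")
    if any(word in query.lower() for word in hard_signals):
        return "large-reasoning-model"   # placeholder name
    return "open-source-model"           # placeholder name

choice = route("Calculate the total cost of 3 Pro seats")
```

A trained classifier generalizes far better than keywords, but the interface is the same: query in, model choice out.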
E = Evaluation!
It is critical to perform both unit/component and end-to-end evaluation. Unit evaluation assesses retrieval in isolation (is the best source among the retrieved chunks?) and the LLM's response in isolation (given the best source, can the LLM produce a quality answer?).
End-to-end evaluation assesses the whole system: given the data sources, what is the quality of the final response?
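The unit-level retrieval check can be made concrete as a hit rate @ k over a small labeled set mapping each query to the id of its gold source chunk. The queries and chunk ids below are illustrative.

```python
# Unit-level retrieval evaluation: hit rate @ k over a labeled set.
# `results` maps query -> ranked retrieved chunk ids; `gold` maps
# query -> id of the known-best source chunk (ids are illustrative).

def hit_rate_at_k(results: dict[str, list[str]],
                  gold: dict[str, str], k: int = 5) -> float:
    """Share of queries whose gold chunk appears in the top-k results."""
    hits = sum(1 for q, ids in results.items() if gold[q] in ids[:k])
    return hits / len(results)

retrieved = {"q1": ["c3", "c7", "c1"], "q2": ["c2", "c9"]}
labels = {"q1": "c7", "q2": "c5"}
score = hit_rate_at_k(retrieved, labels, k=3)  # q1 is a hit, q2 a miss -> 0.5
```

End-to-end evaluation then replaces the gold-chunk check with a judgment on the final answer, typically produced by the Evaluator LLM described above.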
Routing
- Building the most performant and cost-effective solution.
- The right LLM for the right job – routing queries to the best-suited LLM according to their complexity or topic.
Our solution accelerators
Get Started With AI Experts
Write to us to explore how LLM applications can be built for your business.
