RAG: Improving your Knowledge Bank retrieval#

Once you have created a Knowledge Bank and used it as the base of your RAG (for example, after following the Programmatic RAG with Dataiku’s LLM Mesh tutorial), you can improve the quality of the retrieval with an additional pre-retrieval step.

Prerequisites#

  • Dataiku >= 13.4

  • An OpenAI connection

  • Python >= 3.9

  • A code environment with the following packages:

    langchain           #tested with 0.3.13
    langchain-chroma    #tested with 0.1.4
    langchain-community #tested with 0.3.13
    langchain-core      #tested with 0.3.63
    

Additionally, this tutorial starts from the RAG developed in the Programmatic RAG with Dataiku’s LLM Mesh tutorial, so you will also need the prerequisites described there.

Introduction#

When a RAG processes a user prompt, the workflow usually starts by querying the vector store and using the results to enrich the context of the LLM query. This tutorial explains how to improve the final answer by enhancing the initial prompt before retrieval. This additional step clarifies, rephrases, and expands the original prompt so that retrieval returns a broader set of relevant documents, which in turn improves the precision of the final answer.

Starter code for your RAG#

The code below is starter code for your RAG. It first gets a handle on the Knowledge Bank holding the embedded documents, then uses the corresponding vector store to run an enriched LLM query.

initial_rag.py
Code 1: Starter code for your RAG#
import dataiku
from dataiku.langchain.dku_llm import DKUChatLLM
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

LLM_ID = "<fill with your LLM Id>"
KB_ID = "<fill with your Knowledge Bank Id>"

# Retrieve the vector store through the Knowledge Bank
client = dataiku.api_client()
project = client.get_default_project()
kb = dataiku.KnowledgeBank(id=KB_ID, project_key=project.project_key)
vector_store = kb.as_langchain_vectorstore()

# Create the LLM access
dkullm = DKUChatLLM(llm_id=LLM_ID, temperature=0)
system_prompt = """Always state when an answer is unknown. Do not guess or fabricate a response.
    {context}"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
# Create the chain that will combine documents in the context with the prompt
question_answer_chain = create_stuff_documents_chain(dkullm, prompt)

# An example user query
user_query = "What will inflation in Europe look like and why?"

# First, perform a similarity search with the vector store
search_results = vector_store.similarity_search(user_query, k=10)

# Run the enriched query
resp = question_answer_chain.invoke({"context": search_results, "input": user_query})
print(resp)
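
To sanity-check what the retrieval step feeds the LLM, you can inspect the search results. The snippet below is a minimal sketch: similarity_search() returns LangChain Document objects, each exposing a page_content string and a metadata dictionary.

# Inspect the retrieved chunks before they are stuffed into the context
for i, doc in enumerate(search_results, start=1):
    print(f"--- Document {i} ---")
    print(doc.page_content[:200])  # first 200 characters of the chunk
    print(doc.metadata)            # metadata stored at embedding time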

Rewriting the query#

If you want to improve the answers from your RAG system, you can start by improving the original query. The goal is to design a system prompt that guides the rewriting process, clarifying and expanding the original input. The following code shows how to add this rewriting step.

Code 2: Add a query rewriting before the RAG query#
import dataiku
from dataiku.langchain.dku_llm import DKUChatLLM
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

LLM_ID = "<fill with your LLM Id>"
KB_ID = "<fill with your Knowledge Bank Id>"

# Retrieve the vector store through the Knowledge Bank
client = dataiku.api_client()
project = client.get_default_project()
kb = dataiku.KnowledgeBank(id=KB_ID, project_key=project.project_key)
vector_store = kb.as_langchain_vectorstore()

# Get access to your LLM
llm = project.get_llm(LLM_ID)

# Define the system prompt that guides the rewriting of the query
improve_system = """You are a helpful assistant improving search quality. 
Rewrite the following query to make it more specific, detailed, and clear for a document search system."""

# An example user query
user_query = "What will inflation in Europe look like and why?"

# Query your LLM to obtain a rephrased query
improve = llm.new_completion()
improve.settings["temperature"] = 0.1
improve.with_message(message=improve_system, role="system")
improve.with_message(message=user_query, role="user")
resp = improve.execute()
improved_query = resp.text

print(f"Original query is:\n {user_query}")
print(f"Improved query is:\n {improved_query}")

# Create the LLM access
dkullm = DKUChatLLM(llm_id=LLM_ID, temperature=0)
system_prompt = """Always state when an answer is unknown. Do not guess or fabricate a response.
    {context}"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
# Create the chain that will combine documents in the context with the prompt
question_answer_chain = create_stuff_documents_chain(dkullm, prompt)

# First, perform a similarity search with the vector store
search_results = vector_store.similarity_search(improved_query, k=10)

# Run the enriched query
resp = question_answer_chain.invoke(
    {"context": search_results, "input": improved_query}
)
print(resp)

For example, an original query like “inflation in Europe” may be rewritten as “What are the projected inflation trends in Europe for the next year, and what are the key factors influencing these trends?”. You should tailor the system prompt so that the rewriting matches the context of your project.
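
For instance, if your Knowledge Bank contains economic reports, the rewriting prompt could be specialized as shown below. This prompt is purely illustrative (the domain and the example rewrite are assumptions); adapt both to your own corpus.

# A hypothetical, domain-tailored rewriting prompt (illustrative only)
improve_system = """You are a search assistant for a corpus of economic reports.
Rewrite the user's query to make it specific, detailed, and clear for a document
search system. Make the time horizon, the region, and the economic drivers
explicit when the query implies them. Return only the rewritten query.

Example:
User: inflation in Europe
Rewritten: What are the projected inflation trends in Europe for the next year,
and what are the key factors influencing these trends?"""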

Wrapping up#

Congratulations! You can now improve the results coming from your RAG. Depending on your use case, you may also apply techniques such as semantic enrichment or multi-query expansion.
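
As an illustration, here is a minimal sketch of multi-query expansion, reusing the llm and vector_store objects from Code 2: the LLM generates a few alternative phrasings of the query, each phrasing is used for retrieval, and the results are deduplicated before building the context.

# Minimal sketch of multi-query expansion, reusing the llm and
# vector_store objects defined in Code 2
expand = llm.new_completion()
expand.settings["temperature"] = 0.3
expand.with_message(
    message="Generate three alternative phrasings of the user's query, one per line.",
    role="system",
)
expand.with_message(message=user_query, role="user")
variants = [q.strip() for q in expand.execute().text.splitlines() if q.strip()]

# Retrieve documents for the original query and every variant,
# deduplicating on the chunk content
seen, expanded_results = set(), []
for query in [user_query] + variants:
    for doc in vector_store.similarity_search(query, k=5):
        if doc.page_content not in seen:
            seen.add(doc.page_content)
            expanded_results.append(doc)

# expanded_results can then replace search_results when invoking the chain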