Creating and using a Code Agent#

A Code Agent is software that interacts with its environment. Agents can be built and deployed for different use cases, and Dataiku provides several default implementations. This tutorial focuses on the simple tool-calling agent implementation. The use case is the same as in LLM Mesh agentic applications, Building and using an agent with Dataiku’s LLM Mesh and Langchain, and How to create a custom tool for integration into a visual agent: retrieving customer information from a provided ID, then fetching additional data about the customer’s company through an internet search.

Prerequisites#

  • Dataiku >= 13.4

  • An OpenAI connection

  • Python >= 3.10

  • A code environment with the following packages:

    langchain_core    #tested with 0.2.2
    langchain         #tested with
    duckduckgo_search # tested with 7.5.3
    
  • An SQL Dataset named pro_customers_sql. You can create this dataset by uploading this CSV file.
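
The tools defined later in this tutorial read the id, name, job, and company columns of this dataset. As a rough way to try the lookup logic outside Dataiku, the sketch below uses an in-memory SQLite table as a stand-in. The table name pro_customers, the function lookup_customer, and the sample row are all invented for illustration, and the query uses SQLite parameter binding rather than Dataiku's SQLExecutor2.

```python
import sqlite3

# Hypothetical stand-in for the pro_customers_sql dataset; the row below is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pro_customers (id TEXT, name TEXT, job TEXT, company TEXT)")
conn.execute(
    "INSERT INTO pro_customers VALUES (?, ?, ?, ?)",
    ("jdoe", "Jane Doe", "Analyst", "Example Corp"),
)


def lookup_customer(connection, customer_id):
    """Return a one-sentence summary for customer_id, or a not-found message."""
    # The "?" placeholder lets the driver handle quoting of the customer id.
    row = connection.execute(
        "SELECT name, job, company FROM pro_customers WHERE id = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        return f"No information can be found about the customer {customer_id}"
    name, job, company = row
    return f"The customer's name is \"{name}\", holding the position \"{job}\" at the company named {company}"


print(lookup_customer(conn, "jdoe"))
print(lookup_customer(conn, "x' OR '1'='1"))
```

The second call passes an ID containing single quotes; because the value is bound as a parameter, it is treated as an ordinary (unknown) ID.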

Creating the Code Agent#

To create a Code Agent, go to the project’s Flow, click the Other button at the bottom right, select Generative AI, and click Code Agent. Choose a meaningful name for the agent, then click the OK button.

Fig. 1: Modal window for Code Agent creation.

In the modal window shown in Figure 1, choose the “Simple tool-calling agent” and click the Create button to open the agent’s coding environment, shown in Figure 2.

Fig. 2: Code Agent -- Code environment.

This window contains four tabs:

  • Code: where you will code your agent.

  • Settings: where you will notably set up the code environment used by the agent.

  • Logs: where you will find the agent’s logs.

  • Quick test: where you can test your code agent.

Coding the Agent#

Before coding the agent, select the Settings tab and set Code env to the code environment you created for this tutorial. Then return to the Code tab and start coding your agent. Dataiku provides a code sample to help you get started. Like Visual Agents, Code Agents rely on tools to answer your queries. There are three ways to use a tool in a Code Agent:

  • Embedded into the Code Agent.

  • Using a function defined in the project library.

  • Using Custom Tools.

Creating the tools#

To create the tools (Get Customer Info and Get Company Info), replace the default code with the code provided in Code 1.

Code 1: Embedded tools#
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, ToolMessage
import dataiku
from dataiku import SQLExecutor2
from duckduckgo_search import DDGS
from dataiku.langchain.dku_llm import DKUChatLLM
from dataiku.llm.python import BaseLLM

OPENAI_CONNECTION_NAME = "REPLACE_WITH_YOUR_CONNECTION_NAME" # example: "openAI"

@tool
def get_customer_details(customer_id: str) -> str:
    """Get customer name, position and company information from database.
    The input is a customer id (stored as a string).
    The output is a string of the form:
        "The customer's name is \"{name}\", holding the position \"{job}\" at the company named {company}"
    """
    dataset = dataiku.Dataset("pro_customers_sql")
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    # Escape single quotes by doubling them, the standard SQL way
    customer_id = customer_id.replace("'", "''")
    query_reader = executor.query_to_iter(
        f"""SELECT name, job, company FROM "{table_name}" WHERE id = '{customer_id}'""")
    for (name, job, company) in query_reader.iter_tuples():
        return f"The customer's name is \"{name}\", holding the position \"{job}\" at the company named {company}"
    return f"No information can be found about the customer {customer_id}"

@tool
def search_company_info(company_name: str) -> str:
    """
    Use this tool when you need to retrieve information on a company.
    The input of this tool is the company name.
    The output is either a short recap of the company or "No information ...", meaning that no information could be found about this company
    """
    with DDGS() as ddgs:
        results = list(ddgs.text(f"{company_name} (company)", max_results=1))
        if results:
            # Return a plain string so the result can be used as ToolMessage content
            return f"Information found about {company_name}: {results[0]['body']}"
        return f"No information found about {company_name}"
    
tools = [get_customer_details, search_company_info]


class MyLLM(BaseLLM):
    def __init__(self):
        pass

    def process(self, query, settings, trace):
        prompt = query["messages"][0]["content"]

        llm = DKUChatLLM(llm_id=f"openai:{OPENAI_CONNECTION_NAME}:gpt-4o-mini")
        llm_with_tools = llm.bind_tools(tools)

        messages = [HumanMessage(prompt)]

        with trace.subspan("Invoke LLM with tools") as subspan:
            ai_msg = llm_with_tools.invoke(messages)

        tool_messages = []

        with trace.subspan("Call the tools") as tools_subspan:
            for tool_call in ai_msg.tool_calls:
                with trace.subspan("Call a tool") as tool_subspan:
                    tool_subspan.attributes["tool_name"] = tool_call["name"]
                    tool_subspan.attributes["tool_args"] = tool_call["args"]
                    # Run the requested tool through LangChain's invoke() interface
                    if tool_call["name"] == "get_customer_details":
                        tool_output = get_customer_details.invoke(tool_call["args"])
                    else:
                        tool_output = search_company_info.invoke(tool_call["args"])
                tool_messages.append(ToolMessage(tool_call_id=tool_call["id"], content=tool_output))

        messages = [
            HumanMessage(prompt),
            ai_msg,
            *tool_messages
        ]

        with trace.subspan("Compute final answer") as subspan:
            final_resp = llm_with_tools.invoke(messages)
            return {"text": final_resp.content}
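
The if/else dispatch in Code 1 sends every tool call that is not get_customer_details to search_company_info. One possible alternative, sketched here with plain functions standing in for the LangChain tools, is a name-to-function lookup that also reports unknown tool names. TOOLS_BY_NAME and run_tool_calls are hypothetical names, the tool bodies are placeholders, and the tool-call dictionaries only mimic the name, args, and id keys used in Code 1.

```python
from typing import Callable, Dict, List


def get_customer_details(customer_id: str) -> str:
    # Placeholder body; the real tool queries the SQL dataset.
    return f"details for {customer_id}"


def search_company_info(company_name: str) -> str:
    # Placeholder body; the real tool runs an internet search.
    return f"info about {company_name}"


# Map each tool name to the function that implements it.
TOOLS_BY_NAME: Dict[str, Callable[..., str]] = {
    func.__name__: func for func in (get_customer_details, search_company_info)
}


def run_tool_calls(tool_calls: List[dict]) -> List[dict]:
    """Run each requested tool and pair its output with the originating call id."""
    outputs = []
    for call in tool_calls:
        func = TOOLS_BY_NAME.get(call["name"])
        if func is None:
            # Tell the model the tool does not exist instead of guessing.
            content = f"Unknown tool: {call['name']}"
        else:
            content = func(**call["args"])
        outputs.append({"tool_call_id": call["id"], "content": content})
    return outputs


calls = [
    {"name": "get_customer_details", "args": {"customer_id": "tcook"}, "id": "call_1"},
    {"name": "search_company_info", "args": {"company_name": "Example Corp"}, "id": "call_2"},
    {"name": "not_a_tool", "args": {}, "id": "call_3"},
]
for result in run_tool_calls(calls):
    print(result)
```

With LangChain tools in place of the plain functions, the call inside the loop would go through the tool's invoke method with call["args"] rather than keyword unpacking.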
    

Testing the Code Agent#

However you define your Code Agent, you can test it in the Quick test tab by entering the following test query:

{
  "messages": [
      {
        "role": "user",
        "content": "Give all the professional information you can about the customer with ID: tcook. Also, include information about the company if you can."
      }
  ],
  "context": {}
}
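
To make the link between this query and the agent code more concrete, the sketch below parses the same JSON and extracts the prompt. Code 1 reads it with query["messages"][0]["content"]; the helper extract_prompt is a hypothetical, slightly more defensive variant that picks the most recent message whose role is "user". It runs in plain Python and does not depend on Dataiku.

```python
import json

RAW_QUERY = """
{
  "messages": [
      {
        "role": "user",
        "content": "Give all the professional information you can about the customer with ID: tcook. Also, include information about the company if you can."
      }
  ],
  "context": {}
}
"""


def extract_prompt(query: dict) -> str:
    """Return the content of the most recent user message in the query."""
    user_messages = [m for m in query.get("messages", []) if m.get("role") == "user"]
    if not user_messages:
        raise ValueError("The query contains no user message")
    return user_messages[-1]["content"]


query = json.loads(RAW_QUERY)
prompt = extract_prompt(query)

# For a single-message query, this matches what Code 1 reads.
assert prompt == query["messages"][0]["content"]
print(prompt)
```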

Wrapping up#

Congratulations! You now know how to create a Code Agent, define its tools, and test it from the Quick test tab.

You can test other functions, mix the different approaches, or create many other Code Agents that fit your needs.