Creating an Inline Python Tool#
You can develop and test your code in an Inline Python Tool before turning it into a Code Agent or a Custom Tool. An Inline Python Tool lets you keep a simple development workflow by writing your Python code inline. The resulting tool is available in your project, and only a few straightforward steps are needed to turn it into a shared tool such as a Custom Tool or a Code Agent.
Prerequisites#
Dataiku >= 14.0
An OpenAI connection (or an equivalent LLM Mesh connection)
Python >= 3.10
A code environment with the following packages:
langchain_core     #tested with 0.3.60
langchain          #tested with 0.3.25
duckduckgo_search  #tested with 8.0.2
An SQL Dataset named pro_customers_sql. You can create this dataset by uploading this CSV file and using a Sync recipe to store the data in an SQL connection.
Creating the Inline Python Tool#
To create an Inline Python Tool, go to the GenAI menu, select Agent Tools, and click the New Agent Tool button. Then select Code Tool, give it a meaningful name, such as Get Customer Info, and click the Create button.

The window you reach contains three tabs:
Code: where you will code your tool.
Settings: where you will notably set up the code environment the tool uses.
Quick test: where you can test your Inline Python Tool.
Coding the Tool#
Note
The code used in this tutorial is the same as the one used to define the first tool in the Custom Tool tutorial. When creating your Inline Python Tool, you are building the code that may be used in other tools and agents.
In the Code tab, you can write the code of your specific tool. You can start from one of the templates available by clicking the Use code template button. For this tutorial, we will code a tool to Get Customer Info. Replace the default code with the code provided in Code 1.
from dataiku.llm.agent_tools import BaseAgentTool
import logging
import dataiku
from dataiku import SQLExecutor2
from dataiku.sql import Constant, toSQL, Dialects


class DatasetLookupTool(BaseAgentTool):
    def set_config(self, config, plugin_config):
        self.logger = logging.getLogger(__name__)
        self.config = config
        self.plugin_config = plugin_config

    def get_descriptor(self, tool):
        return {
            "description": """Provide a name, job title and company of a customer, given the customer's ID""",
            "inputSchema": {
                "title": "Input for a customer id",
                "type": "object",
                "properties": {
                    "id": {
                        "type": "string",
                        "description": "The customer Id"
                    }
                }
            }
        }

    def invoke(self, input, trace):
        self.logger.setLevel(logging.DEBUG)
        self.logger.debug(input)
        args = input["input"]
        customerId = args["id"]
        dataset = dataiku.Dataset("pro_customers_sql")
        table_name = dataset.get_location_info().get('info', {}).get('quotedResolvedTableName')
        executor = SQLExecutor2(dataset=dataset)
        cid = Constant(str(customerId))
        escaped_cid = toSQL(cid, dialect=Dialects.POSTGRES)  # Replace by your DB
        query_reader = executor.query_to_iter(
            f"""SELECT "name", "job", "company" FROM {table_name} WHERE "id" = {escaped_cid}""")
        for (name, job, company) in query_reader.iter_tuples():
            return {
                "output": f"""The customer's name is "{name}", holding the position "{job}" at the company named "{company}"."""}
        return {"output": f"No information can be found about the customer {customerId}"}
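To experiment with the lookup logic outside Dataiku, the following minimal sketch reproduces the same pattern (query one row by ID, return a formatted sentence, fall back to a "not found" message) against an in-memory SQLite table. The lookup_customer helper, the pro_customers table, and its sample row are assumptions made for illustration only; inside Dataiku, the tool relies on SQLExecutor2 as shown in Code 1.

```python
import sqlite3


def lookup_customer(conn, customer_id):
    """Return the same kind of payload as the tool's invoke() method."""
    # A bound parameter plays the role of the escaped constant used in the tool.
    cursor = conn.execute(
        'SELECT "name", "job", "company" FROM pro_customers WHERE "id" = ?',
        (str(customer_id),),
    )
    row = cursor.fetchone()
    if row is None:
        return {"output": f"No information can be found about the customer {customer_id}"}
    name, job, company = row
    return {
        "output": f"""The customer's name is "{name}", holding the position "{job}" at the company named "{company}"."""
    }


# Hypothetical sample data, used only for this local experiment.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE pro_customers ("id" TEXT, "name" TEXT, "job" TEXT, "company" TEXT)')
conn.execute("INSERT INTO pro_customers VALUES ('jdoe', 'Jane Doe', 'Analyst', 'Example Corp')")

print(lookup_customer(conn, "jdoe")["output"])
print(lookup_customer(conn, "unknown")["output"])
```

Keeping the "found" and "not found" branches in one small function makes both outcomes easy to check before moving the logic into the tool.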
Testing the Inline Python Tool#
Once you have your code, you can test it in the Quick test tab by entering the following test query:
{
    "input": {
        "id": "tcook"
    },
    "context": {}
}
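Before pasting a query into the Quick test tab, it can help to check it against the tool's input schema. The sketch below is a hypothetical helper, not part of the Dataiku API: it only handles string properties, and it treats every declared property as required, on the assumption that invoke() reads args["id"] unconditionally.

```python
import json

# Input schema copied from the tool's get_descriptor() method.
INPUT_SCHEMA = {
    "title": "Input for a customer id",
    "type": "object",
    "properties": {
        "id": {"type": "string", "description": "The customer Id"}
    },
}


def check_payload(payload_text, schema):
    """Return a list of problems found in a test query (empty if none)."""
    payload = json.loads(payload_text)
    args = payload.get("input")
    if not isinstance(args, dict):
        return ['"input" must be a JSON object']
    problems = []
    for key, spec in schema["properties"].items():
        if key not in args:
            problems.append(f'missing property "{key}"')
        elif spec.get("type") == "string" and not isinstance(args[key], str):
            problems.append(f'property "{key}" must be a string')
    return problems


quick_test = '{"input": {"id": "tcook"}, "context": {}}'
print(check_payload(quick_test, INPUT_SCHEMA))
```

An empty list means the query has the shape the tool expects.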
Call your Inline Python Tool from your application#
After creating your tool, you can call it from your application's code. This is a two-step process: get the tool's identifier, then use it to call the tool.
First, you need to get the identifier of your Inline Python Tool. To list all the agent tools that have been defined in a project, you can use the list_agent_tools() method and search for your tool.
import dataiku
client = dataiku.api_client()
project = client.get_default_project()
project.list_agent_tools()
Running this code snippet will provide a list of all tools defined in the project. You should see your tool in this list:
[{'id': 'xg3bQfN',
'type': 'InlinePython',
'name': 'inline-tool'},
{'id': 'REDaiQN',
'type': 'Custom_agent_tool_toolbox_internet-search',
'name': 'Get Company Info'},
{'id': 'SOy7zKq',
'type': 'Custom_agent_tool_toolbox_dataset-lookup',
'name': 'Get Customer Info'}]
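Rather than reading the list by eye, you can filter it in code. The sketch below assumes each entry behaves like a dictionary with the id, type, and name keys shown in the output above; find_tool_ids is a hypothetical helper, and the hard-coded list stands in for the result of project.list_agent_tools().

```python
def find_tool_ids(tools, name=None, tool_type=None):
    """Return the IDs of the tools matching the given name and/or type."""
    return [
        tool["id"]
        for tool in tools
        if (name is None or tool["name"] == name)
        and (tool_type is None or tool["type"] == tool_type)
    ]


# Sample data copied from the listing above; in a project, pass
# project.list_agent_tools() instead.
tools = [
    {"id": "xg3bQfN", "type": "InlinePython", "name": "inline-tool"},
    {"id": "REDaiQN", "type": "Custom_agent_tool_toolbox_internet-search", "name": "Get Company Info"},
    {"id": "SOy7zKq", "type": "Custom_agent_tool_toolbox_dataset-lookup", "name": "Get Customer Info"},
]

print(find_tool_ids(tools, tool_type="InlinePython"))
```

Returning a list rather than a single ID makes it visible when no tool, or more than one, matches the filter.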
Once you know the tool’s ID, you can use it to call the tool, as shown in the code below:
import dataiku
client = dataiku.api_client()
project = client.get_default_project()
tool = project.get_agent_tool('xg3bQfN')
result = tool.run({"id": "fdouetteau"})
print(result['output'])
The customer's name is "Florian Douetteau", holding the position "CEO" at the company named "Dataiku".
Using this type of code, you can call any of your Inline Python Tools at the right place in your application’s code.
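If you call the tool from several places, a thin wrapper can keep the call and the result check together. This is a minimal sketch that relies only on what the example above shows: run() accepts a dictionary and returns a dictionary with an output key. The customer_summary function and the FakeTool class are hypothetical; FakeTool replaces the handle returned by project.get_agent_tool() so that the snippet runs outside Dataiku.

```python
def customer_summary(tool, customer_id):
    """Run the tool for one customer and return its text output."""
    result = tool.run({"id": customer_id})
    if "output" not in result:
        raise ValueError(f"Unexpected tool result: {result!r}")
    return result["output"]


class FakeTool:
    """Stand-in for the handle returned by project.get_agent_tool()."""

    def run(self, payload):
        return {"output": f"No information can be found about the customer {payload['id']}"}


print(customer_summary(FakeTool(), "unknown"))
```

In a project, you would pass the real handle, for example customer_summary(project.get_agent_tool('xg3bQfN'), "fdouetteau").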
Call a Headless API from an Inline Python Tool#
After following the Headless API tutorial, you will have an application that can query an LLM from an API. An Inline Python Tool can use a headless API webapp to implement a query.
As we did for the first tool, to create a new Inline Python Tool, go to the GenAI menu, select Agent Tools, and click the New Agent Tool button. Then select Code Tool, give it a meaningful name, such as API tool, and click the Create button.
To call your API, you need to get the identifier of your headless API webapp. You can use the list_webapps() method to search for your webapp.
import dataiku
client = dataiku.api_client()
project = client.get_default_project()
for webapp in project.list_webapps():
    print(f"WebApp id: {webapp['id']} name: {webapp['name']}")
Running this code snippet will provide a list of all webapps defined in the project. You should see your headless API webapp in this list:
WebApp id: aRdCgN0 name: std
WebApp id: cpxmkji name: headless api
WebApp id: fzUJ5Bw name: llm based
WebApp id: iNFGFHN name: Uploading
Note the ID of your webapp. Then, in the Code tab of your new tool, write the code provided in Code 5, replacing the value of WEBAPP_ID with your own ID.
import dataiku
from dataiku.llm.agent_tools import BaseAgentTool
import logging


class ApiAgentTool(BaseAgentTool):
    """A code-based agent tool that queries a headless API"""

    def set_config(self, config, plugin_config):
        self.logger = logging.getLogger(__name__)

    def get_descriptor(self, tool):
        """
        Returns the descriptor of the tool, as a dict containing:
        - description (str)
        - inputSchema (dict, a JSON Schema representation)
        """
        return {
            "description": "Provide a prompt to use to query the headless API",
            "inputSchema": {
                "$id": "",
                "title": "",
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "The prompt to use in the API call"
                    }
                }
            }
        }

    def invoke(self, input, trace):
        """
        Invokes the tool.
        The arguments of the tool invocation are in input["input"], a dict
        """
        self.logger.setLevel(logging.DEBUG)
        self.logger.debug(input)
        WEBAPP_ID = "cpxmkji"  # Replace with the ID of your headless API webapp
        client = dataiku.api_client()
        project = client.get_default_project()
        webapp = project.get_webapp(WEBAPP_ID)
        backend = webapp.get_backend_client()
        backend.session.headers['Content-Type'] = 'application/json'
        prompt = input['input']['prompt']
        # Query the LLM
        response = backend.session.post(backend.base_url + '/query', json={'message': prompt})
        if response.ok:
            return {"output": response.text}
        else:
            return {"output": f"An error occurred: {response.status_code} {response.reason}"}
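The last lines of invoke() turn the HTTP response into the tool's output. If you want to check both branches without a running webapp, one option is to move that step into a small function and feed it stand-in responses. The to_tool_output helper and the SimpleNamespace objects below are assumptions for illustration; they only mimic the ok, text, status_code, and reason attributes used in the code above.

```python
from types import SimpleNamespace


def to_tool_output(response):
    """Convert an HTTP response into the dictionary returned by the tool."""
    if response.ok:
        return {"output": response.text}
    return {"output": f"An error occurred: {response.status_code} {response.reason}"}


# Stand-in responses exposing only the attributes the helper reads.
success = SimpleNamespace(ok=True, text="Hello", status_code=200, reason="OK")
failure = SimpleNamespace(ok=False, text="", status_code=404, reason="Not Found")

print(to_tool_output(success))
print(to_tool_output(failure))
```

Returning the error as text, as the tool does, lets the caller see what went wrong without raising an exception.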
Your Query API tool is now available. You can follow the same steps as for the first tool. Identify the tool with Code 3. The tool list will look like this:
[{'id': 'xg3bQfN', 'type': 'InlinePython', 'name': 'inline-tool'},
{'id': 'REDaiQN', 'type': 'Custom_agent_tool_toolbox_internet-search', 'name': 'Get Company Info'},
{'id': 'SOy7zKq', 'type': 'Custom_agent_tool_toolbox_dataset-lookup', 'name': 'Get Customer Info'},
{'id': 'rmESZYL', 'type': 'InlinePython', 'name': 'api_tool'}]
As before, you can now call your Query API tool with Code 6.
import dataiku
client = dataiku.api_client()
project = client.get_default_project()
tool = project.get_agent_tool('rmESZYL')
result = tool.run({"prompt": "Do you know Dataiku?"})
print(result['output'])
Yes, I am familiar with Dataiku. Dataiku is a data science and machine learning platform designed to help enterprises build and manage their data projects more efficiently.
It provides tools for data preparation, analytics, machine learning, and deployment in a collaborative environment.
Users can work with data using both a code-free, visual interface and through coding in languages like Python, R, and SQL. Dataiku is designed to empower data scientists, engineers, and analysts to work together on data-driven projects and make the process of developing and deploying models more streamlined and scalable.
Wrapping up#
Congratulations! You now know how to create an Inline Python Tool. You can use this to define and test your tool efficiently. As a possible next step, you can move your current code into a Code Agent. This will allow you to broaden the scope of your code's usage and make it available to other projects.
Reference documentation#
Classes#
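dataiku.Dataset | Provides a handle to obtain readers and writers on a dataiku Dataset.
dataiku.SQLExecutor2 | This is a handle to execute SQL statements on a given SQL connection.
dataiku.llm.agent_tools.BaseAgentTool | The base interface for a code-based agent tool
dataikuapi.DSSClient | Entry point for the DSS API client
dataikuapi.dss.project.DSSProject | A handle to interact with a project on the DSS instance.
dataikuapi.dss.webapp.DSSWebApp | A handle for the webapp.
dataikuapi.dss.webapp.DSSWebAppBackendClient | A client to interact by API with a standard webapp backend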
Functions#
api_client() | Obtain an API client to request the API of this DSS instance
get_agent_tool() | Get a handle to interact with a specific tool
get_default_project() | Get a handle to the current default project, if available
get_location_info() | Retrieve the location information of the dataset.
get_webapp() | Get a handle to interact with a specific webapp
list_webapps() | List the webapp heads of this project
query_to_iter() | This function returns a …