Using OpenAI-compatible API calls via the LLM Mesh#
An OpenAI-compatible Python client is available for the LLM Mesh. You can use it to send both Chat Completions API and Responses API requests through the same governed endpoint. Instead of handling separate API keys and endpoints for each provider, you can use the LLM Mesh to:
- Access multiple models using OpenAI's standard Python format
- Maintain centralized governance, monitoring, and cost control
- Switch easily between different LLM providers
Prerequisites#
Before starting, ensure you have:
- Dataiku >= 13.2 for the Chat Completions API examples
- Dataiku >= 14.4.3 for the Responses API examples
- A valid Dataiku API key
- Project permissions for "Read project content" and "Write project content"
- An existing OpenAI LLM Mesh connection
- A Python environment with the `openai` package installed
OpenAI client for the LLM Mesh#
Set up the OpenAI client by pointing it at your LLM Mesh configuration. You will need three pieces of information for access and authentication:
- The public Dataiku URL that exposes the LLM Mesh API
- A Dataiku API key
- The LLM ID
```python
from openai import OpenAI

# Specify the Dataiku OpenAI-compatible public API URL,
# e.g. http://my.dss/public/api/projects/PROJECT_KEY/llms/openai/v1/
BASE_URL = ""

# Use your Dataiku API key instead of an OpenAI secret
API_KEY = ""

# Fill in your LLM ID - to list the available LLM IDs, you can use
# dataiku.api_client().get_project("PROJECT_KEY").list_llms()
LLM_ID = ""

# Initialize the OpenAI client
client = OpenAI(
    base_url=BASE_URL,
    api_key=API_KEY,
)

# Default parameters
DEFAULT_TEMPERATURE = 0
DEFAULT_MAX_TOKENS = 500
```
Tip

To find the LLM ID, list the LLM Mesh entries configured on your project with the Dataiku client's `project.list_llms()` method, and note the ID of the OpenAI model you want to use. It will look something like `openai:CONNECTION-NAME:MODEL-NAME`.
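Since the LLM ID follows that `provider:connection:model` pattern, a small helper can make the parts explicit when you juggle several connections. This is an illustrative sketch: `parse_llm_id` and the example ID are not part of the Dataiku API.

```python
def parse_llm_id(llm_id: str) -> dict:
    """Split an LLM Mesh ID like 'openai:CONNECTION-NAME:MODEL-NAME'
    into its provider, connection, and model parts."""
    provider, connection, model = llm_id.split(":", 2)
    return {"provider": provider, "connection": connection, "model": model}


parts = parse_llm_id("openai:my-openai-conn:gpt-4o-mini")
print(parts["provider"])  # openai
print(parts["model"])     # gpt-4o-mini
```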
Choosing an API surface#
The LLM Mesh exposes two OpenAI-compatible endpoints from the same base URL:

- `client.chat.completions.create(...)` for the classic Chat Completions format
- `client.responses.create(...)` for OpenAI's Responses API
Use the Chat Completions API when you want the familiar messages=[...] format. Use the Responses API when you want the newer input format, typed content items, or event-based streaming. The OpenAI client automatically targets the matching LLM Mesh endpoint for the method you call.
Making requests to OpenAI via LLM Mesh#
Now you can make requests to the LLM just like you would with the standard OpenAI API:
```python
# Create a prompt
context = '''You are a capable ghost writer
who helps college applicants'''

content = '''Write a complete 350-word short essay
for a college application on the topic -
My first memories.'''

prompt = [
    {"role": "system", "content": context},
    {"role": "user", "content": content},
]

# Send the request
try:
    response = client.chat.completions.create(
        model=LLM_ID,
        messages=prompt,
        temperature=DEFAULT_TEMPERATURE,
        max_tokens=DEFAULT_MAX_TOKENS,
    )
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Error making request: {e}")
```
The same prompt sent through the Responses API:

```python
context = '''You are a capable ghost writer
who helps college applicants'''

content = '''Write a complete 350-word short essay
for a college application on the topic -
My first memories.'''

try:
    response = client.responses.create(
        model=LLM_ID,
        instructions=context,
        input=content,
        max_output_tokens=DEFAULT_MAX_TOKENS,
    )
    print(response.output_text)
except Exception as e:
    print(f"Error making request: {e}")
```
Note

The Responses API uses `input` instead of `messages`, returns generated text in `response.output_text`, and streams typed events instead of chat-completion delta chunks.
Using typed input with the Responses API#
For simple prompts, a string input is enough. For multimodal inputs or multi-turn conversations, pass a list of typed items instead:
```python
typed_input = [
    {
        "role": "user",
        "content": [
            {
                "type": "input_text",
                "text": "Summarize the three most important points about governed LLM access."
            }
        ]
    }
]

response = client.responses.create(
    model=LLM_ID,
    input=typed_input,
    max_output_tokens=DEFAULT_MAX_TOKENS,
)
print(response.output_text)
```
On follow-up, you can also pass prior `response.output` items back into the next `input` list, together with any `function_call_output` items you generate locally.
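One way to sketch that follow-up pattern is a helper that carries prior output items forward and appends the new user turn. This is illustrative: `extend_conversation` is not part of the API, and the `prior` item below is a stand-in dict for what a real `response.output` would contain.

```python
def extend_conversation(previous_output: list, user_text: str) -> list:
    """Build the next `input` list from a prior response's output items
    plus a new typed user turn."""
    next_input = list(previous_output)  # carry prior output items forward
    next_input.append({
        "role": "user",
        "content": [{"type": "input_text", "text": user_text}],
    })
    return next_input


# Stand-in for a prior response's output item:
prior = [{"type": "message", "role": "assistant",
          "content": [{"type": "output_text", "text": "Hello!"}]}]
next_input = extend_conversation(prior, "Now make it rhyme.")
# client.responses.create(model=LLM_ID, input=next_input, ...)
```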
Wrapping up#
Now that you have the basic setup working, you can:

- Experiment with both `client.chat.completions.create(...)` and `client.responses.create(...)`
- Use typed `input` items with the Responses API for multimodal prompts or tool-calling loops
- Try structured outputs with `client.responses.parse(...)` when your model and provider support them
- Use other LLM providers available through the LLM Mesh
- Learn more about the OpenAI-compatible setup in the LLM Mesh concept page
Streaming#
The Chat Completions API and the Responses API both support streaming through the same LLM Mesh endpoint, but the event shape differs:
- The Chat Completions API streams delta chunks
- The Responses API streams typed events such as `response.created`, `response.output_text.delta`, and `response.completed`
chat_completions_streaming.py

```python
print("📚 .. imports ... ")
print("🤖 .. Python client for OpenAI API calls ...")
from openai import OpenAI

print("⏱ .. library for timing ...")
import time

import httpx  # in case of self-signed certificates

print("\n\n")

# Specify the Dataiku OpenAI-compatible public API URL,
# e.g. http://my.dss/public/api/projects/PROJECT_KEY/llms/openai/v1/
BASE_URL = ""

# Use your Dataiku API key instead of an OpenAI secret
API_KEY = ""

# Fill in your LLM ID - to list the available LLM IDs, you can use
# dataiku.api_client().get_project("PROJECT_KEY").list_llms()
LLM_ID = ""

# Create an OpenAI client
open_client = OpenAI(
    base_url=BASE_URL,
    api_key=API_KEY,
    http_client=httpx.Client(verify=False),  # in case of self-signed certificates
)
print("🔑 .. client created, key set ...")

DEFAULT_TEMPERATURE = 0
DEFAULT_MAX_TOKENS = 1000

print("\n\n")

context = '''You are a capable ghost writer
who helps college applicants'''

content = '''Write a complete 500-word short essay
for a college application on the topic -
My first memories.'''

prompt = [
    {"role": "system", "content": context},
    {"role": "user", "content": content},
]

print(f"This is the prompt: {content}")
print("\n\n")

print("⏲ .. Record the time before the request is sent ..")
start_time = time.time()

print("📤 .. Send a ChatCompletion request ...")
response = open_client.chat.completions.create(
    model=LLM_ID,
    stream=True,
    messages=prompt,
    temperature=DEFAULT_TEMPERATURE,
    max_tokens=DEFAULT_MAX_TOKENS,
)

collected_chunks = []
collected_messages = []

# Iterate through the stream of events
for chunk in response:
    chunk_time = time.time() - start_time  # time elapsed since the request
    collected_chunks.append(chunk)  # save the event response
    if not chunk.choices:  # some providers send a final chunk with no choices
        continue
    chunk_message = chunk.choices[0].delta  # extract the message delta
    collected_messages.append(chunk_message)  # save the message
    if chunk_message.content:
        print(chunk_message.content, end="")

print("\n\n\n")

# Print the time delay and assemble the full text
print(f"Full response received {chunk_time:.2f} seconds after request")
full_reply_content = "".join(
    m.content for m in collected_messages if m.content is not None
)
```
responses_streaming.py

```python
from openai import OpenAI
import httpx  # Optional: useful for self-signed certificates

# Specify the Dataiku OpenAI-compatible public API URL,
# e.g. http://my.dss/public/api/projects/PROJECT_KEY/llms/openai/v1/
BASE_URL = ""

# Use your Dataiku API key instead of an OpenAI secret
API_KEY = ""

# Fill in your LLM ID - to list the available LLM IDs, you can use
# dataiku.api_client().get_project("PROJECT_KEY").list_llms()
LLM_ID = ""

client = OpenAI(
    base_url=BASE_URL,
    api_key=API_KEY,
    http_client=httpx.Client(verify=False),  # Optional: for self-signed certificates
)

stream = client.responses.create(
    model=LLM_ID,
    input="Write a short poem about governed AI platforms.",
    stream=True,
)

collected_text = []
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="")
        collected_text.append(event.delta)
    elif event.type == "response.completed":
        print("\n")

full_text = "".join(collected_text)
print(full_text)
```
Remember that all requests go through the LLM Mesh, which provides monitoring and governance capabilities while keeping the familiar OpenAI API interface.
