Using standard OpenAI API calls via the LLM Mesh#

The LLM Mesh exposes an OpenAI-compatible API, so you can use the standard OpenAI Python client to send chat completion requests through it. The LLM Mesh provides a governed way to access multiple providers: instead of handling separate API keys and endpoints for each provider, you can use the LLM Mesh to:

  • Access multiple models using OpenAI’s standard Python format

  • Maintain centralized governance, monitoring, and cost control

  • Switch easily between different LLM providers

Prerequisites#

Before starting, ensure you have:

  • Dataiku >= 13.2

  • A valid DSS API key

  • Project permissions for “Read project content” and “Write project content”

  • An existing OpenAI LLM Mesh connection

  • Python environment with the openai package installed (tested with version 1.3.0)

OpenAI client for the LLM Mesh#

Set up the OpenAI client by pointing it at the LLM Mesh. You will need several pieces of information for access and authentication:

  • A public DSS URL to access the LLM Mesh API

  • An API key for DSS

  • The LLM ID

from openai import OpenAI

# Specify the DSS OpenAI-compatible public API URL, e.g. http://my.dss/public/api/projects/PROJECT_KEY/llms/openai/v1/
BASE_URL = ""

# Use your DSS API key instead of an OpenAI secret
API_KEY = ""

# Fill in your LLM ID - to list the available LLM IDs, you can use dataiku.api_client().get_default_project().list_llms()
LLM_ID = ""

# Initialize the OpenAI client
client = OpenAI(
    base_url=BASE_URL,
    api_key=API_KEY
)

# Default parameters
DEFAULT_TEMPERATURE = 0
DEFAULT_MAX_TOKENS = 500

Tip

If you need to find the LLM ID, you can look up all the LLMs configured in the LLM Mesh using the dataiku client. Call the project’s list_llms() method and note down the ID of the OpenAI model you want to use. It will look something like openai:CONNECTION-NAME:MODEL-NAME, as in the sketch below.
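
For instance, here is a minimal sketch, meant to run inside DSS (where dataiku.api_client() and get_default_project() are available), that prints the ID and description of every LLM configured in the LLM Mesh:

import dataiku

# Look up all LLMs exposed through the LLM Mesh for the current project
project = dataiku.api_client().get_default_project()

for llm in project.list_llms():
    # IDs look like openai:CONNECTION-NAME:MODEL-NAME
    print(f"{llm.id} - {llm.description}")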

Making requests to OpenAI via the LLM Mesh#

Now you can make requests to the LLM just like you would with the standard OpenAI API:


# Create a prompt

context = '''You are a capable ghost writer 
  who helps college applicants'''

content = '''Write a complete 350-word short essay 
  for a college application on the topic - 
  My first memories.'''


prompt = [
    {"role": "system", "content": context},
    {"role": "user", "content": content}
]

# Send the request
try:
    response = client.chat.completions.create(
        model=LLM_ID,
        messages=prompt,
        temperature=DEFAULT_TEMPERATURE,
        max_tokens=DEFAULT_MAX_TOKENS
    )
    
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Error making request: {e}")

Wrapping up#

Now that you have the basic setup working, you can:

  • Experiment with different prompts and with the parameters available on the chat completions endpoint

  • Use other LLM providers available through the LLM Mesh (see the sketch after this list)

  • Try streaming longer responses by using the stream parameter, as shown in Code 1
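
Switching providers is just a matter of changing the LLM ID passed as model. Here is a minimal sketch that reuses the client and prompt defined earlier; the connection and model names below are hypothetical, so list the real IDs with project.list_llms() first:

# Hypothetical LLM Mesh IDs - replace them with IDs from project.list_llms()
OPENAI_LLM_ID = "openai:my-openai-connection:gpt-4o-mini"
ANTHROPIC_LLM_ID = "anthropic:my-anthropic-connection:claude-3-haiku"

for llm_id in (OPENAI_LLM_ID, ANTHROPIC_LLM_ID):
    # Same OpenAI client, same call: the LLM Mesh routes each request
    # to the provider encoded in the LLM ID
    response = client.chat.completions.create(
        model=llm_id,
        messages=prompt,
        temperature=DEFAULT_TEMPERATURE,
        max_tokens=DEFAULT_MAX_TOKENS
    )
    print(f"--- {llm_id} ---")
    print(response.choices[0].message.content)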

streaming.py
Code 1 – Longer code block with streaming example#
print("📚 .. imports ... ")
print("🤖 .. Python client for OpenAI API calls ...")
from openai import OpenAI

print("⏱ .. library for timing ...")
import time
import httpx # in case of self-signed certificates
print("\n\n")

# Specify the DSS OpenAI-compatible public API URL, e.g. http://my.dss/public/api/projects/PROJECT_KEY/llms/openai/v1/
BASE_URL = ""

# Use your DSS API key instead of an OpenAI secret
API_KEY = ""

# Fill in your LLM ID - to list the available LLM IDs, you can use dataiku.api_client().get_default_project().list_llms()
LLM_ID = ""

# Create an OpenAI client
open_client = OpenAI(
  base_url=BASE_URL,
  api_key=API_KEY,
  http_client=httpx.Client(verify=False)  # in case of self-signed certificates
)

print("🔑 .. client created, key set ...")


DEFAULT_TEMPERATURE = 0
DEFAULT_MAX_TOKENS = 1000

print("\n\n")

context = '''You are a capable ghost writer 
  who helps college applicants'''

content = '''Write a complete 500-word short essay 
  for a college application on the topic - 
  My first memories.'''

prompt = [
    {"role": "system", "content": context},
    {"role": "user", "content": content}
]


print(f"This is the prompt: {content}")
print("\n\n")  

print("⏲ .. Record the time before the request is sent ..")
start_time = time.time()

print("📤 .. Send a ChatCompletion request ...")
response = open_client.chat.completions.create(
    model=LLM_ID,
    stream=True,
    messages=prompt,
    temperature=DEFAULT_TEMPERATURE,
    max_tokens=DEFAULT_MAX_TOKENS
)


collected_chunks = []
collected_messages = []

# iterate through the stream of events
for chunk in response:
    chunk_time = time.time() - start_time  # calculate the time delay of the chunk
    collected_chunks.append(chunk)  # save the event response
    chunk_message = chunk.choices[0].delta  # extract the message
    collected_messages.append(chunk_message)  # save the message
    if chunk_message.content is not None:  # skip role-only and final empty deltas
        print(chunk_message.content, end="", flush=True)

print("\n\n\n")  

# print the time delay and reassemble the full reply from the collected chunks
print(f"Full response received {chunk_time:.2f} seconds after request")
full_reply_content = "".join(m.content for m in collected_messages if m.content is not None)
print(f"Full reply length: {len(full_reply_content)} characters")

Remember that all requests go through the LLM Mesh, which provides monitoring and governance capabilities while maintaining the familiar OpenAI API interface.