LLM Mesh#

For usage information and examples, please see LLM Mesh

class dataikuapi.dss.llm.DSSLLM(client, project_key, llm_id)#

A handle to interact with a DSS-managed LLM.

Important

Do not create this class directly, use dataikuapi.dss.project.DSSProject.get_llm() instead.

new_completion()#

Create a new completion query.

Returns:

A handle on the generated completion query.

Return type:

DSSLLMCompletionQuery
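A minimal sketch of a single completion round trip, using only members documented on this page (`new_completion()`, `with_message()`, `execute()`, and the `success` and `text` properties). The helper name `ask` and the LLM id in the usage comment are hypothetical.

```python
def ask(llm, prompt, system_prompt=None):
    """Send one prompt to a DSSLLM handle and return the response text.

    `llm` is expected to be a DSSLLM obtained from DSSProject.get_llm().
    """
    completion = llm.new_completion()
    if system_prompt is not None:
        # The "system" role sets the LLM behavior.
        completion.with_message(system_prompt, role="system")
    completion.with_message(prompt, role="user")

    response = completion.execute()
    if not response.success:
        raise RuntimeError("The completion query did not succeed")
    return response.text


# Usage (assumes an existing DSSProject handle `project`; the id is hypothetical):
# llm = project.get_llm("my-llm-id")
# print(ask(llm, "Summarize LLM Mesh in one sentence.", system_prompt="Be concise."))
```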

new_completions()#

Create a new multi-completion query.

Returns:

A handle on the generated multi-completion query.

Return type:

DSSLLMCompletionsQuery
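The following sketch batches several prompts into one multi-completion query. It assumes that each call to `new_completion()` on the multi-completion query registers one single query in the batch, and that each entry of `responses` exposes a `text` property like `DSSLLMCompletionResponse`; neither is stated explicitly on this page. The helper name `ask_many` is hypothetical.

```python
def ask_many(llm, prompts):
    """Send several prompts in one multi-completion query and return the texts."""
    completions = llm.new_completions()
    for prompt in prompts:
        # Assumption: each new_completion() call adds one query to the batch.
        single = completions.new_completion()
        single.with_message(prompt)

    response = completions.execute()
    # Assumption: each item behaves like a DSSLLMCompletionResponse.
    return [item.text for item in response.responses]


# Usage (assumes `llm` is a DSSLLM handle):
# for text in ask_many(llm, ["Tell me a joke in English", "Tell me a joke in French"]):
#     print(text)
```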

new_embeddings(text_overflow_mode='FAIL')#

Create a new embedding query.

Parameters:

text_overflow_mode (str) – How to handle longer texts than what the model supports. Either ‘TRUNCATE’ or ‘FAIL’.

Returns:

A handle on the generated embeddings query.

Return type:

DSSLLMEmbeddingsQuery
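As an illustration, an embedding query can be built and executed as sketched below, using `new_embeddings()`, `add_text()`, `execute()` and `get_embeddings()` as documented on this page. The helper name `embed_texts` is hypothetical.

```python
def embed_texts(llm, texts, text_overflow_mode="FAIL"):
    """Return one embedding vector per input text.

    text_overflow_mode is either "TRUNCATE" or "FAIL".
    """
    query = llm.new_embeddings(text_overflow_mode=text_overflow_mode)
    for text in texts:
        query.add_text(text)
    response = query.execute()
    # A list of lists containing all embedding vectors.
    return response.get_embeddings()


# Usage (assumes `llm` is a DSSLLM handle for an embedding model):
# vectors = embed_texts(llm, ["first text", "second text"], text_overflow_mode="TRUNCATE")
# print(len(vectors), len(vectors[0]))
```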

new_images_generation()#

Create a new image generation query.

Return type:

DSSLLMImageGenerationQuery

as_langchain_llm(**data)#

Create a langchain-compatible LLM object for this LLM.

Returns:

A langchain-compatible LLM object.

Return type:

dataikuapi.dss.langchain.llm.DKULLM

as_langchain_chat_model(**data)#

Create a langchain-compatible chat LLM object for this LLM.

Returns:

A langchain-compatible LLM object.

Return type:

dataikuapi.dss.langchain.llm.DKUChatModel

as_langchain_embeddings(**data)#

Create a langchain-compatible embeddings object for this LLM.

Returns:

A langchain-compatible embeddings object.

Return type:

dataikuapi.dss.langchain.embeddings.DKUEmbeddings

class dataikuapi.dss.llm.DSSLLMListItem(client, project_key, data)#

An item in a list of LLMs.

Important

Do not instantiate this class directly, instead use dataikuapi.dss.project.DSSProject.list_llms().

to_llm()#

Convert the current item.

Returns:

A handle for the LLM.

Return type:

dataikuapi.dss.llm.DSSLLM

property id#
Returns:

The id of the LLM.

Return type:

string

property type#
Returns:

The type of the LLM

Return type:

string

property description#
Returns:

The description of the LLM

Return type:

string
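A short sketch of how list items might be inspected and converted into handles, using `list_llms()`, the `id`, `type` and `description` properties, and `to_llm()`. The helper names are hypothetical, and the valid values of `type` are not listed on this page, so the caller supplies the one to match.

```python
def describe_llms(project):
    """Return a plain summary of every LLM listed for the project."""
    return [
        {"id": item.id, "type": item.type, "description": item.description}
        for item in project.list_llms()
    ]


def first_llm_of_type(project, llm_type):
    """Return a DSSLLM handle for the first listed LLM of the given type, or None."""
    for item in project.list_llms():
        if item.type == llm_type:
            return item.to_llm()
    return None


# Usage (assumes an existing DSSProject handle `project`):
# for entry in describe_llms(project):
#     print(entry["id"], "-", entry["description"])
```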

class dataikuapi.dss.llm.DSSLLMCompletionsQuery(llm)#

A handle to interact with a multi-completion query. Completion queries allow you to send a prompt to a DSS-managed LLM and retrieve its response.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLM.new_completions() instead.

property settings#
Returns:

The completion query settings.

Return type:

dict

new_completion()#

execute()#

Run the completions query and retrieve the LLM response.

Returns:

The LLM response.

Return type:

DSSLLMCompletionsResponse

with_json_output(schema=None, strict=None, compatible=None)#

Request the model to generate a valid JSON response, for models that support it.

Note that some models also require you to explicitly request JSON output in the user or system prompt.

Caution

JSON output support is experimental for locally-running Hugging Face models.

Parameters:
  • schema (dict) – (optional) If specified, request the model to produce a JSON response that adheres to the provided schema. Support varies across models/providers.

  • strict (bool) – (optional) If a schema is provided, whether to strictly enforce it. Support varies across models/providers.

  • compatible (bool) – (optional) Allow DSS to modify the schema in order to increase compatibility, depending on known limitations of the model/provider. Defaults to automatic.

with_structured_output(model_type, strict=None, compatible=None)#

Instruct the model to generate a response as an instance of a specified Pydantic model.

This functionality depends on with_json_output and necessitates that the model supports JSON output with a schema.

Caution

Structured output support is experimental for locally-running Hugging Face models.

Parameters:
  • model_type (pydantic.BaseModel) – A Pydantic model class used for structuring the response.

  • strict (bool) – (optional) see with_json_output()

  • compatible (bool) – (optional) see with_json_output()

class dataikuapi.dss.llm.DSSLLMCompletionsQuerySingleQuery#

new_multipart_message(role='user')#

Start adding a multipart-message to the completion query.

Use this to add image parts to the message.

Parameters:

role (str) – The message role. Use system to set the LLM behavior, assistant to store predefined responses, user to provide requests or comments for the LLM to respond to. Defaults to user.

Return type:

DSSLLMCompletionQueryMultipartMessage

with_message(message, role='user')#

Add a message to the completion query.

Parameters:
  • message (str) – The message text.

  • role (str) – The message role. Use system to set the LLM behavior, assistant to store predefined responses, user to provide requests or comments for the LLM to respond to. Defaults to user.

with_tool_calls(tool_calls, role='assistant')#

Add tool calls to the completion query.

Parameters:
  • tool_calls (list[dict]) – Calls to tools that the LLM requested to use.

  • role (str) – The message role. Defaults to assistant.

with_tool_output(tool_output, tool_call_id, role='tool')#

Add a tool message to the completion query.

Parameters:
  • tool_output (str) – The tool output, as a string.

  • tool_call_id (str) – The tool call id, as provided by the LLM in the conversation messages.

  • role (str) – The message role. Defaults to tool.

class dataikuapi.dss.llm.DSSLLMCompletionsResponse(raw_resp, response_parser=None)#

A handle to interact with a multi-completion response.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLMCompletionsQuery.execute() instead.

property responses#

The array of responses

class dataikuapi.dss.llm.DSSLLMCompletionQuery(llm)#

A handle to interact with a completion query. Completion queries allow you to send a prompt to a DSS-managed LLM and retrieve its response.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLM.new_completion() instead.

property settings#
Returns:

The completion query settings.

Return type:

dict

execute()#

Run the completion query and retrieve the LLM response.

Returns:

The LLM response.

Return type:

DSSLLMCompletionResponse

execute_streamed()#

Run the completion query and retrieve the LLM response as streamed chunks.

Returns:

An iterator over the LLM response chunks

Return type:

Iterator[Union[DSSLLMStreamedCompletionChunk, DSSLLMStreamedCompletionFooter]]
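A hedged sketch of consuming the streamed response. The chunk and footer classes are not documented on this page; the code assumes that both carry a dict payload in a `data` attribute and that text chunks store their text under a `"text"` key. The helper name `stream_text` is hypothetical.

```python
def stream_text(completion, on_text=print):
    """Run a completion query in streaming mode and return the full text.

    `on_text` is called with each piece of text as soon as it arrives.
    """
    parts = []
    for chunk in completion.execute_streamed():
        # Assumption: chunks and the footer expose a dict in `.data`;
        # only text chunks are assumed to have a "text" entry.
        data = getattr(chunk, "data", None) or {}
        text = data.get("text")
        if text:
            on_text(text)
            parts.append(text)
    return "".join(parts)


# Usage (assumes `llm` is a DSSLLM handle):
# completion = llm.new_completion()
# completion.with_message("Explain photosynthesis in a few words")
# full_text = stream_text(completion)
```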

new_multipart_message(role='user')#

Start adding a multipart-message to the completion query.

Use this to add image parts to the message.

Parameters:

role (str) – The message role. Use system to set the LLM behavior, assistant to store predefined responses, user to provide requests or comments for the LLM to respond to. Defaults to user.

Return type:

DSSLLMCompletionQueryMultipartMessage

with_json_output(schema=None, strict=None, compatible=None)#

Request the model to generate a valid JSON response, for models that support it.

Note that some models also require you to explicitly request JSON output in the user or system prompt.

Caution

JSON output support is experimental for locally-running Hugging Face models.

Parameters:
  • schema (dict) – (optional) If specified, request the model to produce a JSON response that adheres to the provided schema. Support varies across models/providers.

  • strict (bool) – (optional) If a schema is provided, whether to strictly enforce it. Support varies across models/providers.

  • compatible (bool) – (optional) Allow DSS to modify the schema in order to increase compatibility, depending on known limitations of the model/provider. Defaults to automatic.
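The parameters above could be combined as in the following sketch, which requests JSON output and reads the parsed result from the response's `json` property. The helper name `ask_json` and the example schema are hypothetical, and the schema is assumed to follow JSON Schema conventions, which this page does not state. A system message also asks for JSON explicitly, since some models require it.

```python
# Hypothetical schema, assumed to follow JSON Schema conventions.
SENTIMENT_SCHEMA = {
    "type": "object",
    "properties": {"sentiment": {"type": "string"}},
    "required": ["sentiment"],
}


def ask_json(llm, prompt, schema=None, strict=None):
    """Request a JSON response and return it parsed as a Python object."""
    completion = llm.new_completion()
    completion.with_json_output(schema=schema, strict=strict)
    # Some models need the JSON request to be repeated in the prompt itself.
    completion.with_message("Respond with a single JSON object.", role="system")
    completion.with_message(prompt)

    response = completion.execute()
    if not response.success:
        raise RuntimeError("The completion query did not succeed")
    return response.json


# Usage (assumes `llm` is a DSSLLM handle for a model that supports JSON output):
# result = ask_json(llm, "Classify: 'I love this product'", schema=SENTIMENT_SCHEMA)
```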

with_message(message, role='user')#

Add a message to the completion query.

Parameters:
  • message (str) – The message text.

  • role (str) – The message role. Use system to set the LLM behavior, assistant to store predefined responses, user to provide requests or comments for the LLM to respond to. Defaults to user.

with_structured_output(model_type, strict=None, compatible=None)#

Instruct the model to generate a response as an instance of a specified Pydantic model.

This functionality depends on with_json_output and necessitates that the model supports JSON output with a schema.

Caution

Structured output support is experimental for locally-running Hugging Face models.

Parameters:
  • model_type (pydantic.BaseModel) – A Pydantic model class used for structuring the response.

  • strict (bool) – (optional) see with_json_output()

  • compatible (bool) – (optional) see with_json_output()

with_tool_calls(tool_calls, role='assistant')#

Add tool calls to the completion query.

Parameters:
  • tool_calls (list[dict]) – Calls to tools that the LLM requested to use.

  • role (str) – The message role. Defaults to assistant.

with_tool_output(tool_output, tool_call_id, role='tool')#

Add a tool message to the completion query.

Parameters:
  • tool_output (str) – The tool output, as a string.

  • tool_call_id (str) – The tool call id, as provided by the LLM in the conversation messages.

  • role (str) – The message role. Defaults to tool.
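A sketch of one tool round trip built from `with_tool_calls()` and `with_tool_output()`. It assumes that `completion` is the same query object that produced `response`, and that each tool call dict carries its identifier under an `"id"` key; the layout of tool call dicts is not documented on this page. The helper name `answer_tool_calls` and the `run_tool` callback are hypothetical.

```python
def answer_tool_calls(completion, response, run_tool):
    """Feed tool results back to the LLM and return its follow-up response.

    `run_tool` is a caller-supplied function that takes one tool call dict
    and returns the tool output.
    """
    tool_calls = response.tool_calls
    if not tool_calls:
        # The LLM did not request any tool: nothing to replay.
        return response

    # Record what the LLM asked for, then answer each call in turn.
    completion.with_tool_calls(tool_calls)
    for call in tool_calls:
        output = run_tool(call)
        # Assumption: the call id is stored under the "id" key.
        completion.with_tool_output(str(output), tool_call_id=call["id"])
    return completion.execute()
```

The tool output is converted with `str()` because `with_tool_output()` expects a string.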

class dataikuapi.dss.llm.DSSLLMCompletionQueryMultipartMessage(q, role)#

with_text(text)#

Add a text part to the multipart message

with_inline_image(image, mime_type=None)#

Add an image part to the multipart message

Parameters:
  • image (Union[str, bytes]) – The image, as str or bytes.

  • mime_type (str) – The MIME type of the image, or None to use the default.

add()#

Add this message to the completion query
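A minimal sketch of a multipart message that combines a text part and an image part. The calls are made one by one rather than chained, because this page does not state what `with_text()` and `with_inline_image()` return. The helper name `ask_about_image` is hypothetical.

```python
def ask_about_image(llm, question, image_bytes, mime_type=None):
    """Ask a question about an image and return the response text."""
    completion = llm.new_completion()

    message = completion.new_multipart_message(role="user")
    message.with_text(question)
    message.with_inline_image(image_bytes, mime_type=mime_type)
    # Nothing is attached to the query until add() is called.
    message.add()

    return completion.execute().text


# Usage (assumes `llm` is a DSSLLM handle for a model that accepts images;
# the file name is hypothetical):
# with open("chart.png", "rb") as f:
#     print(ask_about_image(llm, "What does this chart show?", f.read()))
```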

class dataikuapi.dss.llm.DSSLLMCompletionResponse(raw_resp=None, text=None, finish_reason=None, response_parser=None, trace=None)#

Response to a completion

property json#
Returns:

LLM response parsed as a JSON object

property parsed#

property success#
Returns:

The outcome of the completion query.

Return type:

bool

property text#
Returns:

The raw text of the LLM response.

Return type:

Union[str, None]

property tool_calls#
Returns:

The tool calls of the LLM response.

Return type:

Union[list, None]

property log_probs#
Returns:

The log probs of the LLM response.

Return type:

Union[list, None]

property trace#

class dataikuapi.dss.llm.DSSLLMEmbeddingsQuery(llm, text_overflow_mode)#

A handle to interact with an embedding query. Embedding queries allow you to transform text into embedding vectors using a DSS-managed model.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLM.new_embeddings() instead.

add_text(text)#

Add text to the embedding query.

Parameters:

text (str) – Text to add to the query.

add_image(image)#

Add an image to the embedding query.

Parameters:

image – Image content as bytes or str (base64)

execute()#

Run the embedding query.

Returns:

The results of the embedding query.

Return type:

DSSLLMEmbeddingsResponse

class dataikuapi.dss.llm.DSSLLMEmbeddingsResponse(raw_resp)#

A handle to interact with an embedding query result.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLMEmbeddingsQuery.execute() instead.

get_embeddings()#

Retrieve vectors resulting from the embeddings query.

Returns:

A list of lists containing all embedding vectors.

Return type:

list

class dataikuapi.dss.llm.DSSLLMImageGenerationQuery(llm)#

A handle to interact with an image generation query.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLM.new_images_generation() instead.

with_prompt(prompt, weight=None)#

Add a prompt to the image generation query.

Parameters:
  • prompt (str) – The prompt text.

  • weight (float) – Optional weight between 0 and 1 for the prompt.

with_negative_prompt(prompt, weight=None)#

Add a negative prompt to the image generation query.

Parameters:
  • prompt (str) – The prompt text.

  • weight (float) – Optional weight between 0 and 1 for the negative prompt.

with_original_image(image, mode=None, weight=None)#

Add an image to the generation query.

To edit specific pixels of the original image, apply a mask by calling with_mask():

>>> query.with_original_image(image, mode="INPAINTING") # replace the pixels using a mask

To edit an image:

>>> query.with_original_image(image, mode="MASK_FREE") # edit the original image according to the prompt
>>> query.with_original_image(image, mode="VARY") # generates a variation of the original image

Parameters:
  • image (Union[str, bytes]) – The original image, as a base64 str or bytes.

  • mode (str) – The editing mode. Mode support varies across models/providers.

  • weight (float) – The original image weight between 0 and 1.

with_mask(mode, image=None)#

Add an editing mask to the generation query. Call this method alongside with_original_image().

To edit parts of the image using a black mask (replace the black pixels):

>>> query.with_mask("MASK_IMAGE_BLACK", image=black_mask)

To edit parts of the image that are transparent (replace the transparent pixels):

>>> query.with_mask("ORIGINAL_IMAGE_ALPHA")

Parameters:
  • mode (str) – The mask mode. Mode support varies across models/providers.

  • image (Union[str, bytes]) – The mask image to apply when editing the image, as a base64 str or bytes.

property height#
Returns:

The generated image height in pixels.

Return type:

Optional[int]

property width#
Returns:

The generated image width in pixels.

Return type:

Optional[int]

property fidelity#
Returns:

From 0.0 to 1.0, how strongly to adhere to prompt.

Return type:

Optional[float]

property quality#
Returns:

Quality of the image to generate. Valid values depend on the targeted model.

Return type:

Optional[str]

property seed#
Returns:

Seed of the image to generate, gives deterministic results when set.

Return type:

Optional[int]

property style#
Returns:

Style of the image to generate. Valid values depend on the targeted model.

Return type:

Optional[str]

property images_to_generate#
Returns:

Number of images to generate per query. Valid values depend on the targeted model.

Return type:

Optional[int]

property aspect_ratio#
Returns:

The width/height aspect ratio or None if either is not set.

Return type:

Optional[float]

execute()#

Executes the image generation

Return type:

DSSLLMImageGenerationResponse

class dataikuapi.dss.llm.DSSLLMImageGenerationResponse(raw_resp)#

A handle to interact with an image generation response.

Important

Do not create this class directly, use dataikuapi.dss.llm.DSSLLMImageGenerationQuery.execute() instead.

property success#
Returns:

The outcome of the image generation query.

Return type:

bool

first_image(as_type='bytes')#
Parameters:

as_type (str) – The type of image to return: ‘bytes’ for bytes, or ‘str’ for a base64 str.

Returns:

The first generated image as bytes or str depending on the as_type parameter.

Return type:

Union[bytes,str]

get_images(as_type='bytes')#
Parameters:

as_type (str) – The type of images to return: ‘bytes’ for bytes, or ‘str’ for base64 str.

Returns:

The generated images as bytes or str depending on the as_type parameter.

Return type:

Union[List[bytes], List[str]]

property images#
Returns:

The generated images in bytes format.

Return type:

List[bytes]
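As an illustration of the query and response classes together, the following sketch generates one image from a prompt and optionally saves it. It only uses `new_images_generation()`, `with_prompt()`, `with_negative_prompt()`, `execute()`, `success` and `first_image()`. The helper name `generate_image` and the file name in the usage comment are hypothetical.

```python
def generate_image(llm, prompt, negative_prompt=None, output_path=None):
    """Generate an image and return it as bytes, optionally writing it to a file."""
    query = llm.new_images_generation()
    query.with_prompt(prompt)
    if negative_prompt is not None:
        query.with_negative_prompt(negative_prompt)

    response = query.execute()
    if not response.success:
        raise RuntimeError("The image generation query did not succeed")

    image = response.first_image(as_type="bytes")
    if output_path is not None:
        with open(output_path, "wb") as f:
            f.write(image)
    return image


# Usage (assumes `llm` is a DSSLLM handle for an image generation model):
# generate_image(llm, "A lighthouse at dusk", negative_prompt="blurry", output_path="lighthouse.png")
```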

class dataikuapi.dss.knowledgebank.DSSKnowledgeBankListItem(client, data)#

An item in a list of knowledge banks.

Important

Do not instantiate this class directly, instead use dataikuapi.dss.project.DSSProject.list_knowledge_banks().

to_knowledge_bank()#

Convert the current item.

Returns:

A handle for the knowledge_bank.

Return type:

dataikuapi.dss.knowledgebank.DSSKnowledgeBank

as_core_knowledge_bank()#

Get the dataiku.KnowledgeBank object corresponding to this knowledge bank

Return type:

dataiku.KnowledgeBank

property project_key#
Returns:

The project key.

Return type:

string

property id#
Returns:

The id of the knowledge bank.

Return type:

string

property name#
Returns:

The name of the knowledge bank.

Return type:

string

class dataikuapi.dss.knowledgebank.DSSKnowledgeBank(client, project_key, id)#

A handle to interact with a DSS-managed knowledge bank.

Important

Do not create this class directly, use dataikuapi.dss.project.DSSProject.get_knowledge_bank() instead.

as_core_knowledge_bank()#

Get the dataiku.KnowledgeBank object corresponding to this knowledge bank

Return type:

dataiku.KnowledgeBank

class dataiku.KnowledgeBank(id, project_key=None)#

This is a handle to interact with a Dataiku Knowledge Bank flow object

as_langchain_retriever(search_type='similarity', search_kwargs=None, **retriever_args)#

Get this Knowledge bank as a Langchain Retriever object

as_langchain_vectorstore()#

Get this Knowledge bank as a Langchain Vectorstore object
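A hedged sketch of querying a knowledge bank through its LangChain retriever. It assumes the returned object follows the usual LangChain retriever interface (`invoke()` returning documents with a `page_content` attribute) and that `"k"` is accepted in `search_kwargs`, as is common for LangChain vector store retrievers; neither is stated on this page. The helper name `retrieve_passages` is hypothetical.

```python
def retrieve_passages(knowledge_bank, query, k=3):
    """Return the text of the passages retrieved for a query.

    `knowledge_bank` is expected to be a dataiku.KnowledgeBank, for example
    obtained through DSSKnowledgeBank.as_core_knowledge_bank().
    """
    retriever = knowledge_bank.as_langchain_retriever(
        search_type="similarity",
        search_kwargs={"k": k},  # assumption: standard LangChain key
    )
    documents = retriever.invoke(query)
    return [document.page_content for document in documents]


# Usage (assumes a DSSProject handle `project`; the id is hypothetical):
# kb = project.get_knowledge_bank("my-kb-id").as_core_knowledge_bank()
# for passage in retrieve_passages(kb, "How do I reset my password?"):
#     print(passage)
```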

class dataikuapi.dss.langchain.DKULLM(*args: Any, **kwargs: Any)#

Langchain-compatible wrapper around Dataiku-mediated LLMs

Note

Direct instantiation of this class is possible from within DSS, though it’s recommended to instead use dataikuapi.dss.llm.DSSLLM.as_langchain_llm().

Example:

# dkullm is a DSSLLM handle, e.g. obtained from DSSProject.get_llm()
llm = dkullm.as_langchain_llm()

# single prompt
print(llm.invoke("tell me a joke"))

# multiple prompts with batching
for response in llm.batch(["tell me a joke in English", "tell me a joke in French"]):
    print(response)

# streaming, with stop sequence
for chunk in llm.stream("Explain photosynthesis in a few words in English then French", stop=["dioxyde de"]):
    print(chunk, end="", flush=True)

llm_id: str#

LLM identifier to use

max_tokens: int = 1024#

Denotes the number of tokens to predict per generation.

temperature: float = 0#

A non-negative float that tunes the degree of randomness in generation.

top_k: int = None#

Number of tokens to pick from when sampling.

top_p: float = None#

Sample from the top tokens whose probabilities add up to p.

class dataikuapi.dss.langchain.DKUChatModel(*args: Any, **kwargs: Any)#

Langchain-compatible wrapper around Dataiku-mediated chat LLMs

Note

Direct instantiation of this class is possible from within DSS, though it’s recommended to instead use dataikuapi.dss.llm.DSSLLM.as_langchain_chat_model().

Example:

from langchain_core.prompts import ChatPromptTemplate

# dkullm is a DSSLLM handle, e.g. obtained from DSSProject.get_llm()
llm = dkullm.as_langchain_chat_model()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
chain = prompt | llm
for chunk in chain.stream({"topic": "parrot"}):
    print(chunk.content, end="", flush=True)

llm_id: str#

LLM identifier to use

max_tokens: int = 1024#

Denotes the number of tokens to predict per generation.

temperature: float = 0#

A non-negative float that tunes the degree of randomness in generation.

top_k: int = None#

Number of tokens to pick from when sampling.

top_p: float = None#

Sample from the top tokens whose probabilities add up to p.

bind_tools(tools: Sequence[Dict[str, Any] | Type[pydantic.BaseModel] | Callable | langchain_core.tools.BaseTool], tool_choice: dict | str | Literal['auto', 'none', 'required', 'any'] | bool | None = None, strict: bool | None = None, compatible: bool | None = None, **kwargs: Any)#

Bind tool-like objects to this chat model.

Args:
tools: A list of tool definitions to bind to this chat model.

Can be a dictionary, pydantic model, callable, or BaseTool. Pydantic models, callables, and BaseTools will be automatically converted to their schema dictionary representation.

tool_choice: Which tool to request the model to call.
Options are:
  • name of the tool (str): call the corresponding tool;

  • “auto”: automatically select a tool (or no tool);

  • “none”: do not call a tool;

  • “any” or “required”: force at least one tool call;

  • True: call the one given tool (requires tools to be of length 1);

  • a dict of the form: {“type”: “tool_name”, “name”: “<<tool_name>>”}, or {“type”: “required”}, or {“type”: “any”} or {“type”: “none”}, or {“type”: “auto”};

strict: If specified, request the model to produce a JSON tool call that adheres to the provided schema. Support varies across models/providers.

compatible: Allow DSS to modify the schema in order to increase compatibility, depending on known limitations of the model/provider. Defaults to automatic.

kwargs: Any additional parameters to bind.
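A sketch of binding a plain Python callable as a tool. It assumes that the bound model behaves like a standard LangChain chat model, so that `invoke()` returns a message whose `tool_calls` attribute is a list of dicts with `"name"` and `"args"` keys. The `get_weather` function and the helper name `requested_tool_calls` are hypothetical.

```python
def get_weather(city: str) -> str:
    """Return a short weather report for a city."""
    # Placeholder implementation for illustration only.
    return f"Sunny in {city}"


def requested_tool_calls(chat_model, question):
    """Bind get_weather to the chat model and list the tool calls it requests."""
    bound = chat_model.bind_tools([get_weather], tool_choice="auto")
    message = bound.invoke(question)
    # Assumption: LangChain-style tool calls, each with "name" and "args".
    return [(call["name"], call["args"]) for call in (message.tool_calls or [])]


# Usage (assumes `dkullm` is a DSSLLM handle for a model that supports tools):
# chat_model = dkullm.as_langchain_chat_model()
# print(requested_tool_calls(chat_model, "What is the weather in Paris?"))
```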

class dataikuapi.dss.langchain.DKUEmbeddings(*args: Any, **kwargs: Any)#

Langchain-compatible wrapper around Dataiku-mediated embedding LLMs

Note

Direct instantiation of this class is possible from within DSS, though it’s recommended to instead use dataikuapi.dss.llm.DSSLLM.as_langchain_embeddings().

llm_id: str#

LLM identifier to use

embed_documents(texts: List[str]) → List[List[float]]#

Call out to Dataiku-mediated LLM

Args:

texts: The list of texts to embed.

Returns:

List of embeddings, one for each text.
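To show how `embed_documents()` and `embed_query()` fit together, the sketch below ranks documents by cosine similarity to a query. The similarity computation is a generic technique written in plain Python, not part of the Dataiku API, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(embeddings, query, documents):
    """Return (score, document) pairs sorted from most to least similar."""
    query_vector = embeddings.embed_query(query)
    document_vectors = embeddings.embed_documents(documents)
    scored = [
        (cosine_similarity(query_vector, vector), document)
        for vector, document in zip(document_vectors, documents)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Usage (assumes `dkullm` is a DSSLLM handle for an embedding model):
# embeddings = dkullm.as_langchain_embeddings()
# for score, doc in rank_documents(embeddings, "pets", ["cats and dogs", "tax law"]):
#     print(round(score, 3), doc)
```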

async aembed_documents(texts: List[str]) → List[List[float]]#

embed_query(text: str) → List[float]#

async aembed_query(text: str) → List[float]#