LLM Mesh Integrations
For the overall proposed structure, see LLM Mesh.
LangChain adapters
- class dataikuapi.dss.langchain.DKULLM(*args: Any, **kwargs: Any)
Langchain-compatible wrapper around Dataiku-mediated LLMs
Note
Direct instantiation of this class is possible from within DSS, though it’s recommended to instead use dataikuapi.dss.llm.DSSLLM.as_langchain_llm().
Example:
    llm = dkullm.as_langchain_llm()

    # single prompt
    print(llm.invoke("tell me a joke"))

    # multiple prompts with batching
    for response in llm.batch(["tell me a joke in English", "tell me a joke in French"]):
        print(response)

    # streaming, with stop sequence
    for chunk in llm.stream("Explain photosynthesis in a few words in English then French", stop=["dioxyde de"]):
        print(chunk, end="", flush=True)
- llm_id: str
LLM identifier to use
- max_tokens: int = None
Denotes the number of tokens to predict per generation. Deprecated: use key “maxOutputTokens” in field “completion_settings”.
- temperature: float = None
A non-negative float that tunes the degree of randomness in generation. Deprecated: use key “temperature” in field “completion_settings”.
- top_k: int = None
Number of tokens to pick from when sampling. Deprecated: use key “topK” in field “completion_settings”.
- top_p: float = None
Sample from the top tokens whose probabilities add up to p. Deprecated: use key “topP” in field “completion_settings”.
- completion_settings: dict = {}
Settings applied to completion queries. All keys are optional and can include: maxOutputTokens, temperature, topK, topP, frequencyPenalty, presencePenalty, logitBias, logProbs and topLogProbs.
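The deprecation notes above pair each legacy field with a key of completion_settings. As a minimal sketch of that mapping, the helper below folds legacy values into a settings dict. The helper name `to_completion_settings` and its precedence rule (values already in the dict win) are assumptions made for illustration; they are not part of dataikuapi.

```python
from typing import Any, Dict, Optional

# Deprecated field name -> completion_settings key, as listed in the field docs.
DEPRECATED_TO_SETTINGS_KEY = {
    "max_tokens": "maxOutputTokens",
    "temperature": "temperature",
    "top_k": "topK",
    "top_p": "topP",
}


def to_completion_settings(
    legacy: Dict[str, Any], existing: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Hypothetical helper: fold deprecated field values into a settings dict.

    Assumption: keys already present in `existing` take precedence, and
    legacy values equal to None are ignored.
    """
    settings = dict(existing or {})
    for old_name, new_key in DEPRECATED_TO_SETTINGS_KEY.items():
        value = legacy.get(old_name)
        if value is not None and new_key not in settings:
            settings[new_key] = value
    return settings


settings = to_completion_settings(
    {"max_tokens": 256, "temperature": 0.2, "top_k": None},
    existing={"temperature": 0.7},
)
print(settings)
```

The resulting dict is shaped for the completion_settings field; how it is handed to the wrapper depends on how the wrapper is obtained.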
- class dataikuapi.dss.langchain.DKUChatModel(*args: Any, **kwargs: Any)
Langchain-compatible wrapper around Dataiku-mediated chat LLMs
Note
Direct instantiation of this class is possible from within DSS, though it’s recommended to instead use dataikuapi.dss.llm.DSSLLM.as_langchain_chat_model().
Example:
    from langchain_core.prompts import ChatPromptTemplate

    llm = dkullm.as_langchain_chat_model()
    prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
    chain = prompt | llm

    for chunk in chain.stream({"topic": "parrot"}):
        print(chunk.content, end="", flush=True)
- llm_id: str
LLM identifier to use
- max_tokens: int = None
Denotes the number of tokens to predict per generation. Deprecated: use key “maxOutputTokens” in field “completion_settings”.
- temperature: float = None
A non-negative float that tunes the degree of randomness in generation. Deprecated: use key “temperature” in field “completion_settings”.
- top_k: int = None
Number of tokens to pick from when sampling. Deprecated: use key “topK” in field “completion_settings”.
- top_p: float = None
Sample from the top tokens whose probabilities add up to p. Deprecated: use key “topP” in field “completion_settings”.
- completion_settings: dict = {}
Settings applied to completion queries. All keys are optional and can include: maxOutputTokens, temperature, topK, topP, frequencyPenalty, presencePenalty, logitBias, logProbs and topLogProbs.
- bind_tools(tools: Sequence[Dict[str, Any] | Type[pydantic.BaseModel] | Callable | langchain_core.tools.BaseTool], tool_choice: dict | str | Literal['auto', 'none', 'required', 'any'] | bool | None = None, strict: bool | None = None, compatible: bool | None = None, **kwargs: Any)
Bind tool-like objects to this chat model.
- Args:
- tools: A list of tool definitions to bind to this chat model.
Can be a dictionary, pydantic model, callable, or BaseTool. Pydantic models, callables, and BaseTools will be automatically converted to their schema dictionary representation.
- tool_choice: Which tool to request the model to call.
- Options are:
name of the tool (str): call the corresponding tool;
“auto”: automatically select a tool (or no tool);
“none”: do not call a tool;
“any” or “required”: force at least one tool call;
True: call the one given tool (requires tools to be of length 1);
a dict of the form: {"type": "tool_name", "name": "<<tool_name>>"}, or {"type": "required"}, or {"type": "any"}, or {"type": "none"}, or {"type": "auto"}.
- strict: If specified, request the model to produce a JSON tool call that adheres to the provided schema. Support varies across models/providers.
- compatible: Allow DSS to modify the schema in order to increase compatibility, depending on known limitations of the model/provider. Defaults to automatic.
- kwargs: Any additional parameters to bind.
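The tool_choice options listed for bind_tools can all be expressed in the dict form. The sketch below shows how the string, boolean and dict variants relate to one another. The function `tool_choice_as_dict` is hypothetical and is not how DSS resolves the value internally; treating False like None is also an assumption, since the option list does not cover it.

```python
from typing import Any, Dict, Optional, Sequence, Union

# Keyword values named in the tool_choice option list.
KEYWORDS = ("auto", "none", "required", "any")


def tool_choice_as_dict(
    tool_choice: Union[dict, str, bool, None],
    tool_names: Sequence[str],
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: rewrite a tool_choice value in its dict form."""
    # Assumption: False behaves like "not specified".
    if tool_choice is None or tool_choice is False:
        return None
    if isinstance(tool_choice, dict):
        return dict(tool_choice)
    if tool_choice is True:
        # True means "call the one given tool", so exactly one tool is required.
        if len(tool_names) != 1:
            raise ValueError("tool_choice=True requires exactly one tool")
        return {"type": "tool_name", "name": tool_names[0]}
    if tool_choice in KEYWORDS:
        return {"type": tool_choice}
    # Any other string is read as the name of a tool.
    return {"type": "tool_name", "name": tool_choice}


# "get_weather" is a placeholder tool name used only for this illustration.
for choice in ("auto", "get_weather", True, {"type": "required"}):
    print(repr(choice), "->", tool_choice_as_dict(choice, ["get_weather"]))
```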
- class dataikuapi.dss.langchain.DKUEmbeddings(*args: Any, **kwargs: Any)
Langchain-compatible wrapper around Dataiku-mediated embedding LLMs
Note
Direct instantiation of this class is possible from within DSS, though it’s recommended to instead use dataikuapi.dss.llm.DSSLLM.as_langchain_embeddings().
- llm_id: str
LLM identifier to use
- embed_documents(texts: List[str]) -> List[List[float]]
Call out to Dataiku-mediated LLM
- Args:
texts: The list of texts to embed.
- Returns:
List of embeddings, one for each text.
- async aembed_documents(texts: List[str]) -> List[List[float]]
- embed_query(text: str) -> List[float]
- async aembed_query(text: str) -> List[float]
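The two synchronous methods return plain lists of floats, one vector per input text. As a rough illustration of how such output is typically consumed, the sketch below ranks documents by cosine similarity to a query. `FakeEmbeddings` is a hypothetical stand-in with the same two method signatures, so the snippet runs without DSS; its letter-count vectors are a toy and carry no semantic meaning.

```python
import math
from typing import List, Tuple


class FakeEmbeddings:
    """Hypothetical stand-in exposing embed_documents and embed_query."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per input text, in the same order.
        return [self.embed_query(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        # Toy embedding: how often each vowel occurs in the text.
        lowered = text.lower()
        return [float(lowered.count(vowel)) for vowel in "aeiou"]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(embeddings, query: str, docs: List[str]) -> List[Tuple[float, str]]:
    """Return (score, document) pairs sorted from most to least similar."""
    doc_vectors = embeddings.embed_documents(docs)
    query_vector = embeddings.embed_query(query)
    scored = [(cosine(query_vector, vec), doc) for vec, doc in zip(doc_vectors, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranking = rank_documents(FakeEmbeddings(), "banana", ["kiwi", "banana"])
print(ranking)
```

Any object with the same two methods could be passed to `rank_documents` in place of the stand-in.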
- class dataikuapi.dss.langchain.knowledge_bank.DKUKnowledgeBankRetriever(*args: Any, **kwargs: Any)
Langchain-compatible retriever for a knowledge bank
Important
Do not instantiate directly; use dataikuapi.dss.knowledgebank.DSSKnowledgeBank.as_langchain_retriever() instead.
- SEARCH_PARAMETERS_NAMES: ClassVar = ['max_documents', 'search_type', 'similarity_threshold', 'mmr_documents_count', 'mmr_factor', 'hybrid_use_advanced_reranking', 'hybrid_rrf_rank_constant', 'hybrid_rrf_rank_window_size', 'filter']
Valid parameter names for the search method
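Because the list of valid names is fixed, a set of search options can be checked against it before use. The sketch below copies the list from the attribute above and separates recognised names from unknown ones. The helper `split_search_parameters` and the sample values are hypothetical; the snippet does not call the retriever itself.

```python
from typing import Any, Dict, List, Tuple

# Copied from the SEARCH_PARAMETERS_NAMES class attribute documented above.
SEARCH_PARAMETERS_NAMES = [
    "max_documents",
    "search_type",
    "similarity_threshold",
    "mmr_documents_count",
    "mmr_factor",
    "hybrid_use_advanced_reranking",
    "hybrid_rrf_rank_constant",
    "hybrid_rrf_rank_window_size",
    "filter",
]


def split_search_parameters(
    params: Dict[str, Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Hypothetical helper: return (recognised parameters, unknown names)."""
    valid = {name: value for name, value in params.items() if name in SEARCH_PARAMETERS_NAMES}
    unknown = sorted(name for name in params if name not in SEARCH_PARAMETERS_NAMES)
    return valid, unknown


# Placeholder values; "top_n" is deliberately not a documented name.
valid, unknown = split_search_parameters(
    {"max_documents": 5, "similarity_threshold": 0.5, "top_n": 3}
)
print(valid)
print(unknown)
```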
