Model Evaluation Stores#

For usage information and examples, see Model Evaluation Stores

There are two main parts related to the handling of model evaluation stores in Dataiku’s Python APIs:

  • the dataiku package API

  • the dataikuapi package API

Both sets of classes have fairly similar capabilities.

dataiku package API#

class dataiku.ModelEvaluationStore(lookup, project_key=None, ignore_flow=False)#

This is a handle to interact with a model evaluation store.

Note: this class is also available as dataiku.core.model_evaluation_store.ModelEvaluationStore
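A minimal sketch of getting a handle from inside DSS (the store name "my_evaluation_store" is a hypothetical placeholder):

import dataiku

# Look up the store by name in the current project; the name is a placeholder
mes = dataiku.ModelEvaluationStore("my_evaluation_store")
print(mes.get_id(), mes.get_name())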

get_info(sensitive_info=False)#

Gets information about the location and settings of this model evaluation store.

Parameters:

sensitive_info (bool) – flag for sensitive information such as the Model Evaluation Store absolute path (defaults to False)

Return type:

dict

get_path()#

Gets the filesystem path of this model evaluation store.

Return type:

str

get_id()#

Gets the id of this model evaluation store.

Return type:

str

get_name()#

Gets the name of this model evaluation store.

Return type:

str

list_runs()#

Gets the list of runs of this model evaluation store.

Return type:

list of dataiku.core.model_evaluation_store.ModelEvaluation

get_evaluation(evaluation_id)#

Gets a model evaluation from the store based on its id.

Parameters:

evaluation_id (str) – the id of the model evaluation to retrieve

Returns:

a dataiku.core.model_evaluation_store.ModelEvaluation handle on the model evaluation

get_last_metric_values()#

Gets the set of last values of the metrics on this model evaluation store.

Returns:

a dataiku.core.ComputedMetrics object

get_metric_history(metric_lookup)#

Gets the set of all values a given metric took on this model evaluation store.

Parameters:

metric_lookup – metric name or unique identifier

Return type:

dict
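For example, a sketch combining both metric accessors (the store name and the metric lookup string are hypothetical placeholders):

import dataiku

mes = dataiku.ModelEvaluationStore("my_evaluation_store")  # placeholder name
last_values = mes.get_last_metric_values()                 # dataiku.core.ComputedMetrics
history = mes.get_metric_history("records:COUNT_RECORDS")  # hypothetical metric lookup
print(history)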

class dataiku.core.model_evaluation_store.ModelEvaluation(store, evaluation_id)#

This is a handle to interact with a model evaluation from a model evaluation store.

set_preparation_steps(steps, requested_output_schema, context_project_key=None)#

Sets the preparation steps of the input dataset in a model evaluation.

Parameters:
  • steps (dict) – the steps of the preparation

  • requested_output_schema (dict) – the schema of the prepared input dataset as a list of objects like this one: { 'type': 'string', 'name': 'foo', 'maxLength': 1000 }

  • context_project_key (str) – a different project key to use instead of the current project key, because the preparation steps can live in a different project than the dataset (defaults to None)

get_schema()#

Gets the schema of the sample used for this model evaluation. Additional type information is included for the map, array and object types.

Return type:

list of dict

Returns:

a schema as a list of objects like this one: { 'type': 'string', 'name': 'foo', 'maxLength': 1000 }

get_dataframe(columns=None, infer_with_pandas=True, parse_dates=True, bool_as_str=False, float_precision=None)#

Reads the sample in the run as a Pandas dataframe.

Pandas dataframes are fully in-memory, so you need to make sure that your dataset will fit in RAM before using this.

Inconsistent sampling parameters raise a ValueError.

Note about encoding:

  • Column labels are “unicode” objects

  • When a column is of string type, the content is made of utf-8 encoded “str” objects

Parameters:
  • columns (list of dict) – the columns with information on type, names, etc. e.g. { 'type': 'string', 'name': 'foo', 'maxLength': 1000 } (defaults to None)

  • infer_with_pandas (bool) – uses the types detected by pandas rather than the types from the dataset schema as detected in DSS (defaults to True)

  • parse_dates (bool) – parses date columns in DSS’s dataset schema (defaults to True)

  • bool_as_str (bool) – leaves boolean values as strings (defaults to False)

  • float_precision (str) – float precision for pandas read_table (defaults to None)

Returns:

a pandas.DataFrame representing the sample used in the evaluation
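For example, a minimal sketch (the store name and evaluation id are hypothetical placeholders):

import dataiku

mes = dataiku.ModelEvaluationStore("my_evaluation_store")
evaluation = mes.get_evaluation("my_evaluation_id")
df = evaluation.get_dataframe()
print(df.shape)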

iter_dataframes_forced_types(names, dtypes, parse_date_columns, sampling=None, chunksize=10000, float_precision=None)#

Reads the model evaluation sample as Pandas dataframes by chunks of fixed size with forced types.

Returns a generator over pandas dataframes.

Useful if the sample doesn’t fit in RAM.

Parameters:
  • names (list of str) – names of the columns of the dataset

  • dtypes (list of str or object) – data types of the columns

  • parse_date_columns (bool) – parses date columns in DSS’s dataset schema

  • sampling – ignored at the moment (defaults to None)

  • chunksize (int) – the size of the dataframes yielded by the iterator (defaults to 10000)

  • float_precision (str) – float precision for pandas read_table (defaults to None)

Returns:

a generator of pandas.DataFrame
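A sketch of chunked reading with forced types, assuming evaluation is a ModelEvaluation handle obtained as above (column names and dtypes are hypothetical):

for chunk_df in evaluation.iter_dataframes_forced_types(
        names=["foo", "bar"],          # hypothetical column names
        dtypes=["object", "float64"],  # forced pandas dtypes
        parse_date_columns=False,
        chunksize=50000):
    print(len(chunk_df))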

iter_dataframes(chunksize=10000, infer_with_pandas=True, parse_dates=True, columns=None, bool_as_str=False, float_precision=None)#

Reads the model evaluation sample as Pandas dataframes by chunks of fixed size.

Returns a generator over pandas dataframes.

Useful if the sample doesn’t fit in RAM.

Parameters:
  • chunksize (int) – the size of the dataframes yielded by the iterator (defaults to 10000)

  • infer_with_pandas (bool) – uses the types detected by pandas rather than the dataset schema as detected in DSS (defaults to True)

  • parse_dates (bool) – parses date columns in DSS’s dataset schema (defaults to True)

  • columns (list of dict) – columns of the dataset as dict with names and dtypes (defaults to None)

  • bool_as_str (bool) – leaves boolean values as strings (defaults to False)

  • float_precision (str) – float precision for pandas read_table. For more information on this parameter, please check pandas documentation: https://pandas.pydata.org/docs/reference/api/pandas.read_table.html (defaults to None)

Returns:

a generator of pandas.DataFrame
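A sketch of chunked reading with inferred types, assuming evaluation is a ModelEvaluation handle obtained as above:

total = 0
for chunk_df in evaluation.iter_dataframes(chunksize=50000):
    total += len(chunk_df)  # process each chunk; the full sample never sits in RAM
print(total)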

dataikuapi package API#

class dataikuapi.dss.modelevaluationstore.DSSModelEvaluationStore(client, project_key, mes_id)#

A handle to interact with a model evaluation store on the DSS instance.

Warning

Do not create this directly, use dataikuapi.dss.project.DSSProject.get_model_evaluation_store()

property id#
get_settings()#

Returns the settings of this model evaluation store.

Return type:

DSSModelEvaluationStoreSettings

get_zone()#

Gets the flow zone of this model evaluation store

Return type:

dataikuapi.dss.flow.DSSFlowZone

move_to_zone(zone)#

Moves this object to a flow zone

Parameters:

zone (object) – a dataikuapi.dss.flow.DSSFlowZone where to move the object

share_to_zone(zone)#

Share this object to a flow zone

Parameters:

zone (object) – a dataikuapi.dss.flow.DSSFlowZone where to share the object

unshare_from_zone(zone)#

Unshare this object from a flow zone

Parameters:

zone (object) – a dataikuapi.dss.flow.DSSFlowZone from where to unshare the object
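A sketch of moving a store to a flow zone (the store and zone ids are placeholders; the zone is fetched through the dataikuapi flow API):

import dataiku

client = dataiku.api_client()
project = client.get_default_project()
mes = project.get_model_evaluation_store("7vFZWNck")  # placeholder store id

zone = project.get_flow().get_zone("my_zone_id")      # placeholder zone id
mes.move_to_zone(zone)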

get_usages()#

Get the recipes referencing this model evaluation store

Returns:

a list of usages

get_object_discussions()#

Get a handle to manage discussions on the model evaluation store

Returns:

the handle to manage discussions

Return type:

dataikuapi.discussion.DSSObjectDiscussions

delete()#

Delete the model evaluation store

list_model_evaluations()#

List the model evaluations in this model evaluation store. The list is sorted by model evaluation creation date.

Returns:

The list of the model evaluations

Return type:

list of dataikuapi.dss.modelevaluationstore.DSSModelEvaluation
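For example (the store id is a placeholder):

import dataiku

client = dataiku.api_client()
project = client.get_default_project()
mes = project.get_model_evaluation_store("7vFZWNck")  # placeholder store id

for evaluation in mes.list_model_evaluations():
    print(evaluation.get_full_id())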

get_model_evaluation(evaluation_id)#

Get a handle to interact with a specific model evaluation

Parameters:

evaluation_id (string) – the id of the desired model evaluation

Returns:

A dataikuapi.dss.modelevaluationstore.DSSModelEvaluation model evaluation handle

get_latest_model_evaluation()#

Get a handle to interact with the latest model evaluation computed

Returns:

A dataikuapi.dss.modelevaluationstore.DSSModelEvaluation model evaluation handle if the store is not empty, else None

delete_model_evaluations(evaluations)#

Remove model evaluations from this store

build(job_type='NON_RECURSIVE_FORCED_BUILD', wait=True, no_fail=False)#

Starts a new job to build this model evaluation store and waits for it to complete. Raises if the job fails.

job = mes.build()
print("Job %s done" % job.id)
Parameters:
  • job_type – The job type. One of RECURSIVE_BUILD, NON_RECURSIVE_FORCED_BUILD or RECURSIVE_FORCED_BUILD

  • wait – wait for the build to finish before returning

  • no_fail – if True, does not raise if the job failed. Valid only when wait is True

Returns:

the dataikuapi.dss.job.DSSJob job handle corresponding to the built job

Return type:

dataikuapi.dss.job.DSSJob

get_last_metric_values()#

Get the metrics of the latest model evaluation built

Returns:

a list of metric objects and their value

get_metric_history(metric)#

Get the history of the values of the metric on this model evaluation store

Returns:

an object containing the values of the metric, cast to the appropriate type (double, boolean,…)

compute_metrics(metric_ids=None, probes=None)#

Compute metrics on this model evaluation store. If the metrics are not specified, the metrics setup on the model evaluation store are used.

run_checks(evaluation_id='', checks=None)#

Run checks on an evaluation of this model evaluation store.

If the checks are not specified, the checks setup on the model evaluation store are used.

Parameters:
  • evaluation_id (str) – (optional) id of evaluation on which checks should be run. Last evaluation is used if not specified.

  • checks (list[string]) – (optional) ids of the checks to run.

Returns:

a checks computation report, as a dict.

Return type:

dict

class MetricDefinition(code, value, name=None, description=None)#
class LabelDefinition(key, value)#
add_custom_model_evaluation(metrics, evaluation_id=None, name=None, labels=None, model=None)#

Adds a model evaluation with custom metrics to the model evaluation store.

Parameters:
  • metrics (list[DSSModelEvaluationStore.MetricDefinition]) – the metrics to add

  • evaluation_id (str) – the id of the evaluation (optional)

  • name (str) – the human-readable name of the evaluation (optional)

  • labels (list[DSSModelEvaluationStore.LabelDefinition]) – labels to set on the model evaluation (optional). See below.

  • model (Union[str, DSSTrainedPredictionModelDetails]) – saved model version (full ID or DSSTrainedPredictionModelDetails) of the evaluated model (optional)

Code sample:

import dataiku
from dataikuapi.dss.modelevaluationstore import DSSModelEvaluationStore

client = dataiku.api_client()
project = client.get_default_project()
mes = project.get_model_evaluation_store("7vFZWNck")

accuracy = DSSModelEvaluationStore.MetricDefinition("accuracy", 0.95, "Accuracy")
other = DSSModelEvaluationStore.MetricDefinition("other", 42, "Other", "Other metric desc")
label = DSSModelEvaluationStore.LabelDefinition("custom:myLabel", "myValue")

mes.add_custom_model_evaluation([accuracy, other], labels=[label])
mes.run_checks()
class dataikuapi.dss.modelevaluationstore.DSSModelEvaluationStoreSettings(model_evaluation_store, settings)#

A handle on the settings of a model evaluation store

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DSSModelEvaluationStore.get_settings()

get_raw()#
save()#
class dataikuapi.dss.modelevaluationstore.DSSModelEvaluation(model_evaluation_store, evaluation_id)#

A handle on a model evaluation

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DSSModelEvaluationStore.get_model_evaluation()

get_full_info()#

Retrieve the model evaluation with its performance data

Returns:

the model evaluation full info, as a dataikuapi.dss.modelevaluationstore.DSSModelEvaluationFullInfo

get_full_id()#
delete()#

Remove this model evaluation

property full_id#
compute_data_drift(reference=None, data_drift_params=None, wait=True)#

Compute data drift against a reference model or model evaluation. The reference is determined automatically unless specified.

Attention

Deprecated. Use dataikuapi.dss.modelevaluationstore.DSSModelEvaluation.compute_drift() instead

Parameters:
  • reference – (optional) the reference model or model evaluation to compute the drift against; determined automatically if not specified

  • data_drift_params (dataikuapi.dss.modelevaluationstore.DataDriftParams) – (optional) the data drift computation settings

  • wait (bool) – if True, wait for the computation to complete and return the result; otherwise return a DSSFuture handle (defaults to True)

Returns:

a dataikuapi.dss.modelevaluationstore.DataDriftResult containing data drift analysis results if wait is True, or a DSSFuture handle otherwise

compute_drift(reference=None, drift_params=None, wait=True)#

Compute drift against a reference model or model evaluation. The reference is determined automatically unless specified.

Parameters:
  • reference – (optional) the reference model or model evaluation to compute the drift against; determined automatically if not specified

  • drift_params (dataikuapi.dss.modelevaluationstore.DriftParams) – (optional) the drift computation settings

  • wait (bool) – if True, wait for the computation to complete and return the result; otherwise return a DSSFuture handle (defaults to True)

Returns:

a dataikuapi.dss.modelevaluationstore.DriftResult containing data drift analysis results if wait is True, or a DSSFuture handle otherwise
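A minimal sketch, assuming evaluation is a DSSModelEvaluation handle and a reference can be determined automatically:

drift = evaluation.compute_drift()  # reference chosen automatically
print(drift.get_raw())              # raw drift analysis results, as a dict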

get_metrics()#

Get the metrics for this model evaluation. Metrics must be understood here as Metrics in DSS Metrics & Checks

Returns:

the metrics, as a JSON object

get_sample_df()#

Get the sample of the evaluation dataset on which the evaluation was performed

Returns:

the sample content, as a pandas.DataFrame
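For instance, assuming mes is a DSSModelEvaluationStore handle:

evaluation = mes.get_latest_model_evaluation()
if evaluation is not None:
    print(evaluation.get_metrics())           # metrics, as a JSON object
    print(evaluation.get_sample_df().head())  # sample, as a pandas.DataFrame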

class dataikuapi.dss.modelevaluationstore.DSSModelEvaluationFullInfo(model_evaluation, full_info)#

A handle on the full information on a model evaluation.

Includes information such as the full id of the evaluated model, the evaluation params, the performance and drift metrics, if any, etc.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DSSModelEvaluation.get_full_info()

metrics#

The performance and data drift metric, if any.

creation_date#

The date and time of the creation of the model evaluation, as an epoch.

has_model#

user_meta#

The user-accessible metadata (name, labels). Returns the original object, not a copy. Changes to the returned object are persisted to DSS by calling save_user_meta().

get_raw()#
save_user_meta()#
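A read-only sketch, assuming evaluation is a DSSModelEvaluation handle:

full_info = evaluation.get_full_info()
print(full_info.creation_date)  # epoch timestamp
print(full_info.metrics)        # performance and drift metrics, if any
print(full_info.get_raw())      # raw full info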
class dataikuapi.dss.modelevaluationstore.DataDriftParams(data)#

Object that represents parameters for data drift computation.

Warning

Do not create this object directly, use dataikuapi.dss.modelevaluationstore.DataDriftParams.from_params() instead.

Attention

Deprecated. Use dataikuapi.dss.modelevaluationstore.DriftParams instead

static from_params(per_column_settings, nb_bins=10, compute_histograms=True, confidence_level=0.95)#

Creates parameters for data drift computation from per-column settings, a number of bins, a histogram computation flag and a confidence level

Parameters:
  • per_column_settings (dict) – A dict representing the per column settings. You should use a PerColumnDriftParamBuilder to build it.

  • nb_bins (int) – (optional) Number of bins in histograms (applies to all columns) - default: 10

  • compute_histograms (bool) – (optional) Enable/disable histograms - default: True

  • confidence_level (float) – (optional) Used to compute the confidence interval on the drift model accuracy - default: 0.95

Return type:

dataikuapi.dss.modelevaluationstore.DataDriftParams

class dataikuapi.dss.modelevaluationstore.DriftParams(data)#

Object that represents parameters for drift computation.

Warning

Do not create this object directly, use dataikuapi.dss.modelevaluationstore.DriftParams.from_params() instead.

static from_params(per_column_settings, nb_bins=10, compute_histograms=True, confidence_level=0.95)#

Creates parameters for drift computation from per-column settings, a number of bins, a histogram computation flag and a confidence level

Parameters:
  • per_column_settings (dict) – A dict representing the per column settings. You should use a PerColumnDriftParamBuilder to build it.

  • nb_bins (int) – (optional) Number of bins in histograms (applies to all columns) - default: 10

  • compute_histograms (bool) – (optional) Enable/disable histograms - default: True

  • confidence_level (float) – (optional) Used to compute the confidence interval on the drift model accuracy - default: 0.95

Return type:

dataikuapi.dss.modelevaluationstore.DriftParams

class dataikuapi.dss.modelevaluationstore.PerColumnDriftParamBuilder#

Builder for a map of per-column drift params settings. Used as a helper before computing data drift to build the per-column settings dict expected by dataikuapi.dss.modelevaluationstore.DataDriftParams.from_params().

build()#

Returns the built dict for per column drift params settings

with_column_drift_param(name, handling='AUTO', enabled=True)#

Sets the drift params settings for the given column name.

Parameters:
  • name (string) – the name of the column

  • handling (string) – (optional) the column type, one of NUMERICAL, CATEGORICAL or AUTO (default: AUTO)

  • enabled (bool) – (optional) False means the column is ignored in drift computation (default: True)
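A sketch of building per-column settings and using them for drift computation (the column names are hypothetical, and evaluation is assumed to be a DSSModelEvaluation handle):

from dataikuapi.dss.modelevaluationstore import DriftParams, PerColumnDriftParamBuilder

builder = PerColumnDriftParamBuilder()
builder.with_column_drift_param("age", handling="NUMERICAL")  # force numerical handling
builder.with_column_drift_param("comment", enabled=False)     # exclude from drift computation
per_column = builder.build()

params = DriftParams.from_params(per_column, nb_bins=20)
drift = evaluation.compute_drift(drift_params=params)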

class dataikuapi.dss.modelevaluationstore.DataDriftResult(data)#

A handle on the data drift result of a model evaluation.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DSSModelEvaluation.compute_data_drift()

drift_model_result#

Drift analysis based on drift modeling.

univariate_drift_result#

Per-column drift analysis based on pairwise comparison of distributions.

per_column_settings#

Information about column handling that has been used (errors, types, etc).

get_raw()#
Returns:

the raw data drift result

Return type:

dict

class dataikuapi.dss.modelevaluationstore.DriftResult(data)#

A handle on the drift result of a model evaluation.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DSSModelEvaluation.compute_drift()

drift_model_result#

Drift analysis based on drift modeling.

univariate_drift_result#

Per-column drift analysis based on pairwise comparison of distributions.

per_column_settings#

Information about column handling that has been used (errors, types, etc).

prediction_drift_result#

Drift analysis based on the prediction column

get_raw()#
Returns:

the raw data drift result

Return type:

dict

class dataikuapi.dss.modelevaluationstore.DriftModelResult(data)#

A handle on the drift model result.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DriftResult.drift_model_result

get_raw()#
Returns:

the raw drift model result

Return type:

dict

class dataikuapi.dss.modelevaluationstore.UnivariateDriftResult(data)#

A handle on the univariate data drift.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DriftResult.univariate_drift_result

per_column_drift_data#

Drift data per column, as a dict of column name -> drift data.

get_raw()#
Returns:

the raw univariate data drift

Return type:

dict

class dataikuapi.dss.modelevaluationstore.PredictionDriftResult(data)#

A handle on the prediction drift result.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DriftResult.prediction_drift_result

get_raw()#
Returns:

the raw prediction drift

Return type:

dict

class dataikuapi.dss.modelevaluationstore.ColumnSettings(data)#

A handle on column handling information.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DriftResult.get_per_column_settings()

actual_column_handling#

The actual column handling (either forced via drift params or inferred from model evaluation preprocessings). It can be any of NUMERICAL, CATEGORICAL, or IGNORED.

default_column_handling#

The default column handling (based on model evaluation preprocessing only). It can be any of NUMERICAL, CATEGORICAL, or IGNORED.

get_raw()#
Returns:

the raw column handling information

Return type:

dict

class dataikuapi.dss.modelevaluationstore.DriftModelAccuracy(data)#

A handle on the drift model accuracy.

Warning

Do not create this class directly, instead use dataikuapi.dss.modelevaluationstore.DriftModelResult.drift_model_accuracy

get_raw()#
Returns:

the drift model accuracy data

Return type:

dict