Utilities

These utility classes are used in various parts of the API.

class dataikuapi.dss.utils.DSSDatasetSelectionBuilder

Builder for a “dataset selection”. In DSS, a dataset selection is used to select a part of a dataset for processing.

Depending on the location where it is used, a selection can include:

  • Sampling

  • Filtering by partitions (for partitioned datasets)

  • Filtering by an expression

  • Selection of columns

  • Ordering

Please see the sampling documentation of DSS for a detailed explanation of the sampling methods.

build()

Returns:

the built selection dict

Return type:

dict

with_head_sampling(limit)

Sets the sampling to ‘first records’ mode

Parameters:

limit (int) – Maximum number of rows in the sample

with_all_data_sampling()

Sets the sampling to ‘no sampling, all data’ mode

with_random_fixed_nb_sampling(nb)

Sets the sampling to ‘Random sampling, fixed number of records’ mode

Parameters:

nb (int) – Maximum number of rows in the sample

with_selected_partitions(ids)

Sets partition filtering on the given partition identifiers.

Warning

The dataset to select must be partitioned.

Parameters:

ids (list) – list of selected partitions

class dataikuapi.dss.utils.DSSFilterBuilder

Builder for a “filter”. In DSS, a filter is used to define a subset of rows for processing.

build()

Returns:

the built filter

Return type:

dict

with_distinct()

Sets the filter to deduplicate rows

with_formula(expression)

Sets the formula (DSS formula) used to filter rows

Parameters:

expression (str) – the DSS formula

class dataikuapi.dss.utils.DSSInfoMessages(data)

Contains a list of dataikuapi.dss.utils.DSSInfoMessage.

Important

Do not instantiate this class.

property messages

The messages as a list of dataikuapi.dss.utils.DSSInfoMessage

property has_messages

True if there is any message

property has_error

True if there is any error message

property max_severity

The highest severity among the messages

property has_success

True if there is any success message

property has_warning

True if there is any warning message
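
Because this class is not meant to be instantiated directly, code usually only reads its properties. The sketch below shows one possible way to turn them into a short status line. The function summarize is a hypothetical helper, and the SimpleNamespace object is a stand-in for an instance returned by the API; its severity value "WARNING" and its message title are assumed for illustration.

```python
from types import SimpleNamespace


def summarize(info_messages):
    """Build a one-line status from the documented properties only."""
    if not info_messages.has_messages:
        return "no messages"
    if info_messages.has_error:
        status = "failed"
    elif info_messages.has_warning:
        status = "completed with warnings"
    else:
        status = "ok"
    return "%s (%d message(s), max severity: %s)" % (
        status,
        len(info_messages.messages),
        info_messages.max_severity,
    )


# Stand-in object; in real use this would come from an API call.
fake_messages = SimpleNamespace(
    has_messages=True,
    has_error=False,
    has_warning=True,
    has_success=False,
    max_severity="WARNING",  # assumed value, for illustration only
    messages=[SimpleNamespace(severity="WARNING", title="Example warning")],
)
print(summarize(fake_messages))
```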

class dataikuapi.dss.utils.DSSInfoMessage(data)

A message with a code, a title, a severity and a content.

Important

Do not instantiate this class.

property severity

The severity of the message

property code

The code of the message

property details

The details of the message

property title

The title of the message

property message

The full message
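
A single message can be rendered for logging from the properties listed above. This is only a sketch: format_message is a hypothetical helper, and the sample values (severity "ERROR", code "ERR_EXAMPLE", and the texts) are invented placeholders, not values documented for DSS.

```python
from types import SimpleNamespace


def format_message(msg, verbose=False):
    """Render one message as a log line from its documented properties."""
    line = "[%s] %s: %s" % (msg.severity, msg.code, msg.title)
    if verbose and msg.details:
        # Append the details only when asked for and when present.
        line += " (%s)" % (msg.details,)
    return line


# Stand-in for a DSSInfoMessage obtained from the API; all values are made up.
fake_message = SimpleNamespace(
    severity="ERROR",
    code="ERR_EXAMPLE",
    title="Something failed",
    details="more context",
    message="Something failed: more context",
)
print(format_message(fake_message))
print(format_message(fake_message, verbose=True))
```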

class dataikuapi.dss.utils.DSSSimpleFilter(operator, column=None, value=None, clauses=None)

A simplified representation of a DSS filter. It can be built from scratch or from an existing DSSFilter.

A simple filter is a dictionary with the following keys:

  • operator: one of the values of DSSSimpleFilterOperator

  • column: the column to apply the filter on (for unary and binary operators)

  • value: the value to compare with (for binary operators)

  • clauses: a list of other simple filters (for AND/OR operators)
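
To make the dictionary layout concrete, the sketch below writes a nested simple filter by hand using the four keys listed above and walks it recursively. The operator strings are taken from DSSSimpleFilterOperator; the column names, the values, and the referenced_columns helper are hypothetical, and the exact dictionary that the library produces may differ in detail.

```python
def referenced_columns(simple_filter):
    """Collect the column names used anywhere in a nested simple filter dict."""
    columns = []
    if simple_filter.get("column") is not None:
        columns.append(simple_filter["column"])
    # AND/OR filters carry their operands in "clauses".
    for clause in simple_filter.get("clauses") or []:
        for column in referenced_columns(clause):
            if column not in columns:
                columns.append(column)
    return columns


# country == "FR" AND (age > 30 OR age is not defined)
example_filter = {
    "operator": "AND",
    "clauses": [
        {"operator": "EQUALS", "column": "country", "value": "FR"},
        {
            "operator": "OR",
            "clauses": [
                {"operator": "GREATER_THAN", "column": "age", "value": 30},
                {"operator": "NOT_DEFINED", "column": "age"},
            ],
        },
    ],
}
print(referenced_columns(example_filter))
```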

to_dss_filter()

Converts the simple filter to a DSS filter dictionary.

Returns:

A DSS filter dictionary that can be used in visual recipes.

Return type:

dict

static from_dss_filter(dss_filter)

Converts a DSS filter dictionary to a simple filter.

Parameters:

dss_filter (dict) – A DSS filter dictionary.

Returns:

A simple filter object.

Return type:

DSSSimpleFilter

to_dict()

Converts the simple filter to a serializable dictionary.

Returns:

A dictionary representation of the simple filter.

Return type:

dict

static and_(clauses)
static or_(clauses)
static eq(column, value)
static neq(column, value)
static gt(column, value)
static gte(column, value)
static lt(column, value)
static lte(column, value)
static empty(column)
static not_empty(column)
static contains(column, value)
static matches(column, value)
static in_any_of(column, values)
static in_none_of(column, values)

class dataikuapi.dss.utils.DSSSimpleFilterOperator(value)

Operators for the DSSSimpleFilter.

EQUALS = 'EQUALS'
NOT_EQUALS = 'NOT_EQUALS'
GREATER_THAN = 'GREATER_THAN'
LESS_THAN = 'LESS_THAN'
GREATER_OR_EQUAL = 'GREATER_OR_EQUAL'
LESS_OR_EQUAL = 'LESS_OR_EQUAL'
DEFINED = 'DEFINED'
NOT_DEFINED = 'NOT_DEFINED'
CONTAINS = 'CONTAINS'
MATCHES = 'MATCHES'
IN_ANY_OF = 'IN_ANY_OF'
IN_NONE_OF = 'IN_NONE_OF'
AND = 'AND'
OR = 'OR'
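
The operator values can also be used to sanity-check a hand-written simple filter dictionary before passing it on. The sketch below is one possible check, not part of the library. The grouping into logical, unary and binary operators is an assumption inferred from the key descriptions of DSSSimpleFilter (DEFINED and NOT_DEFINED are treated as unary here), and check_simple_filter is a hypothetical helper.

```python
LOGICAL = {"AND", "OR"}
UNARY = {"DEFINED", "NOT_DEFINED"}  # assumed to need only a column
BINARY = {
    "EQUALS", "NOT_EQUALS", "GREATER_THAN", "LESS_THAN",
    "GREATER_OR_EQUAL", "LESS_OR_EQUAL", "CONTAINS", "MATCHES",
    "IN_ANY_OF", "IN_NONE_OF",
}


def check_simple_filter(simple_filter):
    """Return a list of problems found in a simple filter dict (empty if none)."""
    op = simple_filter.get("operator")
    problems = []
    if op in LOGICAL:
        clauses = simple_filter.get("clauses") or []
        if not clauses:
            problems.append("%s needs at least one clause" % op)
        for clause in clauses:
            problems.extend(check_simple_filter(clause))
    elif op in UNARY or op in BINARY:
        if simple_filter.get("column") is None:
            problems.append("%s needs a column" % op)
        if op in BINARY and "value" not in simple_filter:
            problems.append("%s needs a value" % op)
    else:
        problems.append("unknown operator: %r" % (op,))
    return problems


print(check_simple_filter({"operator": "EQUALS", "column": "country", "value": "FR"}))
print(check_simple_filter({"operator": "AND", "clauses": [{"operator": "DEFINED"}]}))
```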