Connections#

The API exposes DSS connections, which can be created, modified and deleted through the API. These operations are restricted to API keys with the “admin rights” flag.

Getting the list of connections#

The list of connections can be obtained with the list_connections() method:

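import pprint
from dataikuapi import DSSClient
prettyprinter = pprint.PrettyPrinter(indent=4)
client = DSSClient(host, apiKey)  # host: DSS URL, apiKey: an API key with admin rights
dss_connections = client.list_connections()
prettyprinter.pprint(dss_connections)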

outputs

{   'filesystem_managed': {   'allowManagedDatasets': True,
                               'allowMirror': False,
                               'allowWrite': True,
                               'allowedGroups': [],
                               'maxActivities': 0,
                               'name': 'filesystem_managed',
                               'params': {   'root': '${dip.home}/managed_datasets'},
                               'type': 'Filesystem',
                               'usableBy': 'ALL',
                               'useGlobalProxy': True},
    'hdfs_root': {   'allowManagedDatasets': True,
                     'allowMirror': False,
                     'allowWrite': True,
                     'allowedGroups': [],
                     'maxActivities': 0,
                     'name': 'hdfs_root',
                     'params': {'database': 'dataik', 'root': '/'},
                     'type': 'HDFS',
                     'usableBy': 'ALL',
                     'useGlobalProxy': False},
    'local_postgress': {   'allowManagedDatasets': True,
                           'allowMirror': False,
                           'allowWrite': True,
                           'allowedGroups': [],
                           'maxActivities': 0,
                           'name': 'local_postgress',
                           'params': {   'db': 'testdb',
                                         'host': 'localhost',
                                         'password': 'admin',
                                         'port': '5432',
                                         'properties': {   },
                                         'user': 'admin'},
                           'type': 'PostgreSQL',
                           'usableBy': 'ALL',
                           'useGlobalProxy': False},
    ...
}
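
Because the returned value maps connection names to their definitions, ordinary dictionary operations are enough to query it. The sketch below groups connection names by their 'type' field. The helper name connections_by_type and the trimmed sample data are illustrative assumptions, not part of the API.

```python
from collections import defaultdict


def connections_by_type(connections):
    """Group connection names by the 'type' field of their definition.

    `connections` is assumed to be a dict shaped like the output above:
    {connection_name: {'type': ..., ...}, ...}
    """
    grouped = defaultdict(list)
    for name, definition in connections.items():
        grouped[definition.get("type", "unknown")].append(name)
    # Sort names so the result does not depend on dict ordering.
    return {conn_type: sorted(names) for conn_type, names in grouped.items()}


# Trimmed sample mirroring the output above (stand-in for a live call).
sample_connections = {
    "filesystem_managed": {"type": "Filesystem", "usableBy": "ALL"},
    "hdfs_root": {"type": "HDFS", "usableBy": "ALL"},
    "local_postgress": {"type": "PostgreSQL", "usableBy": "ALL"},
}

print(connections_by_type(sample_connections))
```

With a live instance, the same helper could be applied to the value returned by client.list_connections().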

Creating a connection#

Connections can be added with the create_connection() method:

new_connection_params = {'db':'mysql_test', 'host': 'localhost', 'password': 'admin', 'properties': [{'name': 'useSSL', 'value': 'true'}], 'user': 'admin'}
new_connection = client.create_connection('test_connection', type='MySql', params=new_connection_params, usable_by='ALLOWED', allowed_groups=['administrators','data_team'])
prettyprinter.pprint(client.list_connections()['test_connection'])

outputs

{   'allowManagedDatasets': True,
    'allowMirror': True,
    'allowWrite': True,
    'allowedGroups': ['data_scientists'],
    'maxActivities': 0,
    'name': 'test_connection',
    'params': {   'db': 'mysql_test',
                   'host': 'localhost',
                   'password': 'admin',
                   'properties': {   },
                   'user': 'admin'},
    'type': 'MySql',
    'usableBy': 'ALLOWED',
    'useGlobalProxy': True}
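
When a provisioning script may run more than once, it can help to check whether the connection already exists before creating it. The following is a minimal sketch, assuming client behaves like the DSSClient used above (only list_connections(), get_connection() and create_connection() are called). The helper name ensure_connection is hypothetical.

```python
def ensure_connection(client, name, conn_type, params,
                      usable_by="ALL", allowed_groups=None):
    """Return a handle on connection `name`, creating it only if it is missing.

    `client` is assumed to expose list_connections(), get_connection() and
    create_connection() as used in the examples above.
    Returns a (handle, created) tuple.
    """
    if name in client.list_connections():
        # Already defined: reuse it rather than attempting a second creation.
        return client.get_connection(name), False
    handle = client.create_connection(
        name,
        type=conn_type,
        params=params,
        usable_by=usable_by,
        allowed_groups=allowed_groups or [],
    )
    return handle, True
```

A call such as ensure_connection(client, 'test_connection', 'MySql', new_connection_params) would then be safe to repeat, under the assumption that no other process creates the connection between the check and the creation.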

Modifying a connection#

To modify a connection, it is advised to first retrieve the connection definition with a get_definition() call, alter the definition, and set it back into DSS:

connection_definition = new_connection.get_definition()
connection_definition['usableBy'] = 'ALL'
connection_definition['allowWrite'] = False
new_connection.set_definition(connection_definition)
prettyprinter.pprint(new_connection.get_definition())

outputs

{   'allowManagedDatasets': True,
    'allowMirror': True,
    'allowWrite': False,
    'allowedGroups': ['data_scientists'],
    'maxActivities': 0,
    'name': 'test_connection',
    'params': {   'db': 'mysql_test',
                   'host': 'localhost',
                   'password': 'admin',
                   'properties': {   },
                   'user': 'admin'},
    'type': 'MySql',
    'usableBy': 'ALL',
    'useGlobalProxy': True}
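
The retrieve, alter, and set-back pattern above can be wrapped in a small helper. This is only a sketch: update_connection is a hypothetical name, and connection is assumed to expose get_definition() and set_definition() like the handle used above. Rejecting unknown keys is a design choice made here so that a misspelled field name fails loudly instead of adding a new entry to the definition.

```python
def update_connection(connection, **changes):
    """Apply top-level changes to a connection definition and save it.

    `connection` is assumed to expose get_definition() and set_definition().
    Keys absent from the current definition are rejected.
    Returns the definition that was sent back.
    """
    definition = connection.get_definition()
    unknown = [key for key in changes if key not in definition]
    if unknown:
        raise KeyError("unknown definition keys: %s" % ", ".join(sorted(unknown)))
    definition.update(changes)
    connection.set_definition(definition)
    return definition
```

The modification shown above would then read update_connection(new_connection, usableBy='ALL', allowWrite=False).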

Deleting a connection#

Connections can be deleted through their handle:

connection = client.get_connection('test_connection')
connection.delete()

Detailed examples#

This section contains more advanced examples on Connections.

Mass-change filesystem Connections#

You can programmatically switch all Datasets of a Project from a given filesystem Connection to a different one, thus reproducing the “Change Connection” action available in the Dataiku Flow UI.

import dataiku

def mass_change_connection(project, origin_conn, dest_conn):
    """Mass change dataset connections in a project (filesystem connections only)
    """

    all_datasets = project.list_datasets()
    for d in all_datasets:  # list_datasets() returns a list; iterate over it directly
        ds = project.get_dataset(d["name"])
        ds_def = ds.get_definition()
        if ds_def["type"] == "Filesystem":
            if ds_def["params"]["connection"] == origin_conn:
                # Assign (not compare) the new connection, then save the definition
                ds_def["params"]["connection"] = dest_conn
                ds.set_definition(ds_def)

client = dataiku.api_client()
project = client.get_default_project()
mass_change_connection(project, "FSCONN_SOURCE", "FSCONN_DEST")
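
Before rewriting definitions in bulk, it may be useful to preview which datasets would be affected. The sketch below applies the same selection criteria as the function above to plain dictionaries, so it can be tried without a DSS instance. The helper name datasets_to_switch and the sample definitions are hypothetical.

```python
def datasets_to_switch(definitions, origin_conn):
    """Return the names of Filesystem datasets currently bound to `origin_conn`.

    `definitions` is assumed to map dataset names to definition dicts carrying
    the 'type' and 'params' -> 'connection' fields used in the function above.
    """
    selected = []
    for name, ds_def in definitions.items():
        if ds_def.get("type") != "Filesystem":
            continue  # only filesystem datasets are considered
        if ds_def.get("params", {}).get("connection") == origin_conn:
            selected.append(name)
    return sorted(selected)


# Hypothetical sample definitions, for illustration only.
sample_definitions = {
    "orders": {"type": "Filesystem", "params": {"connection": "FSCONN_SOURCE"}},
    "customers": {"type": "Filesystem", "params": {"connection": "FSCONN_OTHER"}},
    "events": {"type": "PostgreSQL", "params": {"connection": "FSCONN_SOURCE"}},
}

print(datasets_to_switch(sample_definitions, "FSCONN_SOURCE"))  # → ['orders']
```

In a real project, the definitions mapping could be built from ds.get_definition() for each dataset before deciding whether to run the mass change.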

Reference documentation#

dataikuapi.dss.admin.DSSConnection(client, name)

A connection on the DSS instance.

dataikuapi.dss.admin.DSSConnectionInfo(data)

A class holding read-only information about a connection.

dataikuapi.dss.admin.DSSConnectionListItem(...)

An item in a list of connections.

dataikuapi.dss.admin.DSSConnectionSettings(...)

Settings of a DSS connection.

dataikuapi.dss.admin.DSSConnectionDetailsReadability(data)

Handle on settings for access to connection details.