The Dataiku Python APIs

Code-savvy users of the Dataiku platform can interact with it through a complete set of Python APIs split between two packages, dataiku and dataiku-api-client. While the two are often used together, they serve distinct purposes:

  • dataiku handles internal operations within the platform, such as data processing and machine learning tasks. It allows low-level interactions with core items such as datasets and saved models.

  • dataiku-api-client is a client for Dataiku’s public REST API, which is helpful in programmatically maintaining the platform or making it interact with other applications or systems.

Both packages can be used from the Dataiku platform’s web interface out of the box.
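As a rough illustration of the two packages working together inside the platform, the sketch below assumes it runs in a Dataiku notebook or recipe and that the current project contains a dataset named my_dataset (a hypothetical name). The function name inside_dss_example is ours and is not part of either package. In that setting, dataiku.api_client() returns a dataikuapi client for the current user, so no separate API key is needed.

```python
def inside_dss_example(dataset_name="my_dataset"):
    """Sketch only: meant to be called from code running inside Dataiku."""
    # Imported lazily because the dataiku package is provided by the platform.
    import dataiku

    # dataiku package: low-level access to the content of a dataset.
    # "my_dataset" is a hypothetical dataset name.
    df = dataiku.Dataset(dataset_name).get_dataframe()

    # dataiku.api_client() hands back a dataikuapi client that is
    # already authenticated as the current user.
    client = dataiku.api_client()
    project = client.get_default_project()

    # Return the dataframe dimensions and the key of the current project.
    return df.shape, project.project_key
```

Calling inside_dss_example() from a notebook cell returns the shape of the dataset and the key of the project the notebook belongs to.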

For a quick start, install the dataiku-api-client package (imported in Python as dataikuapi) directly from the PyPI repository:

pip install dataiku-api-client

Then, create an API key for accessing the instance (see Public API Keys). Once the API key is defined, you can connect to your instance and perform operations such as:

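import random

import dataikuapi

# Replace the URL and API key with the values for your own instance.
client = dataikuapi.DSSClient("https://dss.example/", "YOURAPIKEY")

# Pick one of the projects visible to this API key at random.
project_keys = client.list_project_keys()
project = client.get_project(random.choice(project_keys))

# Print the names of the Jupyter notebooks found in that project.
print(f"Known notebooks for the \"{project.get_summary().get('name')}\" project")
print([jupy.notebook_name for jupy in project.list_jupyter_notebooks()])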
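Hardcoding the API key is convenient for a first test, but you may prefer to keep it out of your source files. The following is a minimal sketch of one way to do that, assuming the instance URL and the key are stored in two environment variables. The variable names DSS_HOST and DSS_API_KEY and the helper names are hypothetical choices made for this example; they are not defined by dataiku-api-client.

```python
import os


def load_dss_settings(env=os.environ):
    """Return (host, api_key) read from environment variables.

    DSS_HOST and DSS_API_KEY are hypothetical names chosen for this sketch.
    """
    host = env.get("DSS_HOST", "")
    api_key = env.get("DSS_API_KEY", "")
    if not host or not api_key:
        raise RuntimeError("Set DSS_HOST and DSS_API_KEY before connecting")
    return host, api_key


def make_client(env=os.environ):
    """Build a client from the environment instead of hardcoded values."""
    # Imported lazily; dataikuapi is provided by the dataiku-api-client package.
    import dataikuapi

    host, api_key = load_dss_settings(env)
    return dataikuapi.DSSClient(host, api_key)
```

With both variables set, client = make_client() can stand in for the hardcoded DSSClient call.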

Please refer to this tutorial for a deeper insight into the Dataiku API usage.

Note

If you edit code outside the platform (e.g., using the VSCode or PyCharm editor plugins), don’t forget to install the Dataiku Python APIs locally.

  • If you are a beginner user looking to get more familiar with the basics of Dataiku’s public API, start with this tutorial.

  • Check out the API reference section for complete documentation of the dataiku and dataiku-api-client packages.
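When working outside the platform, as the note above describes, it can help to confirm that the packages are importable before running any code that depends on them. The sketch below uses only the Python standard library. The helper name missing_modules is hypothetical; dataiku and dataikuapi are the import names of the two packages.

```python
from importlib.util import find_spec


def missing_modules(names=("dataiku", "dataikuapi")):
    """Return the module names from `names` that cannot be found locally."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]
```

An empty list from missing_modules() suggests that both packages are available in the current Python environment.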

In the rest of this Developer Guide, for the sake of simplicity, we won’t distinguish between dataiku and dataiku-api-client unless absolutely needed: we will refer to the “Dataiku Python APIs” instead.