Retrieve AWS Credentials from Dataiku API

Prerequisites

  • A machine running Dataiku

Introduction

You may need to manage user permissions for security and organizational purposes. In Dataiku, you can retrieve AWS credentials through code with the Python API, and then use these credentials to manage AWS resources.

Connecting AWS and Dataiku

Before you can retrieve AWS credentials for Dataiku resources, you must:

  1. Ensure that the machine running Dataiku has an IAM instance profile.

  2. Enable an S3 connection with AssumeRole mode and set the Details readable by parameter. See this tutorial for more details.

  3. As the role to assume, use ${adminProperty:associatedRole}.

  4. In each user’s settings, in Admin Properties, add an entry named associatedRole whose value is the name of the IAM role to assume.

  5. Ensure that the IAM instance profile of the machine running DSS can assume the needed roles.
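
Step 4 can also be scripted rather than done by hand in the settings screen. The following is a minimal sketch, assuming an admin-level API client and assuming that the user settings object exposes an admin_properties dictionary and a save() method, as in the dataikuapi package. The function name set_associated_role, the login, and the role name are hypothetical.

```python
def set_associated_role(client, login, role_name):
    """Store the IAM role to assume in a user's admin properties.

    `client` is assumed to be an admin-level Dataiku API client, for
    example the object returned by dataiku.api_client().
    """
    settings = client.get_user(login).get_settings()
    # The key must match the name used in ${adminProperty:associatedRole}
    settings.admin_properties["associatedRole"] = role_name
    settings.save()
    return settings.admin_properties["associatedRole"]
```

A call could look like set_associated_role(dataiku.api_client(), "jdoe", "my-team-s3-role"), where both the login and the role name are placeholders.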

Retrieving

The following code allows you to retrieve AWS credentials with a Dataiku API call and the S3 connection previously enabled.

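import dataiku

# Name of the S3 connection enabled above; complete as needed
MY_S3_CONNECTION = ""

# Fetch the connection details, then extract the AWS credentials from them
conn_info = dataiku.api_client().get_connection(MY_S3_CONNECTION).get_info()
cred = conn_info.get_aws_credential()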

The output cred is a dict that contains the Security Token Service (STS) token.

Note

If you run your code on an API endpoint, for example on Kubernetes, you must initialize the Dataiku API client with dataikuapi.DSSClient() before following the steps described above.
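
For the case described in this note, a minimal sketch of building the client outside Dataiku follows. It assumes the dataikuapi package is installed and that the instance URL and an API key are supplied through the hypothetical environment variables DSS_HOST and DSS_API_KEY; these names are illustrative, not a Dataiku convention.

```python
import os


def dss_settings_from_env(env=os.environ):
    """Read the DSS URL and API key from (hypothetical) environment variables."""
    missing = [name for name in ("DSS_HOST", "DSS_API_KEY") if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return env["DSS_HOST"], env["DSS_API_KEY"]


def make_client(env=os.environ):
    """Build a Dataiku API client for code running outside Dataiku."""
    import dataikuapi  # imported lazily: only needed when a client is built

    host, api_key = dss_settings_from_env(env)
    return dataikuapi.DSSClient(host, api_key)
```

With a client built this way, make_client() would take the place of dataiku.api_client() in the retrieval code.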

The cred dictionary also contains the access key and the secret key, which allow you to manage AWS resources. You can do so with boto3, the AWS Software Development Kit (SDK) for Python.

As an example, the code below allows you to create a session and access your S3 buckets.

import boto3

# Create a session using the retrieved credentials
session = boto3.Session(
    aws_access_key_id=cred['accessKey'],
    aws_secret_access_key=cred['secretKey'],
    aws_session_token=cred['sessionToken']
)

# Use the session to interact with AWS services, such as S3 for example
s3 = session.client('s3')

# Example: List all buckets in S3
response = s3.list_buckets()
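
To make this example a little more defensive, the sketch below checks that the credentials dictionary holds the three keys used in the session call before mapping them to boto3.Session arguments, and extracts the bucket names from the list_buckets response. The helper names session_kwargs and bucket_names are hypothetical; the key names are those used in the example, and the Buckets and Name fields follow the boto3 list_buckets response format.

```python
REQUIRED_KEYS = ("accessKey", "secretKey", "sessionToken")


def session_kwargs(cred):
    """Map the Dataiku credential dict to boto3.Session keyword arguments."""
    missing = [key for key in REQUIRED_KEYS if not cred.get(key)]
    if missing:
        raise ValueError("credential dict is missing: " + ", ".join(missing))
    return {
        "aws_access_key_id": cred["accessKey"],
        "aws_secret_access_key": cred["secretKey"],
        "aws_session_token": cred["sessionToken"],
    }


def bucket_names(response):
    """Return the bucket names from an S3 list_buckets response."""
    return [bucket["Name"] for bucket in response.get("Buckets", [])]
```

The session could then be created with boto3.Session(**session_kwargs(cred)), and bucket_names(response) would give the list of bucket names.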

Wrapping up

In this tutorial, you retrieved credentials to manage AWS resources on a machine running Dataiku.