Script a blueprint migration#

Dataiku Govern requires a Python script to define the conditions for blueprint migrations. A blueprint migration script controls how an artifact’s information is mapped from one blueprint version to another. This page explains how to write one.

Note

The migration script cannot edit the structure of blueprint versions. It can only map the information of an artifact within the framework of existing blueprint versions.

See also

If you are only interested in applying blueprint migrations, check out How-to | Switch artifact templates (blueprint versions).

Pre-populated lines#

The pre-populated lines in the script establish the source and target blueprints.

from govern.core.migration_handler import get_migration_handler

handler = get_migration_handler()
### Get the artifact to migrate
target_artifact = handler.target_artifact
### Get the source enriched artifact
source_enriched_artifact = handler.source_enriched_artifact
### Get the target blueprint version definition
target_blueprint_version = handler.target_blueprint_version

The pre-populated script also contains a few commented-out lines that can help with tasks such as renaming the target artifact or creating default target fields.

Note

source_enriched_artifact contains multiple objects:

  • the source blueprint (source_enriched_artifact.blueprint)

  • the source blueprint version (source_enriched_artifact.blueprint_version)

  • the source artifact (source_enriched_artifact.artifact)

Of all the objects exposed by the handler, target_artifact is the one the script mainly needs to manipulate.

Important

The target_artifact object in the migration script is a copy of the incoming artifact that will be migrated, so it starts out with the field values of the source artifact.

Managing fields#

When writing a migration script, one main task is to migrate the fields. For each field being migrated, there are four possible cases to consider:

| Case   | Old blueprint version | New blueprint version                  |
|--------|-----------------------|----------------------------------------|
| Case 1 | Field exists          | Field exists with the same definition  |
| Case 2 | Field exists          | Field exists with different definition |
| Case 3 | Field exists          | Field doesn’t exist                    |
| Case 4 | Field doesn’t exist   | Field exists                           |

Note

In the examples below, fields is a Python dictionary that is accessed with target_artifact.fields.

Case 1#

The field already exists in the old blueprint version (BPV) and still exists in the new BPV with the same definition.

→ Nothing to do in the migration script, because the copied value is valid.

Case 2#

The field already exists in the old BPV and still exists in the new BPV, but with a different definition.

→ The field must be overridden in the target_artifact to match the new definition because the copied value won’t be valid.

Example: changed field2 from a text to a list of text#

Error when applying the migration:

Error executing migration mp.mig1: The migrated artifact generated by the migration is not valid. Field field2 is a list. , caused by: ValidationException: Field field2 is a list.

Possible solution in migration script:

# Wrap the existing value to become the first element of the list
fields['field2'] = [fields['field2']]
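The one-line wrap above assumes field2 always holds a single text value. A slightly more defensive variant is sketched below on a plain dictionary that stands in for target_artifact.fields. The helper name to_text_list is hypothetical, and whether an empty list is acceptable for the new definition is an assumption that depends on how the field is configured.

```python
def to_text_list(value):
    """Convert a former single-text value into a list of text."""
    if value is None or value == "":
        return []        # nothing to carry over
    if isinstance(value, list):
        return value     # already a list, leave it unchanged
    return [value]       # wrap the existing value as the first element


# Plain dict standing in for target_artifact.fields (illustration only)
fields = {"field2": "hello"}
fields["field2"] = to_text_list(fields.get("field2"))
```

Using fields.get avoids a KeyError when the source artifact never had a value for field2.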

Case 3#

The field exists in the old BPV and is removed in the new BPV.

→ The field must be removed from target_artifact.

Example: removed field3 in the target BPV#

Error when applying the migration:

Error executing migration mp.mig1: The migrated artifact generated by the migration is not valid. Some fields of Artifact were not found in its blueprint version reference. Field names: field3, caused by: ValidationException: Some fields of Artifact were not found in its blueprint version reference. Field names: field3

Possible solution in migration script:

# Remove the field safely (does nothing if the field doesn't exist)
fields.pop('field3', None)
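When several fields were removed in the target BPV, the same idea can be applied in a loop. The sketch below uses a plain dictionary in place of target_artifact.fields. The helper drop_fields and the field names field1 and field5 are hypothetical.

```python
def drop_fields(fields, names):
    """Remove each named field if present and return the names actually removed."""
    removed = []
    for name in names:
        if name in fields:
            del fields[name]
            removed.append(name)
    return removed


# Plain dict standing in for target_artifact.fields (illustration only)
fields = {"field1": "kept", "field3": "obsolete"}
removed = drop_fields(fields, ["field3", "field5"])  # field5 is absent and is skipped
```

Returning the removed names makes it easy to log what the script changed while testing a migration.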

Case 4#

The field does not exist in the old BPV and is created in the new BPV.

→ The field must be set in the target_artifact if it is marked as mandatory; otherwise, setting it is optional.

Example: added a mandatory text field4 in the target BPV#

Error when applying the migration:

Error executing migration mp.mig1: The migrated artifact generated by the migration is not valid. Field ID ‘field4’ is required, caused by: ValidationException: Field ID ‘field4’ is required

Possible solution in migration script:

# Set the new field to a static value
fields['field4'] = 'new value'
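If the same script might run against artifacts that already carry a value for the new field, it may be safer to assign the default only when the field is empty. This is a minimal sketch on a plain dictionary standing in for target_artifact.fields. The helper set_if_empty is hypothetical, and treating None and the empty string as "empty" is an assumption.

```python
def set_if_empty(fields, name, default):
    """Assign default only when the field is absent, None, or an empty string."""
    if fields.get(name) in (None, ""):
        fields[name] = default


# Plain dict standing in for target_artifact.fields (illustration only)
fields = {"field1": "existing"}
set_if_empty(fields, "field4", "new value")  # field4 is absent, so it is set
set_if_empty(fields, "field1", "ignored")    # field1 has a value, so it is kept
```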

To sum up:#

For the migration to succeed, every field must be handled correctly according to whichever of the four cases applies to it.
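As an illustration, the four cases can be combined into one function. The sketch below runs on a plain dictionary so that it can be tried outside Dataiku Govern. In a real migration script the same statements would act on target_artifact.fields. The names field1, expected_ids, and migrate_fields are hypothetical, and the list of expected field IDs is written by hand here rather than read from the blueprint version.

```python
def migrate_fields(fields):
    """Apply the four cases to a dict of field values, in place."""
    # Case 1: "field1" (hypothetical) keeps the same definition, so it is left alone.

    # Case 2: "field2" changed from a text to a list of text.
    value = fields.get("field2")
    if not isinstance(value, list):
        fields["field2"] = [] if value is None else [value]

    # Case 3: "field3" no longer exists in the target blueprint version.
    fields.pop("field3", None)

    # Case 4: "field4" is new and mandatory; keep any value already present.
    fields.setdefault("field4", "new value")
    return fields


# Plain dict standing in for target_artifact.fields (illustration only)
fields = {"field1": "a", "field2": "b", "field3": "c"}
migrate_fields(fields)

# Hand-written, hypothetical list of field IDs defined in the target BPV
expected_ids = {"field1", "field2", "field4"}
leftover = set(fields) - expected_ids  # fields unknown to the target BPV
missing = expected_ids - set(fields)   # target fields that have no value yet
```

A non-empty leftover set corresponds to the "not found in its blueprint version reference" error shown in Case 3. A non-empty missing set is only a problem for mandatory fields.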

Managing workflow steps#

To change the current step:

# Change the current step to be 'new_step'
target_artifact.json.get("status", {})["stepId"] = "new_step"
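One behavior of this pattern is worth knowing. Assuming target_artifact.json behaves like a regular Python dictionary, get("status", {}) returns a throwaway dictionary when there is no "status" entry, so the assignment is silently lost. The sketch below demonstrates this on plain dictionaries and shows setdefault as an alternative that creates the entry. Whether creating a "status" entry is valid for a given artifact is not something this page states, so treat that part as an assumption.

```python
# Artifact JSON that already has a status: the step is updated in place
with_status = {"status": {"stepId": "old_step"}}
with_status.get("status", {})["stepId"] = "new_step"

# Artifact JSON without a status: get() returns a temporary dict,
# so the assignment does not reach the artifact
without_status = {"name": "demo"}
without_status.get("status", {})["stepId"] = "new_step"

# setdefault() stores the new dict in the parent before assigning to it
created = {"name": "demo"}
created.setdefault("status", {})["stepId"] = "new_step"
```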

TLDR#

Here is a short list of potential manipulations.

| Action                                  | Code                                                          |
|-----------------------------------------|---------------------------------------------------------------|
| Define a default value for a new field  | fields["new_field"] = "value"                                 |
| Remove values from a deleted field      | fields.pop("old_field", None)                                 |
| Move the data from one field to another | fields["new_field"] = fields["old_field"]                     |
| Change the current workflow step        | target_artifact.json.get("status", {})["stepId"] = "new_step" |

If you want a new field to appear empty in the target blueprint, you don’t need to define anything in the blueprint migration script. There are no values to migrate.
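The "move the data" row in the table above copies the value but leaves old_field in place. If old_field no longer exists in the target blueprint version, it also has to be removed, as described in Case 3. A minimal sketch of a true move follows, using a plain dictionary in place of target_artifact.fields. The helper move_field is hypothetical.

```python
def move_field(fields, old_name, new_name):
    """Move a value to a new field ID and remove the old one, if it is present."""
    if old_name in fields:
        fields[new_name] = fields.pop(old_name)


# Plain dict standing in for target_artifact.fields (illustration only)
fields = {"old_field": "value"}
move_field(fields, "old_field", "new_field")
move_field(fields, "absent_field", "other_field")  # no effect: source field is absent
```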