Editing & Debugging Code with VS Code#
In this article, we will cover the main features and use cases of VSCode for Code Studio. If you want to know what can be edited in Code Studio and thus within VS Code, please refer to this page.
Caution
VSCode for Code Studio provides a richer editing environment, but doesn’t act as a full replacement for the Dataiku user interface. Several important operations will still be done visually in Dataiku. For example, when you execute a code recipe from Code Studio, it doesn’t trigger a Dataiku job in your project.
Prerequisites#
A Dataiku 11+ instance.
Administrator privileges for your user profile.
A Kubernetes cluster is configured. For details, visit Elastic AI Computation.
A base image is built. Typically, this is built using a command such as
./bin/dssadmin build-base-image --type container-exec. For details, visit Build the Base Image.
This tutorial was written with Python 3.9, and the following package versions in a Dataiku Code Environment (to be added to the Code Studio template):
black
pylint
Creating the project#
From the Dataiku homepage, click +New Project > Learning projects > Developer > My First Code Studio.
From the project homepage, click Go to Flow.
Note
You can also download the starter project and import it as a zip file.
Use case summary#
We’ll work with a project that contains a simple pipeline: one input dataset, two Python recipes, and two output datasets. Both recipes generate errors when run. Our goal is to debug these recipes in an IDE, and we’ll do so within Dataiku using Code Studios.
See also
There are other ways to debug code recipes with Dataiku. You may also consider using various IDE extensions.
The first thing we’ll need is a Code Studio template. If you don’t have a Code Studio template, please refer to this tutorial.
Editing a code recipe within VS Code#
In a Dataiku project, you can start a Code Studio instance from a code recipe with the “Edit in Code Studio” button on the top right of each code recipe.

A pop-up will then ask you to select the Code Studio instance to use.
After editing your recipe in VSCode, click “Sync files with DSS” to update the file content on the Dataiku server.
Remember, changes in a Code Studio running in a Kubernetes pod are not saved unless you click “Sync files with DSS”. For more on Code Studio, see Technical Details in Dataiku’s reference documentation.

From a Code Studio instance, several directories are available, as you can see in the VSCode file explorer on the left.

Debugging within VS Code#
Let’s inspect and debug the compute_contacts_1 recipe in Code Studio.
From the recipe, select Edit in Code Studio.
In Code Studios, select VS Code.
Dataiku displays the VS Code Workspace Explorer ready to debug the recipe.
Tip
To go back and forth between the Flow and your Code Studio, you can keep the VS Code Workspace Explorer open in its own browser tab.
We are interested in working with the Python recipe, compute_contacts_1. To find it:
Open the Recipes folder (recipes).
Select compute_contacts_1.py.
Run the code to reproduce the errors you saw if you already ran the recipe from the Flow.
Note
VSCode might warn you that the Python Interpreter is not set up.
Run the command Python: Select Interpreter (via the command palette)
and choose a Python interpreter (located in dataiku-python-code-envs).
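If it is unclear which interpreter VS Code ended up using, a quick check from a scratch file may help. The sketch below relies only on the standard library; `interpreter_summary` is a hypothetical helper name, not part of Dataiku or VS Code.

```python
import platform
import sys


def interpreter_summary():
    """Return the path and version of the interpreter running this code."""
    return {
        "executable": sys.executable,
        "version": platform.python_version(),
    }


if __name__ == "__main__":
    info = interpreter_summary()
    print(f"Interpreter: {info['executable']}")
    print(f"Python version: {info['version']}")
```

If the printed path does not sit under dataiku-python-code-envs, the selected interpreter is probably not the one from the code environment.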
Running the recipe in VS Code displays the same error we saw in the Flow. This lets us know the Code Studio is configured correctly. By looking at the result of this execution, you will see (in the Traceback) that the error comes from line 19.
You can work with the code recipe within your own IDE, all from Dataiku. However, you are now working in VS Code, rather than in the Dataiku Python recipe editor. If you make any changes to the code from the IDE, you’ll need to sync the changes back to Dataiku. Since we suspect the error is occurring before line 19, let’s set a breakpoint and use the VS Code debugger.
Click in the far left margin before line 19 to set a breakpoint.
Select Debug Python File from the dropdown at the top right, or from the menu > Run > Start Debugging.

VS Code executes the code and pauses at the breakpoint. To debug the code, you can utilise navigation commands and shortcuts in the IDE. More specifically, you can inspect the variables.
Expand Variables > Locals in the debugger explorer, in the left panel.
Upon inspection, you can see that the variable value is fetched from the project variables. If you want to see the definition of the project variables, select … > Variables from the top navigation bar of the project.
You can change the value of this variable directly in VS Code, and then click the Continue button to see if it resolves the problem.
Now, you know that the error originates from the definition of the variable value.
Edit the code, replacing my_var with my_var2 on line 16.
value = dataiku.get_custom_variables()["my_var2"]
Run the code again.
Now that the code executes without error, you can sync the changes back to the recipe in the Flow.
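In this walkthrough the variable is read on line 16 but the failure only shows up a few lines later. As a minimal sketch of one way to surface such problems at the point of definition, a recipe could validate the variable as soon as it is read. `get_required_variable` is a hypothetical helper. The plain dictionary only stands in for the result of dataiku.get_custom_variables() so that the example runs outside Dataiku, and its values, along with the assumption that an integer is expected, are made up for illustration.

```python
def get_required_variable(variables, name, cast=str):
    """Fetch `name` from `variables` and convert it with `cast`, failing early."""
    if name not in variables:
        available = ", ".join(sorted(variables)) or "(none)"
        raise KeyError(f"Project variable {name!r} not found. Available: {available}")
    raw = variables[name]
    try:
        return cast(raw)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"Project variable {name!r} has unusable value {raw!r}") from exc


# Stand-in for dataiku.get_custom_variables(); names and values are hypothetical.
project_variables = {"my_var": "not-a-number", "my_var2": "42"}

value = get_required_variable(project_variables, "my_var2", cast=int)
print(value)
```

With a check like this, a bad variable is reported where it is read, which can shorten the search in the debugger.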
Syncing the changes back to Dataiku#
When working in Code Studios, you are editing a local copy of your code separate from the version in Dataiku. After making changes, you must click “Sync files with DSS” in VS Code to update your project in Dataiku. If you return to the Flow without syncing, the changes will not appear in the Flow or in Dataiku’s interface. Always sync before switching back to Dataiku to ensure your edits are saved.
In VS Code, select Sync Files With DSS in the upper right.
Once the sync is complete, VS Code displays a green checkbox.
Return to the Flow.
Open the compute_contacts_1 Python recipe.
You can see that the recipe is updated and that "my_var" is now "my_var2".
Run the recipe.
The recipe runs without warnings.
Python recipe successfully edited and synchronized back to Dataiku.
Editing a project library file#
Project libraries are a great way to organise your code in a centralised location that can be reused in any project on the instance. From Dataiku, you can also connect to a remote Git repository to manage your code. For more details, visit Reusing Python Code.
In this section, you’ll practice editing a project library in Code Studio. You’ll be working with the second Python recipe in the project.
Running the Python recipe#
Return to the Flow.
Run the Python recipe that generates contacts_2.
This recipe is performing a simple transformation using a custom Python package, my_package.
Custom Python package in the project library.
The error, “list index out of range”, is raised at line 21 of the code.
row['new_feat'] = extract_domain(row['Email'])
You need to investigate this error to learn more. One way to do this is to use the logs, but you can also inspect and debug this error in Code Studio.
Debugging with VS Code#
Let’s see if you can find out more by using the VS Code debugger.
From the recipe that generates contacts_2, select Edit in Code Studio.
In Code Studios, select VS Code.
Dataiku displays the VS Code Workspace Explorer ready to debug the recipe.
The project-lib-versioned folder contains the Python package, my_package.
In addition, the recipes folder contains the recipes.
Let’s run the recipe in the debugger.
Open the Recipes folder (recipes).
Select compute_contacts_2.py.
Select Debug Python File.
Running the recipe in VS Code displays the same error we saw in the Flow.
Using the same technique as before, you can spot the error and fix it.
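For example, you can use the code below for the extract_domain function in the project library.
import re

def extract_domain(name):
    # Split on "." or ","; the raw string avoids an invalid-escape warning.
    split_name = re.split(r"\.|,", name)
    # Guard the index so a value without a separator no longer raises IndexError.
    if len(split_name) > 1:
        return split_name[1]
    return '(unknown)'
Once you have chosen a fix, the code should run without error.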
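To see why the length check matters, the sketch below runs the suggested function on a few strings and contrasts it with an unguarded index, which produces the same “list index out of range” message seen in the recipe. The sample values are hypothetical and are not taken from the project dataset.

```python
import re


def extract_domain(name):
    # Same logic as the suggested fix: split on "." or "," and guard the index.
    split_name = re.split(r"\.|,", name)
    if len(split_name) > 1:
        return split_name[1]
    return "(unknown)"


# Made-up inputs: one with a dot, one with no separator, one with a comma.
samples = ["jane@example.com", "no-separator", "a,b"]
guarded = [extract_domain(s) for s in samples]

# Without the guard, indexing the single-element list fails.
try:
    re.split(r"\.|,", "no-separator")[1]
    unguarded_error = None
except IndexError as exc:
    unguarded_error = str(exc)

print(guarded)          # ['com', '(unknown)', 'b']
print(unguarded_error)  # list index out of range
```

Note that this splitting rule returns the text after the first separator, so the pattern may need adapting if a different part of the value is wanted.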
Syncing the changes back to Dataiku#
Let’s sync the changes back to the recipe in the Flow.
In VS Code, select Sync Files With DSS in the upper right.
Dataiku synchronises both the recipe and the project library file back to the project. Once the sync is complete, VS Code displays a green checkbox. Let’s verify that the project library file has been updated.
Run the recipe that generates contacts_2 to see that the output dataset is built without exceptions.
Using Code Studio to edit code in a Git reference#
If you have imported code from Git in Dataiku Project Libraries, you will be able to edit this code within Code Studio. Committing the changes made in Code Studio to the Git reference is a 2-step process:
Edit the files in the project-lib-versioned folder in Code Studio and click Sync files with DSS.
Return to Dataiku Project Libraries and click Commit and push all….

Defining Custom User Settings#
By default, VSCode for Code Studio user settings are stored under the user-versioned directory.
For example, if you change the color theme,
it will be saved in /home/dataiku/workspace/user-versioned/settings/code-server/User/settings.json.
If you click Sync Files with DSS, the VSCode settings.json file will be stored in your user profile
and be applied across all your Code Studio sessions.
You can find all the files under the user-versioned directory in your Dataiku User Profile > My Files tab.
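As a small sketch, the saved settings can be inspected from Python inside a running Code Studio. `load_settings` is a hypothetical helper, the default path is the one quoted above, and the code assumes the file is plain JSON without comments.

```python
import json
from pathlib import Path

# Path quoted in the text above; it only exists inside a running Code Studio.
SETTINGS_PATH = Path(
    "/home/dataiku/workspace/user-versioned/settings/code-server/User/settings.json"
)


def load_settings(path=SETTINGS_PATH):
    """Return the settings as a dict, or an empty dict if the file is absent."""
    path = Path(path)
    if not path.exists():
        return {}
    # Assumption: plain JSON. A file containing comments would need another parser.
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    for key, val in sorted(load_settings().items()):
        print(f"{key} = {val!r}")
```

Outside a Code Studio the file is absent, so the helper simply returns an empty dictionary.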
Wrapping up#
Congratulations! You should now have a functional VSCode for Code Studio setup that lets you edit your code in Dataiku as if you were working in your local VSCode.
