This page briefly explains where the code is written and executed when working on a Dataiku DSS instance.
Tools for editing code#
This section covers the options available for editing code in Dataiku DSS, depending on your use case.
If you are looking for a way to interactively explore your data and experiment with small pieces of code, then notebooks are the way to go. They let you execute your code in consecutive blocks called cells and visualize each cell's output.
Dataiku DSS can spawn complete code notebook environments server-side:
SQL notebooks to run interactive queries on your SQL databases
Code notebooks to execute Python or R code in a simple-yet-effective interface based on Jupyter notebooks
All these solutions are natively embedded in the Dataiku DSS web interface, which makes them easy to navigate and makes it easy to share your work with other users on the same instance. Additionally, Python/R notebook sources (.ipynb files) can be synchronized from/to remote Git repositories.
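To make the notion of cells more concrete, here is a minimal sketch that relies only on the fact that an .ipynb file is a JSON document containing a list of cells. The notebook content below is hand-written for illustration and is not produced by Dataiku DSS; the helper name `code_cells` is hypothetical.

```python
import json

# Illustrative notebook in the nbformat 4 JSON layout (hand-written sample,
# not exported from a real Dataiku DSS instance).
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Exploration"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 2\n", "x * 21"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "print('done')"},
    ],
})


def code_cells(raw):
    """Return the source text of each code cell, in notebook order."""
    notebook = json.loads(raw)
    sources = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue
        source = cell["source"]
        # The format allows "source" to be a string or a list of lines.
        sources.append(source if isinstance(source, str) else "".join(source))
    return sources


cells = code_cells(notebook_json)
for index, source in enumerate(cells, start=1):
    print(f"--- code cell {index} ---")
    print(source)
```

Because the notebook source is plain JSON text, it can be versioned like any other file, which is what makes synchronization with a Git repository practical.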
If you are already using an IDE like Visual Studio Code or PyCharm on your client machine, by installing the relevant extensions/plugins you will be able to connect it to your Dataiku DSS instance and edit source code directly from there.
If you prefer editing your source code remotely, Dataiku DSS offers the possibility to embed a Visual Studio Code editor directly in its interface. This option is based on the platform’s “Code Studios” feature and does not require any setup on your client machine since it is fully managed by the platform.
VSCode/IntelliJ extension for Dataiku DSS: the Visual Studio marketplace page to install and configure the extension
PyCharm plugin for Dataiku DSS: the JetBrains marketplace page to install and configure the plugin
Writing code often implies working with third-party packages that you need to install separately. For example, in the case of Python you would use virtual environments to install your dependencies in isolation and then import them.
In Dataiku DSS, the equivalent of the virtual environment concept is called a “code environment”. It allows you to choose which Python version and which custom packages you want to run your code with. Once the code environment is set up, its dependencies can be imported from any piece of code run by Dataiku DSS.
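For readers less familiar with the plain-Python concept that code environments build on, the following sketch creates a virtual environment with the standard library `venv` module. It only illustrates the general idea of an isolated interpreter setup; it is not how Dataiku DSS creates code environments, and the name `demo-env` and the helper `create_env` are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent: Path, name: str) -> Path:
    """Create a bare virtual environment (without pip) and return its folder."""
    env_dir = parent / name
    # with_pip=False keeps the example fast and avoids bootstrapping pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(Path(tmp), "demo-env")
    # pyvenv.cfg records which base interpreter the environment relies on.
    cfg_text = (env_dir / "pyvenv.cfg").read_text()

print(cfg_text)
```

The `pyvenv.cfg` file ties the environment to one specific Python version, which mirrors the choice of Python version made when setting up a code environment.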
Bringing an external code base#
As a new Dataiku DSS user, you have probably already worked on an existing code base that lives independently from the instance. You can make the items of this code base directly importable in Dataiku DSS by using a special feature of project libraries called “Git references”. Provided that the external code base is hosted on a remote Git repository, this feature allows you to pull a specific branch of that repository into Dataiku DSS, where it is materialized as a project library.
By doing so, you can have your Dataiku DSS workflows operate hand-in-hand with any external code repository.
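The sketch below mimics, in plain Python, what "directly importable" means in practice: once a folder containing external code is on the Python path, its modules can be imported by name. It is a local simulation only and assumes nothing about how Dataiku DSS lays out project libraries; the package `shared_utils`, the module `cleaning` and the function `normalize` are hypothetical stand-ins for code pulled from a Git repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a library folder populated from a Git branch.
    lib_root = Path(tmp) / "python"
    package_dir = lib_root / "shared_utils"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    (package_dir / "cleaning.py").write_text(
        "def normalize(name):\n"
        "    return name.strip().lower()\n"
    )

    # Putting the folder on the path is what makes the code importable.
    sys.path.insert(0, str(lib_root))
    try:
        importlib.invalidate_caches()
        cleaning = importlib.import_module("shared_utils.cleaning")
        result = cleaning.normalize("  Customer_ID ")
    finally:
        sys.path.remove(str(lib_root))

print(result)
```

In this simulation the calling code never copies the external functions; it imports them, so updating the source folder is enough to change the behavior of every caller.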