Databricks Connect

For usage information and examples, see Databricks Connect.

class dataiku.dbconnect.DkuDBConnect
create_session(connection_name, project_key=None)

Create a new session configured to read from the supplied DSS connection.

get_dataframe(dataset, session=None)

Return a DataFrame configured to read the table underlying the specified dataset.

get_session(connection_name, project_key=None)

Return a session configured to read from the supplied DSS connection.
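
As a minimal sketch of how get_session and get_dataframe might be combined, the helper below reads a dataset either with the default session or with a session bound to a named connection. The helper name load_dataframe and the dataset name in the comments are hypothetical, and the no-argument DkuDBConnect() constructor shown in the commented usage is an assumption that this reference does not state.

```python
def load_dataframe(dbc, dataset, connection_name=None):
    """Return a DataFrame for `dataset` using a DkuDBConnect-like object.

    `dbc` is expected to expose get_session() and get_dataframe() with the
    signatures documented above. `load_dataframe` is a hypothetical helper.
    """
    if connection_name is None:
        # Rely on the documented default, session=None.
        return dbc.get_dataframe(dataset)
    # Otherwise obtain a session for the named DSS connection and pass it on.
    session = dbc.get_session(connection_name)
    return dbc.get_dataframe(dataset, session=session)


# Assumed usage inside DSS (not executed here):
#   import dataiku
#   from dataiku.dbconnect import DkuDBConnect
#   dbc = DkuDBConnect()
#   df = load_dataframe(dbc, dataiku.Dataset("my_input"))
```

Passing the DkuDBConnect object in as an argument keeps the helper easy to exercise outside DSS with a stand-in object.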

write_dataframe(dataset, df, infer_schema=False, force_direct_write=False, dropAndCreate=False)

Write the dataset (or its target partition, if applicable) from a single dataframe.

This variant only edits the schema if infer_schema is True; otherwise, you must take care to write only dataframes that have a compatible schema. See also write_with_schema.

Parameters:
  • df – input dataframe.

  • dataset – Output dataset to write.

  • infer_schema – infer the schema from the dataframe.

  • force_direct_write – Force writing the dataframe into the dataset using the direct API, even if the dataframe and the dataset do not come from the same DSS connection.

  • dropAndCreate – if infer_schema and this parameter are both set to True, clear and recreate the dataset structure.
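
The interaction between infer_schema and dropAndCreate described above can be sketched as follows. The helper write_output and its schema_changed flag are hypothetical names introduced for illustration; only the write_dataframe call and its keyword arguments come from this reference.

```python
def write_output(dbc, output_dataset, df, schema_changed=False):
    """Write `df` to `output_dataset`, touching the schema only on request.

    `dbc` is a DkuDBConnect-like object. `write_output` and
    `schema_changed` are hypothetical names for this sketch.
    """
    # dropAndCreate is documented to act only when infer_schema is also
    # True, so both flags are driven by the same switch here.
    dbc.write_dataframe(
        output_dataset,
        df,
        infer_schema=schema_changed,
        dropAndCreate=schema_changed,
    )
```

With schema_changed left at False, the call relies on the dataframe already matching the output dataset's schema.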

write_with_schema(dataset, df, force_direct_write=False, dropAndCreate=False)

Write the dataset (or its target partition, if applicable) from a single dataframe.

This variant replaces the schema of the output dataset with the schema of the dataframe.

Parameters:
  • df – input dataframe.

  • dataset – Output dataset to write.

  • force_direct_write – Force writing the dataframe into the dataset using the direct API, even if the dataframe and the dataset do not come from the same DSS connection.

  • dropAndCreate – drop and recreate the dataset.
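
A read, transform, and write round trip using write_with_schema might look like the sketch below. The helper copy_with_schema, its transform argument, and the dataset names in the comments are hypothetical; the no-argument DkuDBConnect() constructor in the commented usage is likewise an assumption.

```python
def copy_with_schema(dbc, input_dataset, output_dataset, transform=None):
    """Read a dataset, optionally transform it, and write it with its schema.

    `dbc` is a DkuDBConnect-like object. `copy_with_schema` and
    `transform` are hypothetical names for this sketch.
    """
    df = dbc.get_dataframe(input_dataset)
    if transform is not None:
        # The transform may add, drop, or rename columns freely, because
        # write_with_schema replaces the output schema with the dataframe's.
        df = transform(df)
    dbc.write_with_schema(output_dataset, df)
    return df


# Assumed usage inside DSS (not executed here):
#   import dataiku
#   from dataiku.dbconnect import DkuDBConnect
#   dbc = DkuDBConnect()
#   copy_with_schema(dbc, dataiku.Dataset("my_input"), dataiku.Dataset("my_output"))
```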