API for plugin formats

class dataiku.customformat.Formatter(config, plugin_config)

Custom formatter

get_output_formatter(stream, schema)

Return an OutputFormatter for this format

Parameters:
  • stream – the stream to write the formatted data to

  • schema – the schema of the rows that will be formatted (never None)

get_format_extractor(stream, schema=None)

Return a FormatExtractor for this format

Parameters:
  • stream – the stream to read the formatted data from

  • schema – the schema of the rows that will be extracted. None when the extractor is used to detect the format.
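
As a rough illustration of how the two factory methods might be implemented, here is a minimal sketch. It does not import dataiku: `Formatter` below is a stand-in that is only assumed to store its constructor arguments, and `LineWriter`, `LineReader`, `PipeFormatter` and the `"separator"` config key are all hypothetical names invented for this example. In a real plugin the returned objects would subclass OutputFormatter and FormatExtractor.

```python
import io


class Formatter:
    """Stand-in for dataiku.customformat.Formatter (assumption: it keeps its arguments)."""

    def __init__(self, config, plugin_config):
        self.config = config
        self.plugin_config = plugin_config


class LineWriter:
    """Hypothetical writer; a real one would subclass OutputFormatter."""

    def __init__(self, stream, schema, separator):
        self.stream = stream
        self.schema = schema
        self.separator = separator


class LineReader:
    """Hypothetical reader; a real one would subclass FormatExtractor."""

    def __init__(self, stream, schema, separator):
        self.stream = stream
        self.schema = schema  # None when the format is being detected
        self.separator = separator


class PipeFormatter(Formatter):
    """Hypothetical format whose separator comes from the component config."""

    def _separator(self):
        # "separator" is an invented config key, used here for illustration only.
        return self.config.get("separator", "|")

    def get_output_formatter(self, stream, schema):
        return LineWriter(stream, schema, self._separator())

    def get_format_extractor(self, stream, schema=None):
        return LineReader(stream, schema, self._separator())


formatter = PipeFormatter({"separator": ";"}, {})
columns = [{"name": "id", "type": "string"}]
writer = formatter.get_output_formatter(io.BytesIO(), columns)
# Schema omitted: this mirrors the format-detection case described above.
reader = formatter.get_format_extractor(io.BytesIO(b""))
```

Keeping the configuration lookup in the Formatter means the writer and the reader receive the same settings, which is one simple way to keep both directions of the format consistent.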

class dataiku.customformat.OutputFormatter(stream)

Writes a stream of rows to an output stream in this format. The methods are called in the following order:

  • write_header()

  • write_row(row_1) …

  • write_row(row_N)

  • write_footer()

write_header()

Write the header of the format (if any)

write_row(row)

Write a row in the format

Parameters:

row – array of strings, with one value per column in the schema

write_footer()

Write the footer of the format (if any)
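
The call sequence described for this class (header, then each row, then footer) can be sketched as follows. This is only an illustration under stated assumptions: `OutputFormatter` here is a stand-in for the real base class and is assumed to keep the stream as `self.stream`, the stream is assumed to be a binary file-like object, and `PipeOutputFormatter` with its `column_names` argument is a hypothetical subclass.

```python
import io


class OutputFormatter:
    """Stand-in for dataiku.customformat.OutputFormatter (assumption: it keeps the stream)."""

    def __init__(self, stream):
        self.stream = stream


class PipeOutputFormatter(OutputFormatter):
    """Hypothetical formatter writing one pipe-separated line per row."""

    def __init__(self, stream, column_names):
        OutputFormatter.__init__(self, stream)
        self.column_names = column_names
        self.row_count = 0

    def _write_line(self, values):
        # Assumes a binary stream, so the text is encoded before writing.
        self.stream.write(("|".join(values) + "\n").encode("utf-8"))

    def write_header(self):
        self._write_line(self.column_names)

    def write_row(self, row):
        # row is a list of strings, one value per column.
        self._write_line(row)
        self.row_count += 1

    def write_footer(self):
        self._write_line(["# rows: %d" % self.row_count])


# Drive the formatter in the documented order.
buffer = io.BytesIO()
out = PipeOutputFormatter(buffer, ["id", "name"])
out.write_header()
for row in [["1", "alice"], ["2", "bob"]]:
    out.write_row(row)
out.write_footer()
result = buffer.getvalue().decode("utf-8")
```

A format without a header or footer would simply leave the corresponding method empty.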

class dataiku.customformat.FormatExtractor(stream)

Reads a formatted stream and turns it into a stream of rows

read_schema()

Get the schema of the data in the stream, if the schema can be known upfront.

Returns:

the list of columns as [{'name': 'col1', 'type': 'col1type'}, …]

read_row()

Read one row from the formatted stream

Returns:

a dict mapping each column name to its value, or None when there are no more rows to read
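
To make the reading side concrete, here is a minimal sketch of an extractor for a pipe-separated format whose first line holds the column names. The assumptions are the same kind as elsewhere in such sketches: `FormatExtractor` is a stand-in for the real base class and is assumed to keep the stream as `self.stream`, the stream is assumed to be binary, `PipeFormatExtractor` is a hypothetical subclass, and every column is given the placeholder type "string".

```python
import io


class FormatExtractor:
    """Stand-in for dataiku.customformat.FormatExtractor (assumption: it keeps the stream)."""

    def __init__(self, stream):
        self.stream = stream


class PipeFormatExtractor(FormatExtractor):
    """Hypothetical extractor: first line is the header, later lines are rows."""

    def __init__(self, stream):
        FormatExtractor.__init__(self, stream)
        self.columns = None  # filled in lazily from the header line

    def _next_values(self):
        line = self.stream.readline()
        if not line:
            return None  # end of stream
        return line.decode("utf-8").rstrip("\n").split("|")

    def read_schema(self):
        if self.columns is None:
            self.columns = self._next_values() or []
        return [{"name": name, "type": "string"} for name in self.columns]

    def read_row(self):
        if self.columns is None:
            self.read_schema()  # make sure the header line is consumed first
        values = self._next_values()
        if values is None:
            return None
        return dict(zip(self.columns, values))


extractor = PipeFormatExtractor(io.BytesIO(b"id|name\n1|alice\n2|bob\n"))
schema = extractor.read_schema()

# Read until read_row() signals the end of the data with None.
rows = []
row = extractor.read_row()
while row is not None:
    rows.append(row)
    row = extractor.read_row()
```

Reading the header lazily means read_row() still works if read_schema() was never called, which this sketch treats as a possibility rather than a documented guarantee.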