API for plugin formats
- class dataiku.customformat.Formatter(config, plugin_config)
Custom formatter
- get_output_formatter(stream, schema)
Return an OutputFormatter for this format
- Parameters:
stream – the stream to write the formatted data to
schema – the schema of the rows that will be formatted (never None)
- get_format_extractor(stream, schema=None)
Return a FormatExtractor for this format
- Parameters:
stream – the stream to read the formatted data from
schema – the schema of the rows that will be extracted. None when the extractor is used to detect the format.
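As a minimal sketch of how the two factory methods above might be wired together, the example below defines a hypothetical `LineFormatter`. The `separator` setting, the two helper classes, and the assumption that `config` behaves like a dict are illustrative and not part of the documented API. The fallback base classes only mirror the documented constructor signatures so that the sketch also runs outside DSS.

```python
import io

try:
    from dataiku.customformat import Formatter, OutputFormatter, FormatExtractor
except ImportError:
    # Stand-ins that mirror only the documented constructor signatures,
    # so the sketch can run where the dataiku package is not installed.
    class Formatter(object):
        def __init__(self, config, plugin_config):
            pass

    class OutputFormatter(object):
        def __init__(self, stream):
            pass

    class FormatExtractor(object):
        def __init__(self, stream):
            pass


class LineOutputFormatter(OutputFormatter):
    """Hypothetical writer; the write_* methods are omitted in this sketch."""

    def __init__(self, stream, schema, separator):
        OutputFormatter.__init__(self, stream)
        self.stream = stream
        self.schema = schema  # never None on the output side
        self.separator = separator


class LineFormatExtractor(FormatExtractor):
    """Hypothetical reader; the read_* methods are omitted in this sketch."""

    def __init__(self, stream, schema, separator):
        FormatExtractor.__init__(self, stream)
        self.stream = stream
        self.schema = schema  # None while the format is being detected
        self.separator = separator


class LineFormatter(Formatter):
    def __init__(self, config, plugin_config):
        Formatter.__init__(self, config, plugin_config)
        # "separator" is a hypothetical setting; config is assumed to be a dict.
        self.separator = config.get("separator", "|")

    def get_output_formatter(self, stream, schema):
        return LineOutputFormatter(stream, schema, self.separator)

    def get_format_extractor(self, stream, schema=None):
        return LineFormatExtractor(stream, schema, self.separator)


formatter = LineFormatter({"separator": ";"}, {})
detector = formatter.get_format_extractor(io.BytesIO(b""))  # detection: no schema
```

Passing the schema through to the extractor unchanged lets the extractor decide how to behave when it is `None`, which is the detection case described above.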
- class dataiku.customformat.OutputFormatter(stream)
Writes a stream of rows to a stream in a format. The calls will be:
write_header()
write_row(row_1)
…
write_row(row_N)
write_footer()
- write_header()
Write the header of the format (if any)
- write_row(row)
Write a row in the format
- Parameters:
row – array of strings, with one value per column in the schema
- write_footer()
Write the footer of the format (if any)
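The call sequence described above (header, one call per row, footer) can be sketched with a hypothetical pipe-delimited writer. `PipeOutputFormatter`, its `column_names` argument, and the `write_all` driver are illustrative names. The sketch also assumes the stream accepts bytes, and it does not escape `|` characters inside values.

```python
import io

try:
    from dataiku.customformat import OutputFormatter
except ImportError:
    # Stand-in that mirrors only the documented constructor signature.
    class OutputFormatter(object):
        def __init__(self, stream):
            self.stream = stream


class PipeOutputFormatter(OutputFormatter):
    def __init__(self, stream, column_names):
        OutputFormatter.__init__(self, stream)
        self.stream = stream
        self.column_names = column_names  # hypothetical: names taken from the schema

    def write_header(self):
        # One line holding the column names.
        self.stream.write(("|".join(self.column_names) + "\n").encode("utf-8"))

    def write_row(self, row):
        # row is an array of strings, one value per column in the schema.
        self.stream.write(("|".join(row) + "\n").encode("utf-8"))

    def write_footer(self):
        # This format has no footer.
        pass


def write_all(formatter, rows):
    """Drive the formatter in the documented order (illustrative helper)."""
    formatter.write_header()
    for row in rows:
        formatter.write_row(row)
    formatter.write_footer()


buffer = io.BytesIO()
write_all(PipeOutputFormatter(buffer, ["id", "name"]), [["1", "ada"], ["2", "alan"]])
print(buffer.getvalue().decode("utf-8"))
```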
- class dataiku.customformat.FormatExtractor(stream)
Reads a stream in a format to a stream of rows
- read_schema()
Get the schema of the data in the stream, if the schema can be known upfront.
- Returns:
the list of columns as [{'name': 'col1', 'type': 'col1type'}, …]
- read_row()
Read one row from the formatted stream
- Returns:
a dict mapping each column name to its value, or None if reading is finished
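A matching reader can be sketched the same way. The hypothetical `PipeFormatExtractor` below assumes a bytes stream whose first line holds the column names, and it reports every column as type `string`. Both choices are assumptions made for illustration.

```python
import io

try:
    from dataiku.customformat import FormatExtractor
except ImportError:
    # Stand-in that mirrors only the documented constructor signature.
    class FormatExtractor(object):
        def __init__(self, stream):
            self.stream = stream


class PipeFormatExtractor(FormatExtractor):
    def __init__(self, stream):
        FormatExtractor.__init__(self, stream)
        self.stream = stream
        # Assumption: the first line of the stream lists the column names.
        header = self.stream.readline().decode("utf-8").rstrip("\r\n")
        self.column_names = header.split("|") if header else []

    def read_schema(self):
        # The schema is known upfront here because it comes from the header line.
        return [{"name": name, "type": "string"} for name in self.column_names]

    def read_row(self):
        line = self.stream.readline()
        if not line:
            return None  # end of stream: reading is finished
        values = line.decode("utf-8").rstrip("\r\n").split("|")
        return dict(zip(self.column_names, values))


extractor = PipeFormatExtractor(io.BytesIO(b"id|name\n1|ada\n2|alan\n"))
schema = extractor.read_schema()
rows = []
while True:
    row = extractor.read_row()
    if row is None:
        break
    rows.append(row)
```

The loop at the end shows the consumer side of the contract: keep calling `read_row()` until it returns `None`.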