singer_sdk.BatchSink

class singer_sdk.BatchSink[source]

Base class for batched record writers.

abstract process_batch(context)[source]

Process a batch with the given batch context.

This method must be overridden.

If process_record() is not overridden, the context["records"] list will contain all records from the given batch context.

If duplicates are merged, these can be tracked via tally_duplicate_merged().

Parameters:

context (dict) – Stream partition or context dictionary.

Return type:

None
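
A minimal sketch of the flow described above, assuming the default process_record() behavior. So that it runs without the SDK installed, StandInBatchSink is a hypothetical, simplified stand-in for singer_sdk.BatchSink that only mimics the documented staging into context["records"]; it is not the SDK's code. ListSink and its written attribute are likewise illustrative names. In a real target the class would subclass singer_sdk.BatchSink instead.

```python
from abc import ABC, abstractmethod


class StandInBatchSink(ABC):
    """Hypothetical stand-in for singer_sdk.BatchSink (not the SDK's code)."""

    def process_record(self, record: dict, context: dict) -> None:
        # Mimics the documented default: stage each record in context["records"].
        context.setdefault("records", []).append(record)

    @abstractmethod
    def process_batch(self, context: dict) -> None:
        """Must be overridden, as in the documented class."""


class ListSink(StandInBatchSink):
    """Illustrative sink that 'writes' each batch to an in-memory list."""

    def __init__(self) -> None:
        self.written = []

    def process_batch(self, context: dict) -> None:
        # All records staged for this batch arrive in context["records"].
        self.written.extend(context.get("records", []))


sink = ListSink()
context = {}  # in a real target the SDK builds and passes this dict
for row in ({"id": 1}, {"id": 2}):
    sink.process_record(row, context)
sink.process_batch(context)
```

Because process_record() is left at its default here, process_batch() only has to read context["records"] and write the rows out in one operation.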

process_record(record, context)[source]

Load the latest record from the stream.

Developers may either stage the record in the context dict (the default behavior for Batch types) or write it permanently to the target.

If this method is not overridden, the default implementation will create a context["records"] list and append all records for processing during process_batch().

If duplicates are merged, these can be tracked via tally_duplicate_merged().

Parameters:
  • record (dict) – Individual record in the stream.

  • context (dict) – Stream partition or context dictionary.

Return type:

None
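
One way an overridden process_record() might stage records while merging duplicates is sketched below. Everything in it is illustrative: DedupingSink, the key attribute, the records_by_key context entry, and the written attribute are hypothetical names, and the tally_duplicate_merged() defined here is only a stand-in (with an assumed signature) so the sketch runs without the SDK. A real sink would subclass singer_sdk.BatchSink and inherit that method.

```python
class DedupingSink:
    """Illustrative only; a real sink would subclass singer_sdk.BatchSink."""

    key = "id"  # hypothetical primary-key field

    def __init__(self) -> None:
        self.duplicates_merged = 0
        self.written = []

    def tally_duplicate_merged(self, count: int = 1) -> None:
        # Stand-in so the sketch runs without the SDK; a real sink inherits this.
        self.duplicates_merged += count

    def process_record(self, record: dict, context: dict) -> None:
        # Stage records keyed by primary key so a later version replaces an earlier one.
        staged = context.setdefault("records_by_key", {})
        if record[self.key] in staged:
            self.tally_duplicate_merged()
        staged[record[self.key]] = record

    def process_batch(self, context: dict) -> None:
        # "Write" the merged rows; a real sink would load them into the target here.
        self.written.extend(context.get("records_by_key", {}).values())


sink = DedupingSink()
context = {}
for row in ({"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}):
    sink.process_record(row, context)
sink.process_batch(context)
```

Since this override does not populate context["records"], process_batch() has to read from the custom records_by_key entry instead.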

start_batch(context)[source]

Start a new batch with the given context.

The SDK-generated context will contain batch_id (GUID string) and batch_start_time (datetime).

Developers may optionally override this method to add custom markers to the context dict and/or to initialize batch resources, such as a local temp file that holds batch records before uploading.

Parameters:

context (dict) – Stream partition or context dictionary.

Return type:

None
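
The temp-file pattern mentioned above could look roughly like the following sketch. TempFileSink, the temp_path context key, and the uploaded attribute are hypothetical names, and the context dict is built by hand here to imitate the batch_id and batch_start_time entries that the SDK is documented to generate. A real sink would subclass singer_sdk.BatchSink and receive the context from the SDK.

```python
import json
import os
import tempfile
import uuid
from datetime import datetime, timezone


class TempFileSink:
    """Illustrative only; a real sink would subclass singer_sdk.BatchSink."""

    def __init__(self) -> None:
        self.uploaded = []

    def start_batch(self, context: dict) -> None:
        # Create a temp file for this batch and store its path as a custom marker.
        fd, path = tempfile.mkstemp(
            prefix=f"batch-{context['batch_id']}-", suffix=".jsonl"
        )
        os.close(fd)
        context["temp_path"] = path

    def process_record(self, record: dict, context: dict) -> None:
        # Stage each record in the temp file instead of an in-memory list.
        with open(context["temp_path"], "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def process_batch(self, context: dict) -> None:
        # "Upload" the staged rows, then release the batch resource.
        with open(context["temp_path"], encoding="utf-8") as f:
            self.uploaded.extend(json.loads(line) for line in f)
        os.remove(context["temp_path"])


# Imitate the SDK-generated context described above.
context = {
    "batch_id": str(uuid.uuid4()),
    "batch_start_time": datetime.now(timezone.utc),
}
sink = TempFileSink()
sink.start_batch(context)
sink.process_record({"id": 1}, context)
sink.process_batch(context)
```

Keeping the file path in the context dict ties the resource to a single batch, so each new batch gets its own file.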