How to design a Sink#

While the default implementation will create one Sink per stream, this behavior can be customized or overridden in a number of ways:

1:1 Mapping#

This is the default, where only one sink is active for each incoming stream name:

  • The exception to this rule is when a new STATE message is received for an already-active stream. In this case, the existing sink will be marked to be drained and a new sink will be initialized to receive the next incoming records.

  • In the case that a sink is archived because of a superseding STATE message, all prior version(s) of the stream’s sink are guaranteed to be drained in creation order.

Database sink example#

A database-type target where each stream will land in a dedicated table. Each sink is of the same class, with a different target table based on stream_name.

SaaS sink example#

A SaaS-type target where each stream will be uploaded to a different REST endpoint based on stream name. Each sink class is specialized based on the requirements of the target API endpoint.

1:many Mapping#

In this scenario, the target intentionally creates multiple sinks per stream:

  • The developer may override Target.get_sink() and use details within the record (or a randomization algorithm) to send records to multiple sinks all corresponding to the same stream.

Data lake ingestion example#

A 1:many relationship may be used for a data lake target where output files should be pre-partitioned according to one or more attributes within the record. Importing multiple smaller files, named according to their partition key values are more efficient than loading fewer larger files.

many:1 Mapping#

In this scenario, the target intentionally sends all records to the same sink regardless of stream name:

  • The stream name will likely be made an attribute of the final output, but records do not need to be segregated by the stream name.

JSON file writer example#

A json file writer where the desired output is a single combined json file with all records from all streams.