How to design a Sink¶

While the default implementation will create one Sink per stream, this behavior can be customized or overridden in a number of ways:

`1:1` Mapping¶

This is the default, where only one sink is active for each incoming stream name:

The exception to this rule is when a new STATE message is received for an already-active stream. In this case, the existing sink will be marked to be drained and a new sink will be initialized to receive the next incoming records.
In the case that a sink is archived because of a superseding STATE message, all prior version(s) of the stream’s sink are guaranteed to be drained in creation order.

Note: Targets write state payloads to stdout whenever sinks drain. See Target State Output for details on capturing the final state.

Database sink example¶

A database-type target where each stream will land in a dedicated table. Each sink is of the same class, with a different target table based on stream_name.

SaaS sink example¶

A SaaS-type target where each stream will be uploaded to a different REST endpoint based on stream name. Each sink class is specialized based on the requirements of the target API endpoint.

`1:many` Mapping¶

In this scenario, the target intentionally creates multiple sinks per stream:

The developer may override Target.get_sink() and use details within the record (or a randomization algorithm) to send records to multiple sinks all corresponding to the same stream.

Data lake ingestion example¶

A 1:many relationship may be used for a data lake target where output files should be pre-partitioned according to one or more attributes within the record. Importing multiple smaller files, named according to their partition key values are more efficient than loading fewer larger files.

`many:1` Mapping¶

In this scenario, the target intentionally sends all records to the same sink regardless of stream name:

The stream name will likely be made an attribute of the final output, but records do not need to be segregated by the stream name.

JSON file writer example¶

A json file writer where the desired output is a single combined json file with all records from all streams.