
Kafka Sink rewrite without fs2-kafka #100

Merged
merged 1 commit into develop from kafka-sink-improvement on Dec 17, 2024
Conversation

istreeter (Contributor)

Previously we were using the fs2-kafka wrapper around the KafkaProducer when sending events to Kafka. The fs2-kafka wrapper executes every `send` on the CE3 blocking thread pool. We found in the Snowplow collector that this implementation can be problematic: under some blocking scenarios it causes the CE3 blocking thread pool to create a very large number of threads, and the huge number of threads could cause an OOM.

This new implementation still uses the CE3 blocking thread pool, but it calls `send` many times within the same `Sync[F].blocking{...}`. This should prevent the problem where very many concurrent calls to `Sync[F].blocking` trigger the thread pool to grow to one thread per pending event.

Note, this implementation is different from what we chose for the Snowplow collector. For the latter, we used a dedicated single-thread executor for calling `send`. The difference is because in common-streams we have the luxury of working in batches, whereas the Snowplow collector tends to receive events one-by-one, and thus needs to call `send` one-by-one.
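To make the batched approach concrete, here is a minimal sketch, not the actual PR code: it assumes a plain `List[E]` batch and takes the record-building function as a parameter, whereas the real sink builds records from its own config and batch types (as visible in the diff excerpt further down).

import cats.effect.Sync
import cats.syntax.all._
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

def sinkBatch[F[_]: Sync, E](
  producer: KafkaProducer[Array[Byte], Array[Byte]],
  batch: List[E],
  toRecord: E => ProducerRecord[Array[Byte], Array[Byte]]
): F[Unit] =
  Sync[F]
    .interruptible {
      // All `send` calls for the batch happen inside ONE blocking region, so the
      // CE3 blocking pool grows by at most one thread per in-flight batch,
      // not one thread per pending event.
      batch.map(e => producer.send(toRecord(e)))
    }
    .flatMap { futures =>
      // Wait for broker acknowledgement of the whole batch, again in a single
      // blocking region rather than one blocking call per record.
      Sync[F].interruptible(futures.foreach(_.get()))
    }

The key point is that the number of simultaneous blocking regions is bounded by the number of in-flight batches, not by the number of pending records.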

Sync[F].interruptible {
  val futures = batch.asIterable.map { e =>
    val record = toProducerRecord(config, e)
    producer.send(record)
Contributor (commenting on the diff excerpt above):
Do I get this right that when the buffer of pending records waiting to be sent is full, this call becomes blocking, which is why we also include it in the Sync[F].interruptible { }?

Contributor Author (istreeter):
It becomes blocking under two circumstances:

  1. The buffer is full, as you say.
  2. The client needs to re-fetch topic metadata from the broker.

Case 1 can be avoided by increasing the size of the buffer. But case 2 is unavoidable. So it must be run inside either Sync[F].blocking or Sync[F].interruptible (which are approximately the same thing).
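For reference, the two cases correspond to standard Kafka producer settings; the values below are illustrative and not from this PR. A larger buffer.memory makes case 1 less likely, and max.block.ms caps how long a blocked send (full buffer or metadata fetch) waits before throwing.

import java.util.Properties
import org.apache.kafka.clients.producer.ProducerConfig

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")         // example broker
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, (64L * 1024 * 1024).toString) // default is 32 MiB
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "60000")                       // default is 60 s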

Contributor:

> The client needs to re-fetch topic metadata from the broker

I remember a PR where you regularly fetched all the metadata manually, in parallel, to avoid this blocking fetch. Do you now think that we should not do that?

Contributor Author (istreeter):

I know the PR you are thinking of, but I abandoned it because I was wrong about it. It is possible to pre-fetch the metadata when the app first starts, to avoid blocking on the first fetch. But it is not possible to avoid blocking when the client decides to periodically re-fetch the metadata.
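To illustrate the distinction, here is a hedged sketch of what start-up pre-fetching can do (names are illustrative, not from the abandoned PR): partitionsFor triggers a blocking metadata fetch for the topic, so running it once during initialisation avoids blocking on the first send. It does nothing about the periodic refresh the client performs later (governed by metadata.max.age.ms), which is why the send calls must stay inside Sync[F].blocking / Sync[F].interruptible regardless.

import cats.effect.Sync
import org.apache.kafka.clients.producer.KafkaProducer

def prefetchMetadata[F[_]: Sync](
  producer: KafkaProducer[Array[Byte], Array[Byte]],
  topic: String
): F[Unit] =
  Sync[F].blocking {
    // Blocks until the producer has metadata for `topic` (or max.block.ms expires).
    producer.partitionsFor(topic)
    ()
  }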

@istreeter merged commit 86f0316 into develop on Dec 17, 2024
1 check passed
@istreeter deleted the kafka-sink-improvement branch on December 17, 2024 08:46