Releases: lensesio/stream-reactor
Stream Reactor 8.1.20
NullPointerException information lost (#174): adds extra logging to identify where the exception occurs, and changes the error log information.

Co-authored-by: stheppi <[email protected]>
Co-authored-by: Andrew Stevenson <[email protected]>
Stream Reactor 8.1.19
Fix: Prevent Elasticsearch from Skipping Records After Tombstone (#172)

**Overview**: This release fixes a critical bug in the Elasticsearch 6 (ES6) and 7 (ES7) sinks where records following a tombstone were inadvertently skipped during insertion. The issue stemmed from an erroneous return statement that halted the processing of subsequent records.

**Background**: When a tombstone record was encountered within a sequence of records to be written to Elasticsearch, the insertion process exited prematurely due to the return instruction. All records following the tombstone were ignored, leading to incomplete data ingestion and potential inconsistencies within the Elasticsearch indices.

**Changes Made**:
- Refactored insert method: the original insert method has been decomposed into smaller, more focused functions, which improves readability and maintainability and facilitates testing.
- Detailed log entries: added log statements at key points within the insertion workflow.
- Elasticsearch errors now handled: previously, failures in the Elasticsearch response were ignored; with this change, if any record in a batch fails, the sink raises an exception.
- Avoids sending empty requests.
- Fixes the unit tests.

Co-authored-by: stheppi <[email protected]>
Stream Reactor 8.1.18
🚀 New Features
Sequential Message Sending for Azure Service Bus
- Introduced a new KCQL property: `batch.enabled` (default: `true`).
- Users can now disable batching to send messages sequentially, addressing specific scenarios with large message sizes (e.g., >1 MB).
- Why this matters: batching improves performance but can fail for large messages; sequential sending ensures reliability in such cases.
- How to use: configure `batch.enabled=false` in the KCQL mapping to enable sequential sending.
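As a sketch, a KCQL mapping that disables batching might look like the following (the topic and queue names are illustrative, not from this release; the property is set via the same `PROPERTIES` clause used in the examples below):

```sql
-- Illustrative mapping: queue and topic names are hypothetical.
-- 'batch.enabled'='false' switches the sink to sequential, one-by-one sending.
INSERT INTO `my-servicebus-queue`
SELECT * FROM `large-messages-topic`
PROPERTIES ('batch.enabled'='false')
```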
Post-Processing for Datalake Cloud Source Connectors
- Added post-processing capabilities for the AWS S3 and GCP Storage source connectors (Azure Datalake Gen 2 support coming soon).
- New KCQL properties:
  - `post.process.action`: defines the action (`DELETE` or `MOVE`) to perform on source files after successful processing.
  - `post.process.action.bucket`: specifies the target bucket for the `MOVE` action (required for `MOVE`).
  - `post.process.action.prefix`: specifies a new prefix for the file's location when using the `MOVE` action (required for `MOVE`).
- Use cases:
- Free up storage space by deleting files.
- Archive or organize processed files by moving them to a new location.
- Example 1: Delete files after processing:

```sql
INSERT INTO `my-bucket`
SELECT * FROM `my-topic`
PROPERTIES ('post.process.action'='DELETE')
```

- Example 2: Move files to an archive bucket:

```sql
INSERT INTO `my-bucket:archive/`
SELECT * FROM `my-topic`
PROPERTIES (
  'post.process.action'='MOVE',
  'post.process.action.bucket'='archive-bucket',
  'post.process.action.prefix'='archive/'
)
```
🛠 Dependency Updates
Updated Azure Service Bus dependencies:
- `azure-core` updated to version 1.54.1.
- `azure-messaging-servicebus` updated to version 7.17.6.
These updates ensure compatibility with the latest Azure SDKs and improve stability and performance.
Upgrade Notes
- Review the new KCQL properties and configurations for Azure Service Bus and Datalake connectors.
- Ensure compatibility with the updated Azure Service Bus dependencies if you use custom extensions or integrations.
Thank you to all contributors! 🎉
Stream Reactor 8.1.17
Feat/es suport pk from key (#162)

**Elasticsearch Document Primary Key**: The ES sink connector lacked the ability to choose the document primary key from the record Key or a Header. No SMT can move data from the Key into the Value payload, so the connector could not support scenarios where the Key or a Header carries information to be used as part of the Elasticsearch document primary key.

- Refines `TransformAndExtractPK` to take the Key and Headers, and adds previously missing tests for `PrimaryKeyExtractor`, `JsonPayloadExtractor`, and `TransformAndExtractPK`.
- Reduces code complexity.
- Improves the JSON payload tests by mixing in `OptionValues`, reducing the code required; makes `_key`/`_value`/`_header` constants.
- Avoids deserialising the key as JSON if there is no `_key` path in the primary-keys list.
- Enhances PK path extraction by allowing the path to be specified as `_key` or as a nested path like `_key.fieldA.fieldB`. This broadens the scope of supported incoming types, ensuring compatibility with all Kafka Connect Struct types as well as schemaless input, and provides more flexibility and robustness in handling diverse data formats for primary key extraction.
- Fixes the unit tests and the handling of bytes/string.
- Removes an unused import.

Co-authored-by: stheppi <[email protected]>
Co-authored-by: David Sloan <[email protected]>
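As a sketch of the nested-path form described above, a mapping using the ES sink's KCQL `PK` clause might look like this (the index, topic, and field names are illustrative, not from this release):

```sql
-- Illustrative mapping: index, topic, and field names are hypothetical.
-- The PK clause takes the document id from a nested field of the record Key.
INSERT INTO `my-index`
SELECT * FROM `my-topic`
PK _key.fieldA.fieldB
```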
Stream Reactor 8.1.16
Improvements to the HTTP Sink (#163)

This release introduces the following enhancements to the HTTP Sink:

1. **Queue limiting**: the queue size per topic is now bounded, reducing the chance of an out-of-memory (OOM) failure. Previously the queue was unbounded, so if HTTP calls were slow and the sink received more records than it cleared, it would eventually run out of memory.
2. **Offering timeout**: offering records to the queue now has a timeout. If there are records to be offered but the timeout is exceeded, a retriable exception is thrown; depending on the connector's retry settings, the operation is attempted again. This avoids situations where the sink gets stuck processing a slow or unresponsive batch.
3. **Duplicate record handling**: a `Map[TopicPartition, Offset]` now tracks the last processed offset for each topic-partition, ensuring the sink does not add the same records to the queue multiple times.
4. **Batch failure handling**: previously, when an HTTP call failed due to a specific input, the batch was not removed from the queue and could be retried indefinitely; this is now prevented.

A follow-up PR will further reduce the code complexity around the batching approach and the boilerplate code.

Additional changes: fixed a unit test, renamed a variable, removed functional tests that are no longer valid (a failed batch request is no longer retried), and removed unused functions.

Co-authored-by: stheppi <[email protected]>
Stream Reactor 8.1.15
Ignore null records (#156): sometimes custom-made SMTs overwrite the payload and pass nulls to the connector, leading to unintended results; such records are now ignored. Also fixes a linting error.

Co-authored-by: stheppi <[email protected]>
Stream Reactor 8.1.14
HTTP Sink Reporter Fixes
Stream Reactor 8.1.13
HTTP Sink Improvement
Stream Reactor 8.1.12
Data lake sink improvements. HTTP sink batching fix.
Stream Reactor 8.1.11
HTTP Sink Improvements