[HELLO!!] Adds routing support, configurable via 'routing.field.name' #558
base: master
@confluentinc It looks like @hartmut-co-uk just signed our Contributor License Agreement. 👍 Always at your service, clabot |
I noticed for parent-join use case - in addition to adding the routing, the payload also potentially needs to be enriched with the
I wonder whether it would be worth building this natively into the connector, instead of forcing the user to enrich data upfront or to build and require a custom SMT. Note: I tried the InsertField SMT, but since it only supports flat fields, it cannot enrich the struct for children. One solution I can see is to add the following config options:
|
A custom SMT might be the best fit to create the ‘join-field struct’ for ES parent/child relationship on the fly. |
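For illustration, here is a minimal sketch of the enrichment step such an SMT would have to perform, written in plain Java without the Kafka Connect API. Everything named here is an assumption and not part of this PR: the class `JoinFieldSketch`, the helper `withJoinField`, and the field name `my_join_field`. The nested shape (`name` plus an optional `parent`) follows the Elasticsearch join field type, and it is exactly the kind of non-flat value that InsertField cannot produce.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JoinFieldSketch {

    /**
     * Hypothetical helper: returns a copy of the record value with a nested
     * join field added. parentId is null for parent documents.
     */
    public static Map<String, Object> withJoinField(Map<String, Object> value,
                                                    String joinFieldName,
                                                    String relationName,
                                                    String parentId) {
        // The join field is itself an object, not a flat value.
        Map<String, Object> join = new LinkedHashMap<>();
        join.put("name", relationName);
        if (parentId != null) {
            join.put("parent", parentId);
        }
        // Copy so the incoming record value is left untouched.
        Map<String, Object> enriched = new LinkedHashMap<>(value);
        enriched.put(joinFieldName, join);
        return enriched;
    }

    public static void main(String[] args) {
        Map<String, Object> child = new LinkedHashMap<>();
        child.put("text", "an answer");
        // "my_join_field", "answer" and "1" are illustrative values only.
        System.out.println(withJoinField(child, "my_join_field", "answer", "1"));
    }
}
```

A child document built this way still has to be indexed with routing set to its parent's id, which is where the routing support from this PR comes in.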
Further room for improvement: allow to use data from |
I've implemented a simple SMT (it also works fine when embedded in the connector jar). Sticking with the ES example, it's configured like:
(with the actual msg value having a field
Please provide some feedback on how to proceed further. |
Amazing work @hartmut-co-uk 😍 This is going to help us a lot! |
@levzem could you please provide initial feedback? Also
|
I'm still keen to help wrap this up, write more tests, .. if we can agree how to proceed. |
@yanglei99 @kkonstantine @levzem @dosvath anyone? We so desperately need this and the work is done but it's dead in the water for ~2 months now without any apparent reason 😕 |
ping |
# Conflicts: # src/test/java/io/confluent/connect/elasticsearch/integration/ElasticsearchConnectorIT.java
Went on a sabbatical for 3 months and still 😂 I thought for a moment that this project had been abandoned altogether, but stuff was still merged a few days ago. @kkonstantine 🙏 🙏 🙏 what needs to happen to get some eyes on this? It's been open for over half a year now... |
I don't think it's been abandoned in general. I'm currently busy with other things. Still happy to contribute if someone confirms that the feature/changes are welcome and will be merged + how to proceed with adding tests / SMT. |
Any plans to merge this PR? |
Dear maintainers, is this feature/PR welcome, and are further changes required? I'm still keen to help wrap this up and write tests, ... if we can agree on how to proceed. |
... |
If you're still searching for a platform-type solution, take a look at Apache Flink. It gives you pretty much bare-bones access to the Elasticsearch API, so you can do whatever you want, including routing, timeouts per bulk action, etc. Plus an actual Web UI 🤩 |
Use a Flink Kafka source -> Flink Elasticsearch sink instead of Kafka Connect? 🤔 Many thanks for bringing this up, given I've been considering Apache Flink for other use cases in my project, too... |
|
I am no expert, but I think this approach can be difficult: routing based on a value in a record is tricky when it comes to deletes. A delete is typically triggered by a null record (a tombstone). We noticed that in this case the delete request is not routed to the correct shard, because the routing key is missing from the record. We had to take a different approach to get this to work. |
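The delete problem can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the connector's actual code: `RoutingSketch`, `routingFor`, and the field name `customer_id` are hypothetical, and the record value is modelled as a simple `Map`. The point is that a routing value read from the record value has nothing to read from when the value is null.

```java
import java.util.Map;
import java.util.Optional;

public class RoutingSketch {

    /**
     * Hypothetical lookup: reads the routing value from the record value,
     * using the field name configured via 'routing.field.name'.
     */
    public static Optional<String> routingFor(Map<String, Object> recordValue,
                                              String routingFieldName) {
        if (recordValue == null) {
            // Tombstone: there is no value, hence no routing key to extract.
            return Optional.empty();
        }
        return Optional.ofNullable(recordValue.get(routingFieldName))
                       .map(Object::toString);
    }

    public static void main(String[] args) {
        // "customer_id" and the values below are illustrative only.
        Map<String, Object> upsert = Map.of("id", "42", "customer_id", "c-7");
        System.out.println(routingFor(upsert, "customer_id")); // routing available
        System.out.println(routingFor(null, "customer_id"));   // delete: no routing
    }
}
```

With an empty result, the delete request would be sent without routing and could land on a different shard than the one holding the document, which matches the behaviour described above.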
Refs:
Problem
The Elasticsearch _routing field is currently not supported.
Solution
Does this solution apply anywhere else?
If yes, where?
Parent/Child relationships -> join field type
Test Strategy
DataConverterTest has been added.
Integration test might require refactoring of existing tests.
ElasticsearchConnectorIT/ElasticsearchConnectorBaseIT seems to use a very basic schema / payload.
Also ElasticsearchHelperClient would need to be improved to also support _routing.
Please advise how to approach.
Testing done:
TODOs / open topics
Release Plan