Add option to use auto-generated IDs on indexing #679
Problem
Setting the document ID when indexing provides exactly-once delivery, but it puts more load on Elasticsearch and is not necessary for all use cases.
Earlier PRs addressed this issue (#393 and #510). This PR is largely an update of the most recent of those, which fell out of date and accumulated many merge conflicts that needed resolving.
Addresses #139 and #97
Solution
Add a new option, use.autogenerated.ids, that makes index requests use the document ID auto-generated by Elasticsearch. The option defaults to false and applies only when write.method is set to INSERT.
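The rule described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class AutoIdSketch and the method resolveDocumentId are hypothetical names, and the sketch assumes that write.method falls back to INSERT when unset. Returning null stands in for "send the index request without an ID so Elasticsearch generates one".

```java
import java.util.Map;

// Hypothetical sketch; names are illustrative and not taken from the connector.
public class AutoIdSketch {

    /**
     * Returns the document ID to put on the index request, or null when
     * Elasticsearch should generate the ID itself.
     */
    static String resolveDocumentId(Map<String, String> config, String recordKey) {
        // The option defaults to false, as stated in the PR description.
        boolean useAutoIds = Boolean.parseBoolean(
                config.getOrDefault("use.autogenerated.ids", "false"));
        // Assumption: write.method is treated as INSERT when it is not set.
        boolean isInsert = "INSERT".equalsIgnoreCase(
                config.getOrDefault("write.method", "INSERT"));
        // The option only takes effect for inserts; otherwise keep the explicit ID.
        if (useAutoIds && isInsert) {
            return null;
        }
        return recordKey;
    }

    public static void main(String[] args) {
        Map<String, String> autoInsert = Map.of(
                "use.autogenerated.ids", "true",
                "write.method", "INSERT");
        Map<String, String> autoUpsert = Map.of(
                "use.autogenerated.ids", "true",
                "write.method", "UPSERT");

        System.out.println(resolveDocumentId(autoInsert, "topic+0+42"));
        System.out.println(resolveDocumentId(autoUpsert, "topic+0+42"));
        System.out.println(resolveDocumentId(Map.of(), "topic+0+42"));
    }
}
```

With any other write method the explicit ID is kept, since an update or upsert needs a stable ID to target an existing document.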
Note that the large diff in the convertRecord method of the DataConverter class is the result of extracting a chunk of that code into a separate method. Adding one more statement pushed the method's cyclomatic complexity over the limit, which caused the checkstyle plugin to report errors.
Does this solution apply anywhere else?
If yes, where?
Test Strategy
Testing done:
As with #510, we are running live connectors that use this feature.
Release Plan