Adding the option for configuring the batch size. #139
👋
We are using pelias-csv-importer to import a huge dataset into Elasticsearch.
We have found that import performance is not optimal with the current batch size, given our resources.
Here's the reason for this change 🚀
While working on import performance, we found that there is no way to configure the batch size used for bulk import requests to Elasticsearch via pelias-dbclient. pelias-csv-importer uses pelias-dbclient, which has a batch size of 500 hardcoded in Batch.js. It would be nice to make it configurable.
Here's what actually got changed 👏
Added a batchSize configuration option under the dbclient configuration. It defaults to 500, the value currently hardcoded in pelias-dbclient.
After this PR gets merged
I have created PR #125 on pelias-dbclient to use this configuration option. I need to update that PR with the latest version of pelias-config.
I also need to make a PR on pelias-csv-importer to update the versions of pelias-config and pelias-dbclient so that it can use this configuration option.
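
For reviewers, here is a rough sketch of how a consumer could pick up the batchSize option under dbclient and fall back to 500 when it is not set. This is only an illustration and not the actual pelias-dbclient code: the function names resolveBatchSize and toBatches are hypothetical, the config is a plain object rather than one loaded through pelias-config, and the fallback for invalid values is an assumption.

```javascript
// Hypothetical sketch; names are illustrative, not the real pelias-dbclient API.
const DEFAULT_BATCH_SIZE = 500; // the value currently hardcoded, per this PR

// Read dbclient.batchSize from a plain config object.
// Assumption: missing or invalid values fall back to the default.
function resolveBatchSize(config) {
  const value = config && config.dbclient && config.dbclient.batchSize;
  return Number.isInteger(value) && value > 0 ? value : DEFAULT_BATCH_SIZE;
}

// Split records into batches of at most `size` items,
// roughly what a bulk indexer does before each request.
function toBatches(records, size) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

// Example config fragment shaped like the option described in this PR.
const config = { dbclient: { batchSize: 1000 } };
const records = Array.from({ length: 2500 }, (_, i) => ({ id: i }));
const batches = toBatches(records, resolveBatchSize(config));

console.log(batches.map((b) => b.length)); // [ 1000, 1000, 500 ]
```

With no batchSize set, resolveBatchSize returns 500, so existing setups would keep their current behaviour.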