Adding the option for configuring the batch size. #139

Merged

@mansoor-sajjad mansoor-sajjad commented Nov 15, 2022

👋
We are using pelias-csv-importer to import a huge dataset into Elasticsearch.
Given our resources, we have found that import performance is not optimal with the current batch size.


Here's the reason for this change 🚀

While working on import performance, we found that there is no way to configure the batch size used for bulk import requests to Elasticsearch via pelias-dbclient.
pelias-csv-importer uses pelias-dbclient, which has a batch size of 500 hardcoded in Batch.js.
It would be nice to make this configurable.
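
To illustrate what the batch size controls, here is a minimal sketch of splitting records into bulk-request-sized groups. The `toBatches` helper is hypothetical and written only for this example; it is not the code in pelias-dbclient's Batch.js, and the 500 default simply mirrors the hardcoded value mentioned above.

```javascript
// Hypothetical helper, for illustration only (not pelias-dbclient code).
// Splits an array of records into groups of at most `batchSize` items,
// where each group would correspond to one bulk request.
function toBatches(records, batchSize = 500) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < records.length; i += batchSize) {
    batches.push(records.slice(i, i + batchSize));
  }
  return batches;
}

// 1200 records at the default size yield three batches: 500, 500 and 200.
const records = Array.from({ length: 1200 }, (_, i) => ({ id: i }));
const sizes = toBatches(records).map((batch) => batch.length);
console.log(sizes);
```

A larger batch size means fewer requests per import but more documents per request, which is why the best value depends on the available resources.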


Here's what actually got changed 👏

Added a batchSize configuration option under the dbclient configuration, defaulting to 500, which is the value currently hardcoded in pelias-dbclient.
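
As a rough sketch of how such a default could behave, the snippet below merges a user-supplied configuration over a default of 500. The `resolveConfig` function and the `defaults` object are assumptions made for this example and do not reflect how pelias-config actually loads or validates its settings; only the `dbclient.batchSize` key and the 500 default come from this PR.

```javascript
// Hypothetical defaults, mirroring the value described in this PR.
const defaults = { dbclient: { batchSize: 500 } };

// Hypothetical helper: overlay user settings on the defaults,
// merging the dbclient section key by key.
function resolveConfig(userConfig = {}) {
  return {
    ...defaults,
    ...userConfig,
    dbclient: { ...defaults.dbclient, ...(userConfig.dbclient || {}) },
  };
}

// No override: the default of 500 applies.
console.log(resolveConfig({}).dbclient.batchSize);

// Explicit override, e.g. for a machine with more resources.
console.log(resolveConfig({ dbclient: { batchSize: 2000 } }).dbclient.batchSize);
```

Keeping 500 as the default means existing installations behave exactly as before unless they opt in to a different value.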


After this PR gets merged

I have created PR #125 on pelias-dbclient to use this configuration option. I still need to update that PR to use the latest version of pelias-config.
I also need to open a PR on pelias-csv-importer that updates the versions of pelias-config and pelias-dbclient so that it can use this configuration option.

@mansoor-sajjad

Hi @missinglink,

Would you have a chance to review this PR? It has been open for a while.
