Convert to Elasticsearch streaming_bulk helper #2492
Looks good to me
You should update the PR description (e.g., remove the references to it being a draft), and the branch needs a rebase...but, this is great stuff!
Of course, I have the usual pile of comments. The principal theme running through them is that this code is evidently evolutionary: there are vestiges of the original code in the new code, and there are vestiges of the subclass in the superclass; I've pointed out the ones that I found. Most of my concerns are centered around documentation and comments, particularly around the json_data parameter, which turns out to be a bit of a Swiss Army knife. (It's really more like a "context" parameter.)
So, really, it just needs a little polish and then it will be good to go!
While it looks good, the details of the bulk handling are being repeated from py-es-bulk. By the time we are done here, there won't be much difference.
Looks good: you still need to update the PR description; otherwise, I have just a couple of nits for you.
Simplify the pre-update info log to speed things up and avoid an unnecessary pass through the MAP; the document count will appear in the post-update message anyway. Add a clarification that the bulk update operation only changes the one specified field.
Elasticsearch bulk "action" rather than assuming "update" in the base class.
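For readers following along, here is a minimal sketch of what the two commit notes above describe: an action generator that takes the bulk "action" as a parameter instead of assuming "update", and whose update payload touches only one specified field. The function name generate_actions, the shape of the input documents, and the "access" field are hypothetical and are not taken from this PR; only the `_op_type`, `_index`, `_id`, and `doc` keys follow the action format accepted by the Elasticsearch Python bulk helpers.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def generate_actions(
    documents: Iterable[Dict[str, str]],
    action: str,
    field_update: Optional[Dict[str, Any]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield one bulk action dict per document.

    `documents` is a hypothetical iterable of {"index": ..., "id": ...}
    mappings; the real PR derives these from its own document map.
    """
    for doc in documents:
        item: Dict[str, Any] = {
            "_op_type": action,  # supplied by the caller, not assumed
            "_index": doc["index"],
            "_id": doc["id"],
        }
        if action == "update":
            # A partial update: only the keys under "doc" are changed,
            # every other field of the stored document is left alone.
            item["doc"] = dict(field_update or {})
        yield item


if __name__ == "__main__":
    docs = [
        {"index": "run-data.2021-07", "id": "abc"},
        {"index": "run-toc.2021-07", "id": "def"},
    ]
    # Hypothetical single-field update ("access" is an illustrative name).
    for generated in generate_actions(docs, "update", {"access": "public"}):
        print(generated)
    for generated in generate_actions(docs, "delete"):
        print(generated)
```

A generator like this could then be handed to elasticsearch.helpers.streaming_bulk(client, actions, raise_on_error=False), which yields an (ok, result) tuple per action so that successes and failures can be counted as they stream back. That call is left out of the sketch so it runs without a cluster.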
@webbnh still has an unresolved conversation.
This creates a new ElasticBulkBase class providing support for Elasticsearch bulk operations, including publish and delete, which currently shares the query_apis __init__.py with the ElasticBase class.

Resolves #2490