Bulk mutator fixes #1880

dopiera · 2019-01-28T02:17:05Z

This addresses two issues in BulkMutator:

1. -1 was reported as the original index of mutations that were never confirmed by the service.
2. An OK status was returned for such mutations.

Whenever a mutation was not confirmed by the service (e.g. because the stream was broken on every retry), it was returned as failed (correct) with -1 as its original index (incorrect).

In such cases OK was returned as the status, which was very confusing: it broke the invariant that every failure has a non-zero status code, which I believe users would assume.

dopiera · 2019-01-28T02:38:19Z

This is required for #1881.

codecov · 2019-01-28T03:01:59Z

Codecov Report

Merging #1880 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1880      +/-   ##
==========================================
- Coverage   92.88%   92.85%   -0.03%     
==========================================
  Files         297      298       +1     
  Lines       16790    16884      +94     
==========================================
+ Hits        15595    15678      +83     
- Misses       1195     1206      +11

Impacted Files	Coverage Δ
google/cloud/bigtable/mutations.h	`100% <ø> (ø)`	⬆️
google/cloud/bigtable/internal/bulk_mutator.cc	`98.71% <100%> (+0.05%)`	⬆️
google/cloud/bigtable/table.cc	`95.74% <0%> (-4.26%)`	⬇️
google/cloud/storage/internal/curl_client.cc	`95.92% <0%> (-0.12%)`	⬇️
google/cloud/storage/internal/bucket_requests.cc	`97.34% <0%> (-0.04%)`	⬇️
google/cloud/storage/list_buckets_reader.h	`95.23% <0%> (ø)`	⬆️
google/cloud/storage/list_objects_reader.h	`95.45% <0%> (ø)`	⬆️
google/cloud/storage/object_access_control.cc	`100% <0%> (ø)`	⬆️
google/cloud/storage/object_access_control.h	`100% <0%> (ø)`	⬆️
google/cloud/bigtable/table.h	`100% <0%> (ø)`	⬆️
... and 12 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9f4ae11...e70aa25.

devjgm

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @coryan and @dopiera)

google/cloud/bigtable/internal/bulk_mutator.cc, line 165 at r1 (raw file):

  int idx = 0;
  for (auto& mutation : *pending_mutations_.mutable_entries()) {
    auto &annotation = pending_annotations_[idx++];

I'm new to the bigtable code here, so please forgive my newbie question. But it looks like there is an implicit assumption/guarantee that pending_mutations_ and pending_annotations_ are the same size. That is, they are "parallel arrays" (or containers) of some objects, where the items at index i are related.

If that's the case, should we consider making this implicit assumption an explicit requirement in the code? To be specific, maybe we should consider having the BulkMutator class replace both the pending_mutations_ and pending_annotations_ collections with a single data member like:

struct MutationAnnotation {
  google::bigtable::v2::Mutation mutation;
  Annotations annotation;
};
std::vector<MutationAnnotation> pending_;

I wonder if that would result in clearer code that is also easier to use and iterate. For example, here you wouldn't need to keep a separate index counter; you would only need a simple range-for loop.

Thoughts?
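
As an illustration of the two layouts discussed in this comment, here is a minimal sketch. `Mutation` and `Annotations` are hypothetical stand-ins (the real mutation type is a protobuf message), and the two helper functions are invented for this example; only `MutationAnnotation` mirrors the struct proposed above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the proto mutation and the annotation type.
struct Mutation { std::string row_key; };
struct Annotations { int original_index; };

// Parallel containers: the size invariant is implicit, so it is asserted here.
std::vector<int> IndicesFromParallel(
    std::vector<Mutation> const& mutations,
    std::vector<Annotations> const& annotations) {
  assert(mutations.size() == annotations.size());
  std::vector<int> indices;
  std::size_t idx = 0;
  for (auto const& mutation : mutations) {
    (void)mutation;  // only the matching annotation is needed in this sketch
    indices.push_back(annotations[idx++].original_index);
  }
  return indices;
}

// Single container: each element carries both halves, so no counter is needed.
struct MutationAnnotation {
  Mutation mutation;
  Annotations annotation;
};

std::vector<int> IndicesFromCombined(
    std::vector<MutationAnnotation> const& pending) {
  std::vector<int> indices;
  for (auto const& item : pending) {
    indices.push_back(item.annotation.original_index);
  }
  return indices;
}
```

With the combined container the two halves cannot get out of sync, which is the property the parallel version can only check at run time.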

dopiera

google/cloud/bigtable/internal/bulk_mutator.cc, line 165 at r1 (raw file):

Hi Greg,
I think you're right. How about I do what you're suggesting in another PR, though, so as not to mix the bug fix with the refactoring?

dopiera

google/cloud/bigtable/internal/bulk_mutator.cc, line 165 at r1 (raw file):

Actually, after I started writing the refactor, I realized that pending_mutations_ is a single proto. In order to achieve what you're suggesting, we'd have to keep the proto's entries in a vector and serialize them right before sending. I think that would actually add complexity and cost, so we shouldn't do it. Does this make sense?
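
To make the cost concrete, here is a rough sketch under stated assumptions: `Entry`, `Request`, and `BuildRequest` are hypothetical stand-ins rather than the real protobuf types or any function in the library. It shows the extra pass that a combined vector would need before every send, because the request message must own all of its entries.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real request is a single protobuf message.
struct Entry { std::string row_key; };
struct Request { std::vector<Entry> entries; };  // one message owning all entries
struct Annotations { int original_index; };
struct MutationAnnotation {
  Entry entry;
  Annotations annotation;
};

// With a combined container, the request has to be rebuilt before each send.
Request BuildRequest(std::vector<MutationAnnotation> const& pending) {
  Request request;
  request.entries.reserve(pending.size());
  for (auto const& item : pending) {
    // Copy rather than move: the entries must stay available for retries.
    request.entries.push_back(item.entry);
  }
  return request;
}
```

Keeping the entries inside the request, with annotations in a parallel container, avoids this copy at the price of the implicit size invariant.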

devjgm · 2019-01-28T16:35:41Z

google/cloud/bigtable/internal/bulk_mutator.cc, line 165 at r1 (raw file):

I see. OK. Thanks for explaining.

dopiera · 2019-01-28T16:37:12Z

Did you close it on purpose? I think we need this fix.

devjgm · 2019-01-28T16:52:48Z

D'oh! sorry.

devjgm · 2019-01-29T14:26:24Z

google/cloud/bigtable/internal/bulk_mutator.cc

  for (auto& mutation : *pending_mutations_.mutable_entries()) {
-    result.emplace_back(
-        FailedMutation(SingleRowMutation(std::move(mutation)), ok_status));
+    auto &annotation = pending_annotations_[idx++];


nit: Please place the & adjacent to the auto, not the variable name, per the style guide.

dopiera · 2019-01-29

I removed the annotation variable altogether.

coryan · 2019-01-30T14:57:58Z

LGTM, please let's wait until Kokoro has a chance to run over this.

codecov · 2019-01-30T15:28:14Z

Codecov Report

Merging #1880 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

dopiera added 2 commits January 28, 2019 03:13

Fix BulkMutator returning invalid indices.

d8589b2

Whenever a mutation was not confirmed by the server (e.g. because the stream was broken every retry), it was returned as failed (correct) with -1 as original index (incorrect). This patch fixes it.

BulkMutator returns UNKNOWN if unconfirmed.

7c6fc11

If BulkMutator never gets a response for a mutation it used to return OK, which was very confusing - it broke the invariant that every failure has a non-zero exit code, which I believe users would assume.
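
The behavior described by the two commits above can be sketched as follows. This is a simplified illustration, not the code in bulk_mutator.cc: `StatusCode`, `Annotation`, `FailedMutation`, and `ExtractUnconfirmed` are hypothetical stand-ins, and row keys stand in for the pending mutations. Each mutation still pending after the last retry is reported with the index it had in the caller's original request and with UNKNOWN instead of OK.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the values mirror gRPC's OK (0) and UNKNOWN (2).
enum class StatusCode { kOk = 0, kUnknown = 2 };

struct Annotation {
  int original_index;  // position of the mutation in the caller's request
};

struct FailedMutation {
  std::string row_key;
  StatusCode status;
  int original_index;
};

// Report every mutation that is still pending after the last retry as failed.
std::vector<FailedMutation> ExtractUnconfirmed(
    std::vector<std::string> pending_row_keys,
    std::vector<Annotation> const& pending_annotations) {
  assert(pending_row_keys.size() == pending_annotations.size());
  std::vector<FailedMutation> result;
  result.reserve(pending_row_keys.size());
  for (std::size_t i = 0; i != pending_row_keys.size(); ++i) {
    // Use the recorded original index (not -1) and a non-OK status.
    result.push_back(FailedMutation{std::move(pending_row_keys[i]),
                                    StatusCode::kUnknown,
                                    pending_annotations[i].original_index});
  }
  return result;
}
```

A caller can then map each failure back to its own input by `original_index`, and can rely on no failure carrying an OK status.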

googlebot added the "cla: yes" label Jan 28, 2019

dopiera added the "kokoro:run" label Jan 28, 2019

kokoro-team removed the "kokoro:run" label Jan 28, 2019

dopiera mentioned this pull request Jan 28, 2019

Add streaming to (Async)BulkMutator. #1881 (merged)

dopiera requested a review from coryan January 28, 2019 03:00

devjgm reviewed Jan 28, 2019

dopiera commented Jan 28, 2019

coryan added the "api: bigtable" label Jan 28, 2019

devjgm closed this Jan 28, 2019

devjgm reopened this Jan 28, 2019

ghost reviewed Jan 29, 2019

Apply review comment.

e70aa25

dopiera added the "kokoro:run" label Jan 29, 2019

coryan removed the "kokoro:run" label Jan 29, 2019

devjgm reviewed Jan 29, 2019

coryan approved these changes Jan 30, 2019

coryan added the "kokoro:run" label Jan 30, 2019

kokoro-team removed the "kokoro:run" label Jan 30, 2019

dopiera merged commit efaed61 into googleapis:master Jan 30, 2019

dopiera deleted the bulk_mutator_fixes branch January 30, 2019 16:02
