Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

combine write and update to one request #6

Merged
merged 1 commit into from
Jul 15, 2022

Conversation

ckiosidis
Copy link

@ckiosidis ckiosidis commented Jul 14, 2022

Signed-off-by: Babis Kiosidis [email protected]

Context

We are noticing about 10% errs when talking to gcs, with 404 and 429 (high rate of updates) error codes.

From Google quotas docs

There is a write limit to the same object name. This limit is once per second, so writes to the same object name won't scale. For more information, see [Object immutability](https://cloud.google.com/storage/docs/key-terms#immutability).

Currently it seems that stow will create and update a gcs file when it can be done on a single operation.
The google-go library suggests to set attributes on the Writer and the Attrs() call should contain the necessary metadata according to the comment

What would be the best way to test this change? Any tips?

cc @EngHabu

will contribute this back to graymeta/stow if merged

Comment on lines -120 to -122
w.Close()

attr, err := obj.Update(c.ctx, storage.ObjectAttrsToUpdate{Metadata: mdPrepped})
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the errors we are seeing on some RPCs are because we are not checking err = w.Close()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that you added it!

@@ -114,17 +114,25 @@ func (c *Container) Put(name string, r io.Reader, size int64, metadata map[strin
}

w := obj.NewWriter(c.ctx)
w.ObjectAttrs.Metadata = merge(w.ObjectAttrs.Metadata, mdPrepped)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The obj.Update does checks arround attrs and for empty string for certain fields eg :
https://github.com/googleapis/google-cloud-go/blob/5a8c607e3a9a3265887e27cb13f8943f3e3fa23d/storage/storage.go#L958

And does workaround by setting rawObj.NullFields = nullFields

Will that cause an issue if we now dont use the Update call

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only set the metadata in this change.
The rest of the ObjectAttrs we leave as is (which is the same as the previous behaviour)

Copy link

@EngHabu EngHabu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thank you for removing the extra call... I think that makes sense


func merge(metadata ...map[string]string) map[string]string {
res := map[string]string{}
for _, mt := range metadata {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Metadata is key/value pairs according to google documentation. Why do we need the nested loop? Shouldn't we just:

for k, v := range metadata {
    res[k] = v
}

?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This merges a list of maps into one... so the first loop iterates over the list of maps, the second loop iterates over the keys/values in each of them..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants