Retry and prevent cleanup of local blocks for up to 5 iterations #3894
Changes from 5 commits
e347b23
1f1f19d
4640ba4
bb3f875
9150823
fcb6293
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -99,6 +99,12 @@ func Download(ctx context.Context, logger log.Logger, bucket objstore.Bucket, id | |||||
// TODO(bplotka): Ensure bucket operations have reasonable backoff retries. | ||||||
// NOTE: Upload updates `meta.Thanos.File` section. | ||||||
func Upload(ctx context.Context, logger log.Logger, bkt objstore.Bucket, bdir string, hf metadata.HashFunc) error { | ||||||
return UploadWithRetry(ctx, logger, bkt, bdir, hf, 0) | ||||||
} | ||||||
|
||||||
// UploadWithRetry is a utility function for upload and acts as a workaround for absence of default parameters (which in this case is retryCounter = 0). | ||||||
func UploadWithRetry(ctx context.Context, logger log.Logger, bkt objstore.Bucket, bdir string, hf metadata.HashFunc, retryCounter int) error { | ||||||
var flag bool = false | ||||||
|
||||||
df, err := os.Stat(bdir) | ||||||
if err != nil { | ||||||
return err | ||||||
|
@@ -120,29 +126,40 @@ func Upload(ctx context.Context, logger log.Logger, bkt objstore.Bucket, bdir st | |||||
} | ||||||
|
||||||
if meta.Thanos.Labels == nil || len(meta.Thanos.Labels) == 0 { | ||||||
return errors.New("empty external labels are not allowed for Thanos block.") | ||||||
return errors.New("empty external labels are not allowed for Thanos block") | ||||||
} | ||||||
|
||||||
meta.Thanos.Files, err = gatherFileStats(bdir, hf, logger) | ||||||
if err != nil { | ||||||
return errors.Wrap(err, "gather meta file stats") | ||||||
} | ||||||
|
||||||
metaEncoded := strings.Builder{} | ||||||
if err := meta.Write(&metaEncoded); err != nil { | ||||||
return errors.Wrap(err, "encode meta file") | ||||||
} | ||||||
|
||||||
if err := bkt.Upload(ctx, path.Join(DebugMetas, fmt.Sprintf("%s.json", id)), strings.NewReader(metaEncoded.String())); err != nil { | ||||||
return cleanUp(logger, bkt, id, errors.Wrap(err, "upload debug meta file")) | ||||||
if retryCounter == 5 { | ||||||
return cleanUp(logger, bkt, id, errors.Wrap(err, "upload debug meta file")) | ||||||
} | ||||||
flag = true | ||||||
Do we really need this flag? We can simply do

My first commit was exactly that, after which I shifted to this: Line 173 in fcb6293, Line 161 in fcb6293.

Rather than using recursive calls for this, you could create a loop that retries up to a maximum count. Something like this:

for i := 0; i < maxRetryCount; i++ {
	if err := bkt.Upload(...); err == nil {
		break // bucket upload was successful
	}
	log.Debug("retrying again...")
}

Yes, that makes sense. Instead of a plain for loop, I'll also go for exponential backoff for each upload.

Why can we not leverage or improve, e.g., https://github.com/thanos-io/thanos/blob/main/pkg/runutil/runutil.go#L86
||||||
} | ||||||
|
||||||
if err := objstore.UploadDir(ctx, logger, bkt, path.Join(bdir, ChunksDirname), path.Join(id.String(), ChunksDirname)); err != nil { | ||||||
return cleanUp(logger, bkt, id, errors.Wrap(err, "upload chunks")) | ||||||
if retryCounter == 5 { | ||||||
return cleanUp(logger, bkt, id, errors.Wrap(err, "upload chunks")) | ||||||
} | ||||||
flag = true | ||||||
} | ||||||
|
||||||
if err := objstore.UploadFile(ctx, logger, bkt, path.Join(bdir, IndexFilename), path.Join(id.String(), IndexFilename)); err != nil { | ||||||
return cleanUp(logger, bkt, id, errors.Wrap(err, "upload index")) | ||||||
if retryCounter == 5 { | ||||||
return cleanUp(logger, bkt, id, errors.Wrap(err, "upload index")) | ||||||
} | ||||||
flag = true | ||||||
} | ||||||
if flag && retryCounter < 5 { | ||||||
Why 5? You can answer the question here, but for anyone visiting this code in the future, it's a mystery number. We need to at least create a const for this, something like

Yes, it was only a first draft to check whether the logic works in the first place. I'd definitely go for a user-defined retryCounter value if the logic receives a green flag 💯
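One way to address both points (a named constant and a user-defined value) is sketched below. Every identifier here (`defaultMaxUploadRetries`, `uploadOptions`, `UploadOption`, `WithMaxRetries`, `resolveUploadOptions`) is hypothetical and chosen for illustration only; none of them comes from the PR or from Thanos.

```go
package main

import "fmt"

// defaultMaxUploadRetries gives a name to the literal 5 used in the diff.
// The name is an assumption for this sketch.
const defaultMaxUploadRetries = 5

// uploadOptions holds the tunable settings of a hypothetical upload call.
type uploadOptions struct {
	maxRetries int
}

// UploadOption mutates uploadOptions; callers pass zero or more of them.
type UploadOption func(*uploadOptions)

// WithMaxRetries overrides the default retry limit. Negative values are ignored.
func WithMaxRetries(n int) UploadOption {
	return func(o *uploadOptions) {
		if n >= 0 {
			o.maxRetries = n
		}
	}
}

// resolveUploadOptions starts from the defaults and applies overrides in order.
func resolveUploadOptions(opts ...UploadOption) uploadOptions {
	o := uploadOptions{maxRetries: defaultMaxUploadRetries}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Println(resolveUploadOptions().maxRetries)                  // default limit
	fmt.Println(resolveUploadOptions(WithMaxRetries(2)).maxRetries) // caller override
}
```

Variadic options keep existing call sites unchanged while letting new callers pick their own limit, which is a common Go substitute for default parameters.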
||||||
return UploadWithRetry(ctx, logger, bkt, bdir, hf, retryCounter+1) | ||||||
} | ||||||
|
||||||
// Meta.json always need to be uploaded as a last item. This will allow to assume block directories without meta file to be pending uploads. | ||||||
|
From the name UploadWithRetry, I would think it runs the Upload function with retries, but that is not the case here. It would have been better to have a function Upload(...) which handles the upload logic without retry, then an UploadWithRetries function that retries this Upload fn itself.

But we are only trying to retry when upload fails, so instead of having retries here, doesn't it make more sense to have an objstore.UploadDirWithRetries that will handle this transparently?

It is doing exactly that, isn't it? @prmsrswt 😕

The problem with a separate function would be that we'd have to carry over the error which caused the failure; otherwise, keeping track of which error caused it to fail would become difficult. The only reason I made the UploadWithRetries function was to keep the integrity of Upload intact and not add any other parameter to it.
It would have been really great if Golang supported default params 😁