[v2] s3 Blob Store #841

Merged
merged 1 commit into from
Oct 29, 2024

Conversation

ian-shim
Contributor

Why are these changes needed?

Thin wrapper around s3 for blob storage

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

Comment on lines +25 to +42
func (b *BlobStore) StoreBlob(ctx context.Context, blobKey string, data []byte) error {
	err := b.s3Client.UploadObject(ctx, b.bucketName, blobKey, data)
	if err != nil {
		b.logger.Errorf("failed to upload blob in bucket %s: %v", b.bucketName, err)
		return err
	}
	return nil
}

// GetBlob retrieves a blob from the blob store
func (b *BlobStore) GetBlob(ctx context.Context, blobKey string) ([]byte, error) {
	data, err := b.s3Client.DownloadObject(ctx, b.bucketName, blobKey)
	if err != nil {
		b.logger.Errorf("failed to download blob from bucket %s: %v", b.bucketName, err)
		return nil, err
	}
	return data, nil
}
Contributor

When we eventually merge the fragmented uploading utility, we will want to revisit this code. A prerequisite to using fragmented uploads will be to store extra metadata for each blob (i.e. fragment size + blob size needed to download without round trips).

No need to wait to merge this PR. The fact that blobs are fragmented in S3 should be invisible to anything using this API. As long as we don't deploy this code to production environments, a switchover to the fragmented strategy won't be that tricky.
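
To illustrate why the extra metadata matters, here is a minimal sketch (not the fragmented uploading utility itself). It assumes a hypothetical `FragmentMetadata` record and a hypothetical `<blobKey>-<index>` naming scheme for fragments; neither comes from this PR. With the blob size and fragment size known up front, a reader could derive every fragment key without first listing the bucket.

```go
package main

import "fmt"

// FragmentMetadata is a hypothetical record stored alongside each blob.
// The type and its field names are assumptions made for illustration.
type FragmentMetadata struct {
	BlobSize     uint64 // total blob length in bytes
	FragmentSize uint64 // maximum number of bytes per fragment
}

// FragmentCount returns how many fragments the blob would be split into.
// It rounds up so that a short final fragment is still counted.
func (m FragmentMetadata) FragmentCount() uint64 {
	if m.BlobSize == 0 || m.FragmentSize == 0 {
		return 0
	}
	return (m.BlobSize + m.FragmentSize - 1) / m.FragmentSize
}

// FragmentKeys derives the object key of every fragment from the metadata
// alone. The "<blobKey>-<index>" scheme is an assumption, not the real layout.
func (m FragmentMetadata) FragmentKeys(blobKey string) []string {
	n := m.FragmentCount()
	keys := make([]string, 0, n)
	for i := uint64(0); i < n; i++ {
		keys = append(keys, fmt.Sprintf("%s-%d", blobKey, i))
	}
	return keys
}

func main() {
	meta := FragmentMetadata{BlobSize: 10, FragmentSize: 4}
	fmt.Println(meta.FragmentKeys("blob")) // prints [blob-0 blob-1 blob-2]
}
```

Because the keys are computed rather than discovered, all fragments could be requested at once, and callers of `GetBlob` would still just receive a single `[]byte`.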

Contributor Author

Not sure if we want the fragmented uploading in the blob store. This is not the relay store that stores chunks, but the one that stores whole blobs. I think this simple wrapper can serve the blob store in the long run too?

Contributor

I'm OK waiting to have the discussion about how we want to store blobs. It's mostly a question of whether we want to try to improve latency for really big blobs. Since we don't really support super big blobs currently, it might be something we can avoid dealing with for the MVP.

@ian-shim ian-shim merged commit 28f1709 into Layr-Labs:master Oct 29, 2024
7 checks passed

2 participants