Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object store clients: List() recursively #3784

Closed
pracucci opened this issue Feb 11, 2021 · 4 comments · Fixed by #3823
Closed

Object store clients: List() recursively #3784

pracucci opened this issue Feb 11, 2021 · 4 comments · Fixed by #3823

Comments

@pracucci
Copy link
Contributor

Is your proposal related to a problem?

Cortex internally uses the Thanos objstore clients. We have an use case for which we need to list objects recursively, basically setting delimiter="" instead of the default delimiter="/".

Describe the solution you'd like

I'm proposing to extend objstore.BucketReader interface adding the following:

type IterOptions struct {
    // When true the iteration ignores the '/' delimiter.
    Recursive bool
}

type BucketReader interface {
    IterWithOptions(ctx context.Context, dir string, options IterOptions, f func(string) error) error
}

Describe alternatives you've considered

  • Add options directly to Iter() but would break signature add extra boilerplate to add to each Iter() call
  • Add options are variadic at the end of Iter(), so IterWithOptions(ctx context.Context, dir string, f func(string) error, options ...IterOption) error: could work too, no big preference

Additional context

Quick analysis of supported backends:

  • S3
    • Minio already supports Recursive and we're currently hardcoding Recursive: false
  • GCS
    • storage.Query allows to customise the Delimiter and we're currently hardcoding it to / (setting it to empty string returns the whole tree under the provided prefix, which is what we want with recursive = true)
  • Azure
    • To list recursively we need to call ListBlobsFlatSegment() instead of ListBlobsHierarchySegment()
  • Filesystem
    • We could switch from ioutil.ReadDir() to filepath.Walk() and recursively enter directories as soon as we find them (to keep lexicographically order guarantee)
  • COS
    • If my chinese is good enough, according to the doc it's just about passing an empty delimiter (supported by the SDK)
  • OSS
    • Same here, according to doc an empty delimiter will list recursively
  • Swift
    • List objects with swift.ObjectsOpts{ Delimiter: 0 }
@pstibrany
Copy link
Contributor

Sounds like a nice extension to me (and something that we already use in Cortex, although not with Thanos bucket reader). Instead of hardcoding empty-delimiter, we could allow passing arbitrary delimiter, but support from various implementations may not be there.

Add options are variadic at the end of Iter(), so IterWithOptions(ctx context.Context, dir string, f func(string) error, options ...IterOption) error: could work too, no big preference

I like this option: It would stay just Iter, which is nice – no need to introduce new method, and no change required for existing callers of this method.

@pracucci
Copy link
Contributor Author

we could allow passing arbitrary delimiter

Minio doesn't allow it. Reason why I proposed Recursive bool (which would be supported by all backends as of today).

@bwplotka
Copy link
Member

Add options are variadic at the end of Iter(), so IterWithOptions(ctx context.Context, dir string, f func(string) error, options ...IterOption) error: could work too, no big preference

I think I would vote for this. The same method, just implementations has to be adjusted.

@bwplotka
Copy link
Member

Thank you for the amazing overview of all implementations. Happy to add recursion to APIs in such way objstore.WithRecursiveIter() or something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants