kvs: support splitting directories over multiple blobs #1206

Open
garlick opened this issue Sep 26, 2017 · 0 comments
garlick commented Sep 26, 2017

Similar to issue #1202 on splitting valref objects over multiple blobs: RFC 11 defines a dirref object in terms of an array of blobrefs, but the implementation currently supports only one. As with valref, large directories should be split when transferred between the API and the KVS service to avoid head-of-line blocking.
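To illustrate the idea, here is a minimal Python sketch of splitting an encoded directory into blobs and building a dirref that references all of them. The split threshold, helper names, and exact treeobj layout are illustrative assumptions, not the flux-core implementation:

```python
import hashlib

BLOB_MAX = 1048576  # illustrative 1 MiB split threshold, not flux-core's actual limit

def blobref(blob: bytes) -> str:
    # blobref string in the "sha1-" + hex-digest style used for content storage
    return "sha1-" + hashlib.sha1(blob).hexdigest()

def split_dir(encoded_dir: bytes, blob_max: int = BLOB_MAX):
    """Split an encoded directory into chunks of at most blob_max bytes
    and build a dirref treeobj whose "data" array references every chunk."""
    blobs = [encoded_dir[i:i + blob_max]
             for i in range(0, max(len(encoded_dir), 1), blob_max)]
    dirref = {"ver": 1, "type": "dirref", "data": [blobref(b) for b in blobs]}
    return blobs, dirref
```

A single-blob directory is just the degenerate case where the "data" array has one element.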

looking up a directory
When the lookup API functions request a directory with FLUX_KVS_READDIR, and the KVS service looks it up to find a multi-blob dirref, the KVS service should return the dirref rather than the assembled dir object. The client API should then load the pieces from the content store before assembling and decoding them, and fulfilling the lookup's future.
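The client-side assembly step might look like the following sketch, where a `content_load` callback stands in for a content-store fetch; the toy store contents and blobref strings are made up for illustration:

```python
import json

def load_dir(dirref: dict, content_load) -> dict:
    """Client-side assembly: fetch each blob named in the dirref's "data"
    array, concatenate in order, then decode the directory object."""
    encoded = b"".join(content_load(ref) for ref in dirref["data"])
    return json.loads(encoded)

# toy stand-in for the content store; blobrefs and contents are made up
store = {
    "sha1-aaaa": b'{"a": ',
    "sha1-bbbb": b'1, "b": 2}',
}
dirref = {"ver": 1, "type": "dirref", "data": ["sha1-aaaa", "sha1-bbbb"]}
assert load_dir(dirref, store.__getitem__) == {"a": 1, "b": 2}
```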

writing a directory
It is common to commit an empty dir (the result of a mkdir), and currently I believe committing a non-empty dir object would also be allowed, at least as far as the KVS service is concerned. However, since I'm not sure there is a use case for that, an API that automatically splits large directories on the client end is probably not required.

manipulating directories during commits and lookups
More commonly, KVS service internals must assemble and decode directories in order to follow a path lookup or process a commit. Here, code is needed to split a directory across multiple blobs when it reaches a threshold size. This complicates the KVS's internal cache, which maps hash references to decoded JSON objects (such as directories). Possibly the encoded dirref object, rather than a single SHA-1 hash, becomes the key to the cache entry? (Some design work needs to be done here.)
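One option, sketched here in Python with hypothetical names, is to key the cache by the full ordered tuple of blobrefs from the dirref rather than by a single hash:

```python
class TreeobjCache:
    """Sketch of a cache for decoded directories keyed by the full tuple
    of blobrefs in the dirref, rather than a single SHA-1 hash, so that a
    multi-blob directory maps to one cache entry."""

    def __init__(self):
        self._entries = {}

    def _key(self, dirref: dict) -> tuple:
        # the ordered blobref list uniquely identifies the assembled object
        return tuple(dirref["data"])

    def get(self, dirref: dict):
        return self._entries.get(self._key(dirref))

    def put(self, dirref: dict, decoded: dict):
        self._entries[self._key(dirref)] = decoded
```

This keeps lookups O(1) but means a cache entry is invalidated as a unit whenever any constituent blob changes.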

really large directories
Splitting directories over multiple blobs in some respects creates additional overhead for large directories. For example, adding an entry to a multi-blob directory requires the entire object to be assembled and decoded, modified as JSON, re-encoded, and disassembled. The RFC 11 hdir object is the proposed solution to this problem, covered in #1207
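That whole-object read-modify-write cycle can be sketched as follows (the tiny `blob_max` and helper name are illustrative only):

```python
import json

BLOB_MAX = 8  # tiny illustrative threshold; a real limit would be much larger

def add_entry(blobs, key, value, blob_max=BLOB_MAX):
    """Add one entry to a directory stored across multiple blobs.
    Every step below touches the whole object, which is the overhead
    motivating the RFC 11 hdir design."""
    d = json.loads(b"".join(blobs))               # assemble and decode
    d[key] = value                                # modify as JSON
    enc = json.dumps(d, sort_keys=True).encode()  # re-encode
    # disassemble back into blob-sized pieces
    return [enc[i:i + blob_max] for i in range(0, len(enc), blob_max)]
```

Even a one-entry change rewrites every blob, whereas an hdir would localize the change to one hash bucket.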

Internal design discussion points

  • how to key the cache for a multi-blob dir object
  • could we possibly dispense with this issue and jump directly to the hdir object when blob size threshold is reached? (would require hdir resize upon reaching blob size threshold in any given hash bucket)