Splitstore: Rework the Compaction algorithm to eliminate sorting #7137

vyzo · 2021-08-19T12:45:50Z

Context

The splitstore compaction algorithm currently includes a sorting step, whereby objects to be deleted are kept in memory and sorted so that the constituents of an object are never deleted before the object itself.
This ensures that there can be no dangling references, which is a critical safety property for two reasons:

the VM cannot recreate an object with dangling references during compaction
compaction is crash-safe -- there are no dangling references if lotus crashes during cold object purge.

The Problem

Unfortunately, this important property does not come for free:

it takes a long time (up to 50% of compaction time) to sort objects, as computing object weights needs to walk all those objects and recursively compute their weights.
it uses a lot of transient memory, which is linear in the size of the cold object set. This creates problems and can lead to OOMs in memory-constrained situations (e.g. lotus running with 32G or less RAM). Furthermore, it makes it very difficult or impossible to recover from long downtimes where the node needs to resync -- see Issues with syncing from scratch and long resyncs in the splitstore #6769.

Sort-free Compaction

With the introduction of on-disk marksets, we have pieces of the solution already.

The key property of the reworked algorithm is that we consider marking atomic. Once we are done marking, everything not in the markset is considered deleted; if the VM needs such an object it must recreate it, and the recreated object can then be tracked in the markset. This can be accomplished at the interface boundary.

The basic idea is that we keep both the cold set and the markset on disk, flushed at the end of marking, and use them to recover from a failed purge if we happen to crash during the critical section (when we actually delete objects in the hotstore).

The critical property we must ensure is that the markset is always flushed to disk before we start deletion and that the marksets themselves are crash-safe. This will need some work in our current implementation, both in the map markset backend so that it is flushed to disk and in the badger markset to flush and directly write once we have flushed.

Another observation is that with cold object reification (see #6726) we only need to do this when we operate with the discard coldstore.
It is fine to leave dangling references in the hotstore if we do have a coldstore to fall back on, as these will be fixed once there is an access to them with reification -- a nice self-healing property.

vyzo self-assigned this Aug 19, 2021

vyzo added epic/splitstore team/ignite labels Aug 19, 2021

jennijuju added this to the v1.13.3 milestone Dec 8, 2021

jennijuju removed the epic/splitstore label Dec 13, 2021

jennijuju modified the milestones: v1.13.3, v1.15.0 Jan 2, 2022

jennijuju modified the milestones: v1.15.0, v1.15.1 Jan 17, 2022

vyzo mentioned this issue Feb 2, 2022

splitstore sortless compaction #8008

Merged

magik6k closed this as completed in #8008 Feb 9, 2022
