
GC has not collected an expired object #2858

Closed
roman-khimov opened this issue May 30, 2024 · 2 comments · Fixed by #2863
Labels
bug Something isn't working I4 No visible changes neofs-storage Storage node application issues S4 Routine U2 Seriously planned

@roman-khimov (Member)

Expected Behavior

Expired objects are deleted. Even if the node was down for some time, it should delete them afterwards.

Current Behavior

May 30 14:54:27 metis2 neofs-node[656]: 2024-05-30T14:54:27.505Z        error        replicator/process.go:76        could not replicate object        {"component": "Object Replicator", "node": "03aeff8a19f0202090afb0916b1c00b432321be7e8623a06c9b9b5db8ee5c053a4", "object": "HXSaMJXk2g8C14ht8HSi7BBaiYZ1HeWh2xnWPGQCg4H6/6Ha2WhpvBsAeT23jsa9Ak8DVrqVB97xLXTSeLrjYnoGD", "error": "copy object using NeoFS API client of the remote node: status: code = 1024 message = failed to verify and store object locally: validate object format: object did not pass expiration check: object has expired"}

The node tries to replicate the same set of objects again and again, which means GC has failed to do its job.

Possible Solution

Fix GC.

Steps to Reproduce (for bugs)

Shut an existing node down for some time and expand it with one shard (3->4).

Your Environment

  • Version used: 0.42.0
  • Server setup and configuration: mainnet
  • Operating System and version (uname -a): Debian stable
@roman-khimov roman-khimov added bug Something isn't working neofs-storage Storage node application issues U2 Seriously planned S4 Routine I4 No visible changes labels May 30, 2024
@roman-khimov roman-khimov added this to the v0.43.0 milestone May 30, 2024
@roman-khimov (Member, Author)

May 30 17:58:59 metis4 neofs-node[4408]: 2024-05-30T17:58:59.789Z        error        replicator/process.go:76        could not replicate object        {"component": "Object Replicator", "node": "03aeff8a19f0202090afb0916b1c00b432321be7e8623a06c9b9b5db8ee5c053a4", "object": "HXSaMJXk2g8C14ht8HSi7BBaiYZ1HeWh2xnWPGQCg4H6/HmECRDC25qyX8MxNHeA21PK8aymLZcPSYQKyLBNBiTUM", "error": "copy object using NeoFS API client of the remote node: status: code = 1024 message = failed to verify and store object locally: validate object format: object did not pass expiration check: object has expired"}

@carpawell (Member) commented May 31, 2024

They are parts of a big (V1, hehe) object. This still needs to be investigated, but it seems to me that something like this happened:

  1. V1 parts had (or should have had, but did not?) an expiration attribute in every part.
  2. There is no expiration handling for its parts in GC: Implement API #390 (at least I do not see it); it only handles small objects, or sets a GC mark for the root object and "drops" the non-existing big object from the blobstor.
  3. The new V2 object scheme landed and brought a nice check for replicated small objects (both for V1 and V2 big objects):

    ```go
    if obj.HasParent() {
        if splitID != nil {
            // V1 split
            if firstSet {
                return errors.New("v1 split: first object ID is set")
            }
        } else {
            // V2 split
            if !firstSet {
                // first part only
                if obj.Parent() == nil {
                    return errors.New("v2 split: first object part does not have parent header")
                }
            } else {
                // 2nd+ parts
                typ := obj.Type()
                // link object only
                if typ == object.TypeLink && (par == nil || par.Signature() == nil) {
                    return errors.New("v2 split: incorrect link object's parent header")
                }
                if _, hasPrevious := obj.PreviousID(); typ != object.TypeLink && !hasPrevious {
                    return errors.New("v2 split: middle part does not have previous object ID")
                }
            }
        }
    }
    ```
  4. The check started to work and no longer allows replication of expired parts, but GC still does not expire small parts.
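For context, the "object has expired" error in the logs above boils down to a comparison of the object's expiration attribute with the node's current epoch. Below is a minimal, self-contained sketch of such a check; the attribute name follows the NeoFS well-known attribute convention, while the map-based header is a simplified stand-in for the real object header type:

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// attrExpirationEpoch is the well-known NeoFS attribute holding the last
// epoch in which an object is still considered valid.
const attrExpirationEpoch = "__NEOFS__EXPIRATION_EPOCH"

// checkExpiration rejects an object once the node's current epoch is past
// the object's expiration epoch. Objects without the attribute never expire.
func checkExpiration(attrs map[string]string, currentEpoch uint64) error {
	v, ok := attrs[attrExpirationEpoch]
	if !ok {
		return nil // no expiration attribute: the object never expires
	}
	exp, err := strconv.ParseUint(v, 10, 64)
	if err != nil {
		return fmt.Errorf("invalid expiration attribute: %w", err)
	}
	if currentEpoch > exp {
		return errors.New("object has expired")
	}
	return nil
}

func main() {
	attrs := map[string]string{attrExpirationEpoch: "100"}
	fmt.Println(checkExpiration(attrs, 99))  // still valid: prints <nil>
	fmt.Println(checkExpiration(attrs, 101)) // prints the expiration error
}
```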

The main solution, for now, should be the following: when GC sees an expired big object, it should expire every one of its parts, not just mark the root object as deleted.

carpawell added a commit that referenced this issue Jun 10, 2024
Closes #2858.

Signed-off-by: Pavel Karpy <[email protected]>
carpawell added a commit that referenced this issue Jun 10, 2024
An object can expire while waiting for Replicator's attention. Also, it would
have helped in problems like #2858.

Signed-off-by: Pavel Karpy <[email protected]>
@roman-khimov roman-khimov modified the milestones: v0.43.0, v0.42.1 Jun 13, 2024
roman-khimov added a commit that referenced this issue Jun 13, 2024