Disable garbage collection in the default IPFS node #277
Comments
I'm in favor of @Wondertan's suggestion. It's certainly the simplest and probably the fastest as well. Later we can explore how to pin and tweak GC.
It sounds like we're going to disable the garbage collection of IPFS, so I'm going to go ahead and change the title of this issue to reflect that.
After investigating the code, it turns out that GC is not enabled in IPFS by default and we were never actually using it, so I don't understand why we were pinning at all. Maybe we misunderstood its purpose. Anyway, I am closing the issue and suggest we stop caring about pinning from now on. The only reason pinning exists is to prevent GC from cleaning up stale data blocks. That makes sense for IPFS use cases, but not for ours, as we don't have a concept of "outdated" or "transient" data blocks. Even if some light client configuration needed to store chain blocks or square samples temporarily, the GC from IPFS wouldn't help there. The place where the IPFS daemon enables GC does so only if a flag is provided. When IPFS is used as a library (our case), it never runs GC at all, and its GC configuration is ignored. It seems a library user is expected to dig out the place deep in the package tree that actually runs GC through an undocumented API.
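To illustrate why pinning has no effect when GC never runs, here is a minimal toy model in Go. It is only a sketch: `Store`, `Put`, `Pin`, and `GC` are hypothetical names made up for this example and are not the go-ipfs API, and real IPFS pins can be recursive over a DAG, which this model ignores.

```go
package main

import "fmt"

// Store is a toy blockstore that maps a block key to its data and tracks pins.
// Hypothetical model for illustration only, not the go-ipfs API.
type Store struct {
	blocks map[string][]byte
	pinned map[string]bool
}

// NewStore returns an empty toy store.
func NewStore() *Store {
	return &Store{blocks: map[string][]byte{}, pinned: map[string]bool{}}
}

// Put stores a block under the given key.
func (s *Store) Put(key string, data []byte) { s.blocks[key] = data }

// Pin marks a block as protected from GC.
func (s *Store) Pin(key string) { s.pinned[key] = true }

// GC deletes every block that is not pinned and returns how many it removed.
func (s *Store) GC() int {
	removed := 0
	for key := range s.blocks {
		if !s.pinned[key] {
			delete(s.blocks, key)
			removed++
		}
	}
	return removed
}

func main() {
	s := NewStore()
	s.Put("a", []byte("pinned block"))
	s.Put("b", []byte("unpinned block"))
	s.Pin("a")

	// As long as GC is never called, both blocks survive and the pin is irrelevant.
	fmt.Println("blocks before GC:", len(s.blocks))

	// Only when GC runs does the pin make a difference.
	fmt.Println("removed by GC:", s.GC())
	fmt.Println("blocks after GC:", len(s.blocks))
}
```

In this model, a store whose `GC` is never invoked keeps every block regardless of pins, which matches the library-mode behavior described above.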
Previously we were pinning all block data during proposal and during testing. This had so much overhead that it made the tests time out (see #275). This can be solved by simply not pinning by default, but this outsources our data retention policy to IPFS's garbage collection. In the short term, this should work fine. Beyond the short term, we need to have a well-defined default data retention policy.
@Wondertan suggested turning off garbage collection. This has the benefit of no overhead from GC or pinning, and basically does the same thing as pinning (at least to my understanding).
@liamsi suggested not waiting for pinning to finish, or pinning at a different point in time.
If possible, our strategy should also include some details on which IPLD nodes to delete once disk space is X% full. The IPFS docs mention
StorageGCWatermark
GCPeriod
but these only determine when GC gets turned on, not which IPLD nodes to delete.
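For reference, here is a rough sketch of the kind of check a watermark setting implies, assuming the watermark is a percentage of a configured maximum repo size. The function name `shouldRunGC` and its parameters are hypothetical and not taken from go-ipfs. Note that it only answers "should GC run now?" and says nothing about which nodes get deleted, which is the gap described above.

```go
package main

import "fmt"

// shouldRunGC reports whether repo usage has crossed the watermark.
// used and storageMax are in bytes; watermarkPct is a percentage (e.g. 90).
// Hypothetical helper for illustration only, not the go-ipfs implementation.
func shouldRunGC(used, storageMax, watermarkPct uint64) bool {
	threshold := storageMax * watermarkPct / 100
	return used > threshold
}

func main() {
	const gb = uint64(1_000_000_000)
	storageMax := 10 * gb        // assumed maximum repo size
	watermarkPct := uint64(90)   // assumed watermark percentage

	fmt.Println(shouldRunGC(8*gb, storageMax, watermarkPct))           // below the watermark
	fmt.Println(shouldRunGC(9*gb+gb/2, storageMax, watermarkPct))      // above the watermark
}
```

A retention policy for us would need an extra step on top of this: a rule for choosing which blocks to drop once the threshold is crossed.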