Make badger-ds the default datastore #4279

Open
9 of 14 tasks
Stebalien opened this issue Oct 5, 2017 · 22 comments
Labels
epic kind/feature A new feature status/deferred Conscious decision to pause or backlog topic/badger Topic badger topic/meta Topic meta topic/repo Topic repo
Comments

@Stebalien
Member

Stebalien commented Oct 5, 2017

This is the master issue to centralize all other issues/PRs related to the Badger transition.

The priorities to check before the transition are:

  • DB integrity. Besides minimizing data loss after errors such as a system crash or running out of disk space, the database must always stay consistent enough for Badger (and IPFS) to start. Some truncation may be needed, for example, but the scenario to avoid is Badger encountering a DB it can't work with (e.g., a failed assertion that Badger doesn't know how to recover from) and refusing to start (which would mean IPFS would not work) without manual intervention, which can't be expected from a normal end user.

  • Performance in worst-case scenarios. We are transitioning from flat file-system storage (one key, one file), which in most cases performs (much) worse than Badger. However, there are some scenarios (e.g., GC or some search cases) where a flat architecture may outperform Badger (or any other LSM architecture, for that matter). Those cases should be minimized as much as possible so the end user won't notice the transition.
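
For readers unfamiliar with the "one key, one file" layout mentioned above, here is a minimal Python sketch of the idea. It is a toy illustration only, not the actual flatfs implementation: the `FlatStore` class, its hex file-naming scheme, and the `demo` helper are all hypothetical. It shows why deletion (the core of GC) is cheap in a flat layout: removing a key is a single unlink, whereas an LSM store generally records a tombstone and reclaims space later during compaction.

```python
import os
import tempfile


class FlatStore:
    """Toy 'one key, one file' store (hypothetical; not the real flatfs layout)."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Assumed naming scheme: hex-encode the key so it is a safe file name.
        return os.path.join(self.root, key.encode("utf-8").hex() + ".data")

    def put(self, key, value):
        # Write to a temporary file and rename it into place, so readers
        # never observe a half-written value under the final name.
        tmp = self._path(key) + ".tmp"
        with open(tmp, "wb") as f:
            f.write(value)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path(key))

    def get(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

    def delete(self, key):
        # Deleting a key is a single unlink; the file system frees the space.
        os.remove(self._path(key))

    def keys(self):
        # Enumerating keys is a directory listing; leftover .tmp files are skipped.
        return sorted(
            bytes.fromhex(name[: -len(".data")]).decode("utf-8")
            for name in os.listdir(self.root)
            if name.endswith(".data")
        )


def demo():
    with tempfile.TemporaryDirectory() as root:
        store = FlatStore(root)
        store.put("block-a", b"hello")
        store.put("block-b", b"world")
        store.delete("block-a")
        return store.keys(), store.get("block-b")
```

Calling `demo()` stores two blocks, deletes one, and returns the remaining key list together with the surviving value. The trade-off is that every read and write pays file-system overhead per key, which is where an LSM store such as Badger typically does better.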

The active issues (mostly the ones tagged with badger) are:

@Stebalien Stebalien changed the title Make badger-ds default Make badger-ds the default datastore Oct 5, 2017
@Stebalien Stebalien added kind/feature A new feature topic/repo Topic repo labels Oct 13, 2017
@whyrusleeping whyrusleeping added P0 Critical: Tackled by core team ASAP status/ready Ready to be worked labels Oct 17, 2017
@schomatis schomatis added the topic/badger Topic badger label May 3, 2018
@schomatis
Contributor

@Stebalien Do you mind if I hijack this issue to keep track of all the other issues related to the Badger transition?

@Stebalien
Member Author

@schomatis go right ahead!

@ajbouh

ajbouh commented Jul 17, 2018

Is this still on track to happen sometime soon?

@Stebalien
Member Author

Stebalien commented Jul 17, 2018

In addition to the issues listed in the description, we're still working through some recovery issues (not a bug) and memory usage is pretty bad (we may be able to tune this a bit but I'm getting some really weird behavior on Linux) (can't reproduce anymore).

Basically, we can't roll this out until:

  1. We can always recover after a crash.
  2. It doesn't eat RAM needlessly.

@schomatis
Contributor

Thanks for the reference, I should add those.

@Stebalien Stebalien added status/deferred Conscious decision to pause or backlog and removed status/ready Ready to be worked labels Dec 18, 2018
@djdv
Contributor

djdv commented Jan 29, 2019

I doubt this affects most users but I'm linking it anyway.
My own instance of IPFS runs with flatfs hosted on an SMB/CIFS share.
badger doesn't currently handle this: dgraph-io/badger#699
although it can.

For full context, I do this because my local disks are small. And I can't run IPFS on the remote machine because components of libp2p don't build on Solaris yet.
(when trying to port it I encountered an oddity where the Go standard library says something is implemented but it isn't)

@Kubuxu
Member

Kubuxu commented Feb 5, 2019

@djdv that is why we provide other datastore implementations and simple switches to initialise a repo with different configurations.

@eingenito eingenito added the topic/meta Topic meta label Feb 11, 2019
@magik6k magik6k self-assigned this Feb 11, 2019
@Stebalien
Member Author

@magik6k IMO, we should be able to graduate badger from experimental even if we don't go ahead and make it the default. However, we may want to land ipfs/go-ds-badger#51 first.

@ZerxXxes

Hey, what's the status here? I see that ipfs/go-ds-badger#51 is closed. Does that mean the badger datastore can be considered pretty mature now?

@Kubuxu
Member

Kubuxu commented Nov 28, 2019

We should probably update to badger v2 before using it as default.

@dokterbob
Contributor

In this context, I would like to point out the following issue, a huge problem with garbage not being collected: ipfs/go-ds-badger#54 (comment)

@Stebalien
Member Author

Yep, I've added that to the list. Unfortunately, IIRC, badger v2 had its own issues so we're on to badger v3 now.

@dokterbob
Contributor

May I inquire about the progress on this one?

@BigLep BigLep moved this to 🥞 Todo in IPFS Shipyard Team Mar 3, 2022
@BigLep BigLep modified the milestones: go-ipfs 0.13, TBD Mar 3, 2022
@godcong
Contributor

godcong commented Mar 4, 2022

GC doesn't actually work in v1: ipfs/go-ds-badger#54 (comment)

v1 seems to be unmaintained, and it is unlikely that anyone will ever fix this there.
Wouldn't it be better to try another plan, such as using v2 or v3?

@guseggert
Contributor

We will not make v1 the default as it is unmaintained, and v2 has issues as @Stebalien pointed out. DGraph, the company that maintains BadgerDB, is undergoing a leadership shake-up, so we're hesitant to make v3 the default until we are confident that v3 will be maintained in the long run.

@dokterbob
Contributor

@guseggert Thanks for the update. Is there any news on this by now?
Are you looking into alternative datastores?

@Jorropo
Contributor

Jorropo commented Jun 13, 2022

> Are you looking into alternative datastores?

flatfs

I want to write an LSM datastore with reflinking, but I won't work on this before #8201 is a thing.
