
Setting up a cluster with 8 nodes #24

Open
rohankadekodi opened this issue Jan 17, 2022 · 6 comments

@rohankadekodi

Hello,

I wish to create an Assise cluster with 8 nodes, with 3-way replication of data. How do I set up the nodes with replicas manually? I am guessing that we need to make certain modifications in libfs/src/distributed/rpc_interface.h.

For example, should I set g_n_hot_reps = 8 (the total number of nodes in the cluster)?

How do I configure which SharedFS replicates which parts of the cached file system namespace?

Thanks for your help.

wreda commented Jan 17, 2022

Our artifact currently doesn't support multiple namespaces, but you can likely still get this working with a few tweaks.

The quickest way is to create separate KernFS configurations for each replica group and assign them to distinct NVM namespaces. You should set g_n_hot_reps = 3 since you're doing 3-way replication. You might need to assign different port numbers to your KernFS instances if they're sharing the same machines. You can set this using the environment variable PORTNO for both LibFS and KernFS.
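A minimal sketch of the port-separation step described above, assuming two replica groups share the same machines. `PORTNO` is the environment variable mentioned in this thread; the `run.sh kernfs` launch command and its path are assumptions based on the Assise repo layout, so adjust to your checkout.

```shell
# Hypothetical sketch: two KernFS replica groups on shared machines,
# distinguished by port number via the PORTNO environment variable.

export PORTNO=12345                    # replica group A
# (cd kernfs/tests && sudo -E ./run.sh kernfs) &   # launch command assumed

export PORTNO=12346                    # replica group B
# (cd kernfs/tests && sudo -E ./run.sh kernfs) &   # launch command assumed

echo "replica group B port: $PORTNO"
```

The same `PORTNO` value must be exported for the LibFS processes that talk to each group, so they connect to the right KernFS instances.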

I'd be interested in adding proper support for namespaces, so pull requests are also welcome here.

rohankadekodi commented Jan 17, 2022

For a single namespace, is there a way to run the current prototype with 4 or more machines and 3-way replication of data? For example, if I want to run RocksDB or Filebench with 4 nodes, how should I configure the cluster?

I can give namespace support a shot soon.

wreda commented Jan 17, 2022

You can try setting g_n_hot_reps = 4 and then manually override the replication factor in LibFS by setting the environment variable MLFS_RF to 3. This will skip the last replica in the chain.
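The two-part tweak above can be sketched as follows. It assumes `g_n_hot_reps` is declared in `libfs/src/distributed/rpc_interface.h` as discussed earlier in this thread (shown here as a `#define`; check the actual declaration in your checkout), and that both LibFS and KernFS are rebuilt after the edit.

```shell
# Step 1 (source edit, then rebuild):
#
#   // libfs/src/distributed/rpc_interface.h
#   #define g_n_hot_reps 4    /* total nodes in the cluster */
#
# Step 2 (runtime override): cap the replication factor at 3 so the
# last replica in the chain is skipped, per the suggestion above.
export MLFS_RF=3

echo "MLFS_RF=$MLFS_RF"
# ./run.sh <your_benchmark>   # launch the LibFS app as usual (command assumed)
```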

This is a bit hackish, though, so there's no guarantee it'll work out of the box. Feel free to write back here if you run into issues, and I'll help debug.

@rohankadekodi

I tried this, and all the kernfs instances segfaulted at:
digest_logs()->digest_inode()->mlfs_mark_inode_dirty()->rb_insert()->inode_cmp()

I can also provide access to this cluster if that helps.

wreda commented Jan 18, 2022

Thanks for the update. I'll check on my end first and get back to you.

wreda commented Jan 19, 2022

I wasn't able to get this working properly yet, and it will likely require some non-trivial changes. For now, I'd recommend limiting your cluster size to the number of replicas.

I'll keep this issue open and will take another stab once I have free cycles.
