
VolumeSpec value can bloat as cluster size (number of nodes) increases due to allowed_nodes and preferred_nodes #1154

Closed
hasethuraman opened this issue Jul 7, 2022 · 3 comments · Fixed by openebs/mayastor-control-plane#275

@hasethuraman

Describe the bug
I created a cluster with 4 nodes (1 master and 3 agents) and see that all nodes are added to allowed_nodes and preferred_nodes. When the cluster size increases and no topology information is supplied, every node ends up captured in these sections. With thousands of volumes in such a large cluster, this can noticeably increase etcd disk usage and overall latency.

/namespace/mayastor/control-plane/VolumeSpec/38098332-3acc-4850-874b-a5315acf3dce
{
  "uuid": "38098332-3acc-4850-874b-a5315acf3dce",
  ....
  "topology": {
    "node": {
      "Explicit": {
        "allowed_nodes": [
          "k8s-agentpool1-40851847-0",
          "k8s-agentpool1-40851847-1",
          "k8s-master-40851847-0",
          "k8s-agentpool1-40851847-2"
        ],
        "preferred_nodes": [
          "k8s-agentpool1-40851847-2",
          "k8s-master-40851847-0",
          "k8s-agentpool1-40851847-0",
          "k8s-agentpool1-40851847-1"
        ]
      }
    }...
}
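
For illustration only, a minimal Rust sketch of why the spec grows with cluster size; the ExplicitNodeTopology type and the function name here are assumptions modeled on the JSON above, not the actual control-plane code:

// Sketch only: hypothetical types modeled on the JSON above; the real
// control-plane definitions may differ.
struct ExplicitNodeTopology {
    allowed_nodes: Vec<String>,
    preferred_nodes: Vec<String>,
}

// Without any topology constraint on the StorageClass/PVC, every cluster
// node is copied into both lists, so each VolumeSpec carries roughly
// 2 * N node names; with thousands of volumes on a large cluster this
// multiplies into a noticeable amount of data stored in etcd.
fn explicit_topology_for_all_nodes(cluster_nodes: &[String]) -> ExplicitNodeTopology {
    ExplicitNodeTopology {
        allowed_nodes: cluster_nodes.to_vec(),
        preferred_nodes: cluster_nodes.to_vec(),
    }
}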

To Reproduce
Steps to reproduce the behavior: create a multi-node cluster, provision a volume without any topology constraints in the StorageClass/PVC, and inspect the resulting VolumeSpec in etcd (as in the sample above).

Expected behavior
Should we really capture all the nodes, or only capture the nodes where the replicas are present?


OS info:

  • MayaStor revision or container image: develop


@tiagolobocastro
Contributor

@hasethuraman I'm not sure why we're doing this; it seems we're conflating accessibility for the application with data placement, unless I'm misunderstanding.
I can't think of a reason to keep it, so it's probably safe to omit these nodes until we have such a need.

@hasethuraman
Author

Thanks @tiagolobocastro. Please let me know when you have any update on the fix and its timeline.

I may well be wrong with this suggestion: instead of omitting the nodes completely, I think only the necessary nodes (the ones hosting replicas) could be kept, and the rest of the nodes in that array omitted. This information may be helpful for an admin querying etcd/Mayastor to find the location of the replicas. Since I am not familiar with Mayastor, if my suggestion is wrong or doesn't add any value here, please ignore it.
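
A minimal Rust sketch of that idea, with a hypothetical helper and inputs rather than actual Mayastor code: the stored lists would simply be filtered down to the nodes that actually host replicas.

// Sketch only: keep just the nodes that actually host replicas, so an admin
// querying etcd can still see replica placement without storing every node.
// `replica_nodes` is a hypothetical input listing where replicas were placed.
fn trim_to_replica_nodes(all_nodes: &[String], replica_nodes: &[String]) -> Vec<String> {
    all_nodes
        .iter()
        .filter(|n| replica_nodes.contains(n))
        .cloned()
        .collect()
}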

@hasethuraman
Author

I think this would be a better way:

If topology information (the user's intent) is sent to the CSI driver through the StorageClass or PVC, Mayastor's VolumeSpec can have a new topology field to capture that topology info (which can be consumed in the future, for example after a cluster restart or a scale-out), avoiding allowed_nodes and preferred_nodes entirely.

If topology information is not provided, then all nodes are eligible to provision the replicas; allowed_nodes and preferred_nodes can still be [] and the topology key will be nil.
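
As a rough Rust sketch of this proposal (the type and field names are assumptions, not the actual VolumeSpec definition), the stored topology could be optional and only populated when the user supplies constraints:

// Sketch only, assuming hypothetical types: the user's topology request
// (from the StorageClass/PVC) is stored only when it was actually provided;
// otherwise the field is None and no per-node lists are persisted at all.
struct VolumeSpecSketch {
    uuid: String,
    // None => no constraint, every node is eligible for replica placement.
    topology: Option<RequestedTopology>,
}

struct RequestedTopology {
    // Labels/keys as requested by the user, e.g. via StorageClass parameters.
    node_selector: std::collections::HashMap<String, String>,
}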
