
Auto extend node subnet #155

Open · 4 tasks
giobart opened this issue Dec 8, 2023 · 4 comments · May be fixed by #156

giobart (Contributor) commented Dec 8, 2023

Short

Each node subnet consists of 64 addresses. If a worker needs more addresses, it should issue an additional request to extend its subnetwork.

Proposal

At worker initialization time, the worker requests a net size of 64 as usual. Then, every time the address space is exhausted because a worker hosts more than 64 networked containers, we should extend it with a new request that assigns an additional subnet to that worker.

One possible solution (sketched below) would be to:
-> request a new subnet whenever the current one is exhausted inside env.generateAddress()
-> store the newly obtained addresses inside env.addrCache
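
A minimal Go sketch of how such an on-demand extension could look. Everything here is illustrative: the Environment struct, its mutex, and the requestSubnet helper are hypothetical stand-ins for the NetManager internals and the cluster-manager call; only the env.generateAddress() and env.addrCache names come from the proposal above.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// Environment is an illustrative stand-in for the NetManager environment.
type Environment struct {
	mu        sync.Mutex
	addrCache []net.IP // free addresses from all subnets assigned to this worker
}

// requestSubnet stands in for the call that asks the cluster manager for an
// additional 64-address subnet for this worker. Here it simply fabricates one.
func (env *Environment) requestSubnet() ([]net.IP, error) {
	_, subnet, err := net.ParseCIDR("10.30.0.0/26") // placeholder subnet
	if err != nil {
		return nil, err
	}
	var addrs []net.IP
	for ip := subnet.IP.Mask(subnet.Mask); subnet.Contains(ip); ip = nextIP(ip) {
		addrs = append(addrs, append(net.IP(nil), ip...))
	}
	return addrs, nil // real code would also skip network/broadcast/gateway addresses
}

// generateAddress hands out the next free address and transparently extends
// the worker's address space when the cache runs dry.
func (env *Environment) generateAddress() (net.IP, error) {
	env.mu.Lock()
	defer env.mu.Unlock()

	if len(env.addrCache) == 0 {
		// Address space exhausted: ask the cluster manager for another subnet.
		addrs, err := env.requestSubnet()
		if err != nil {
			return nil, fmt.Errorf("subnet exhausted and extension failed: %w", err)
		}
		env.addrCache = append(env.addrCache, addrs...)
	}

	ip := env.addrCache[0]
	env.addrCache = env.addrCache[1:]
	return ip, nil
}

// nextIP returns a copy of ip incremented by one.
func nextIP(ip net.IP) net.IP {
	next := append(net.IP(nil), ip...)
	for i := len(next) - 1; i >= 0; i-- {
		next[i]++
		if next[i] != 0 {
			break
		}
	}
	return next
}

func main() {
	env := &Environment{}
	ip, err := env.generateAddress()
	if err != nil {
		panic(err)
	}
	fmt.Println("allocated", ip)
}
```

With this shape, callers of generateAddress() never see the extension; they only pay the extra round-trip to the cluster manager the first time a subnet runs out.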

Rationale

Removes the current limitation on the number of networked containers per worker.

Impact

NetManager, and possibly the cluster manager.

Development time

1 week

Status

Finding a solution

Checklist

  • Discussed
  • Documented
  • Implemented
  • Tested
@giobart added the enhancement (New feature or request) label Dec 8, 2023
@giobart self-assigned this Dec 8, 2023
giobart (Contributor, Author) commented Dec 8, 2023

@smnzlnsk what do you think about it?

smnzlnsk (Collaborator) commented Dec 8, 2023

A couple of points that popped up:

  • When would this be needed? When a cluster is completely out of options? Won't this create a 'super' node if the deployed services are idle, but the scheduler keeps deploying on that one node because, from a scheduling standpoint, it seems fine? That one node would keep requesting address space, or is there an upper limit planned?
  • Are we planning on making the scheduler respect the available addresses of worker nodes?
  • Assume we allow this and have a weak node. If that node keeps getting chosen by the scheduler and keeps deploying services, which then all have a surge in traffic and cause the node to crash, won't this create even more re-scheduling effort?

I think there are a lot of variables we need to respect before going ahead with this. In general this seems like a good idea, though, iff the node will be able to withstand the higher strain in the future (and the scheduler does not discriminate).

giobart (Contributor, Author) commented Dec 11, 2023

Ideally, I think we should be limited not by address space but by actual resources. If a node has run out of resources, the scheduler will not (or should not) send it new deployments anyway. If, instead, a node is capable of handling new workloads according to the SLA but has run out of addresses, it should request more.

@giobart linked a pull request (#156) Dec 11, 2023 that will close this issue
giobart (Contributor, Author) commented Dec 11, 2023

@smnzlnsk What do you think of the solution in #156?
