
Fairly distribute disk I/O #9406

Closed
kylos101 opened this issue Apr 19, 2022 · 6 comments
Labels
team: workspace Issue belongs to the Workspace team

Comments

@kylos101
Contributor

kylos101 commented Apr 19, 2022

Bug description

We tried to limit disk I/O here, in ws-daemon, but it is not currently working.

Steps to reproduce

Run a workload that demands fast reads or writes; you'll see speeds in excess of 300MiB/s. This can starve other workspaces (on the same node) of disk I/O.

Workspace affected

n/a

Expected behavior

Workloads should have their disk I/O limited, to some extent, so that the node and other workspaces are not starved.
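For context on what such a limit looks like mechanically: on cgroup v1, per-device throughput caps are applied by writing a "MAJOR:MINOR BYTES_PER_SEC" rule into the blkio controller's throttle files. The sketch below is a hypothetical illustration (not Gitpod's actual ws-daemon code); the helper names and the use of a temp directory instead of the real /sys/fs/cgroup path are assumptions made so the example runs anywhere.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// throttleEntry formats a cgroup v1 blkio throttle rule: "MAJOR:MINOR BYTES_PER_SEC".
// Hypothetical helper for illustration only.
func throttleEntry(major, minor uint32, bytesPerSec uint64) string {
	return fmt.Sprintf("%d:%d %d", major, minor, bytesPerSec)
}

// applyWriteLimit writes the rule into the cgroup's
// blkio.throttle.write_bps_device file.
func applyWriteLimit(cgroupDir string, major, minor uint32, bytesPerSec uint64) error {
	rule := throttleEntry(major, minor, bytesPerSec)
	path := filepath.Join(cgroupDir, "blkio.throttle.write_bps_device")
	return os.WriteFile(path, []byte(rule), 0644)
}

func main() {
	// Demonstrate against a temp dir; on a real node cgroupDir would be the
	// workspace container's cgroup under /sys/fs/cgroup/blkio/... and the
	// write would require root.
	dir, err := os.MkdirTemp("", "blkio")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Limit writes on device 8:0 (typically /dev/sda) to 100 MiB/s.
	if err := applyWriteLimit(dir, 8, 0, 100*1024*1024); err != nil {
		panic(err)
	}
	data, _ := os.ReadFile(filepath.Join(dir, "blkio.throttle.write_bps_device"))
	fmt.Println(string(data))
}
```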

Example repository

No response

Anything else?

We are experimenting to see how nicely this works.

Also, getting the above to work may help explain why our current I/O limiter does not work.

Ultimately, though, we need a solution, and it must fit well with workspace classes.

CC: @aledbf @Furisto @atduarte

@kylos101 kylos101 added the team: workspace Issue belongs to the Workspace team label Apr 19, 2022
@kylos101 kylos101 moved this to Scheduled in 🌌 Workspace Team Apr 19, 2022
@aledbf aledbf self-assigned this Apr 19, 2022
@kylos101 kylos101 changed the title Disk IO is not limited We need reliable disk I/O Apr 19, 2022
@kylos101 kylos101 changed the title We need reliable disk I/O We fair distribution of disk I/O Apr 19, 2022
@kylos101 kylos101 changed the title We fair distribution of disk I/O Fairly distribute disk I/O Apr 19, 2022
@kylos101 kylos101 moved this from Scheduled to In Progress in 🌌 Workspace Team Apr 19, 2022
@kylos101
Contributor Author

Initial solve for this was done in #9440, we are currently monitoring.

@kylos101
Contributor Author

@aledbf @Furisto is there anything outstanding for cgroup v1 or v2 that you can think of as it pertains to IO limiting?

Would it make sense to document cgroup support here for self-hosted?

cc: @corneliusludmann @mrsimonemms

@Furisto
Member

Furisto commented Apr 25, 2022

@kylos101 IO limiting with cgroup v2 does not really work yet. The best we can say is that if you want IO limiting for self-hosted, you need a system with cgroup v1.
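For comparison with the v1 approach: on cgroup v2 the unified io controller takes its limits through a single io.max file, with key=value pairs per device line. The sketch below only formats such a rule; the helper name is hypothetical, and on a real v2 node the resulting line would be written into the workspace cgroup's io.max.

```go
package main

import "fmt"

// ioMaxRule formats a cgroup v2 io.max entry, e.g.
// "8:0 rbps=104857600 wbps=104857600". The rbps/wbps keys (read/write
// bytes per second) come from the kernel's unified io controller;
// riops/wiops exist as well but are omitted here.
func ioMaxRule(major, minor uint32, rbps, wbps uint64) string {
	return fmt.Sprintf("%d:%d rbps=%d wbps=%d", major, minor, rbps, wbps)
}

func main() {
	// Cap device 8:0 at 300 MiB/s reads and 100 MiB/s writes.
	fmt.Println(ioMaxRule(8, 0, 300*1024*1024, 100*1024*1024))
}
```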

@kylos101
Contributor Author

Thanks for the heads up, @Furisto ! I've removed this from scheduled work for now, as we've resolved the saas issue with #9440.

I think we need to consider the business value of cgroup v2. Given that IO limiting and CPU limiting both do not work as expected on cgroup v2, I wonder if we can do a skateboard of workspace classes on cgroup v1 and fix cgroup v2 later, in a follow-on epic. Wdyt?

CC: @csweichel @atduarte for awareness

@atduarte
Contributor

@Furisto @kylos101 what is the current situation in SaaS with cgroup v2? I was under the impression that we had IO limiting in place.

@aledbf
Member

aledbf commented Jul 28, 2022

Yes, that's right. It's working.

@aledbf aledbf closed this as completed Jul 28, 2022
@aledbf aledbf moved this to Awaiting Deployment in 🌌 Workspace Team Jul 28, 2022
@atduarte atduarte moved this from Awaiting Deployment to Done in 🌌 Workspace Team Jul 28, 2022