linstor-satellite restart leads to linstor-controller overutilization #391
Comments
Please upgrade to at least 1.26.1 and see if this issue persists. In that version we tried to fix a bug that could lead to such behavior.
Updated to 1.26.2: the same behaviour.
The same thing happens when using a mysqld Galera cluster as the database.
Switching back to H2 seems to resolve the problem.
If this is reproducible and you are willing to test this further, can you put the controller into this state and poke it a few times with Additionally, you could also activate TRACE logging for the controller and then trigger this behavior. Feel free to send the resulting SOS report to the email address in my profile.
Here is my sos-report. I ran kill -3 a few times right after all satellites restarted. Same picture: the controller ate all the CPU.
After updating to 1.27.1 with a MariaDB backend, we can still see this issue. Sometimes, after a restart of a linstor satellite or a crash of a node running a satellite, the linstor controller gets stuck with very high CPU consumption and doesn't respond to any command.
Hi! I'm using Linstor 1.25.1 with etcd on separate nodes as the database. There are around 100 diskless nodes and 10 storage nodes, with around 1.5K resources in total.
Every time I restart any satellite, the linstor controller goes mad, eating all available CPU across its threads. Stracing the controller shows tons of futex calls across the spawned threads:
[pid 1910062] futex(0x7f82495fd77c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910061] futex(0x7f82495fa0c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910060] futex(0x7f82495f8678, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910059] futex(0x7f82495f6a68, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910058] futex(0x7f82495f4c98, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910057] futex(0x7f82495f2ed8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910056] futex(0x7f82495f12c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910055] futex(0x7f82495ef518, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910054] futex(0x7f82495ed908, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910053] futex(0x7f82495ebcf8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910052] futex(0x7f82495ea0e8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
Attached is the htop output from the controller server during a linstor-satellite restart.
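For anyone trying to reproduce this, a long strace capture like the one above is easier to compare between runs once it is summarized per thread. The following is only a minimal sketch, assuming the capture was taken with strace -f so that each line carries a [pid N] prefix as in the excerpt. The function name summarize_futex_calls is hypothetical and not part of LINSTOR or strace.

```python
import re
from collections import Counter

# Matches strace -f lines such as:
# [pid 1910062] futex(0x7f82495fd77c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
FUTEX_RE = re.compile(r"\[pid\s+(\d+)\]\s+futex\((0x[0-9a-f]+),\s*(\w+)")


def summarize_futex_calls(strace_text):
    """Count futex calls per thread id and per futex operation.

    Hypothetical helper for illustration; it only parses text and
    does not talk to the controller or to strace itself.
    """
    per_thread = Counter()
    per_op = Counter()
    for match in FUTEX_RE.finditer(strace_text):
        pid, _addr, op = match.groups()
        per_thread[int(pid)] += 1
        per_op[op] += 1
    return per_thread, per_op


# Three lines copied from the excerpt above, used as sample input.
sample = """\
[pid 1910062] futex(0x7f82495fd77c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910061] futex(0x7f82495fa0c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 1910060] futex(0x7f82495f8678, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
"""

per_thread, per_op = summarize_futex_calls(sample)
print(len(per_thread), "threads;", dict(per_op))
# prints: 3 threads; {'FUTEX_WAKE_PRIVATE': 3}
```

On a full capture, per_thread.most_common(10) would list the busiest threads. This shows which threads issue the most futex calls, not why they do so; the thread dumps from kill -3 are still needed for that.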