
Worker/IOCP Thread, Timeouts and Azure App Service #2311

Closed
DarthSonic opened this issue Nov 21, 2022 · 15 comments

Comments

@DarthSonic

Hi,

As your documentation and several issue discussions recommend as best practice, one should set the minimum worker and IOCP threads to "200 or 300". I tried setting them to 250 each for Azure App Service (aka Windows Web App). Unfortunately, since there is no way to edit applicationhost.config directly, calling ThreadPool.SetMinThreads(minWorkerThreads, minIocpThreads) at runtime seems to be the only way to do this. But the values configured at runtime are ignored on Azure App Service (it works on every other server).
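
For context, a minimal sketch of the runtime approach I mean (the 250/250 values are only what I tried, not a recommendation; the class shape is illustrative):

```csharp
using System.Threading;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_Start()
    {
        const int minWorkerThreads = 250; // illustrative values, not a recommendation
        const int minIocpThreads = 250;

        // SetMinThreads returns false if the values are rejected
        // (for example, if they exceed the current maximums).
        bool applied = ThreadPool.SetMinThreads(minWorkerThreads, minIocpThreads);
        System.Diagnostics.Trace.TraceInformation("SetMinThreads applied: " + applied);
    }
}
```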

Do you know of any other way to do this configuration for Azure App Service?

@NickCraver
Collaborator

To clarify here: we do not recommend this. A much better path is going async with your code. App Service respects this setting as much as any other platform, but the number of cores you're on will determine how much work you can do. The proper solution here is to go async so you're not holding onto threads while waiting on I/O, you'll have much better performance.
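
For illustration, a minimal sketch of the async pattern being recommended (the endpoint and key are placeholders, not from the reporter's app):

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class RedisAccess
{
    // A single shared multiplexer; ConnectionMultiplexer is designed to be reused.
    private static readonly Lazy<ConnectionMultiplexer> Connection =
        new Lazy<ConnectionMultiplexer>(() =>
            ConnectionMultiplexer.Connect("localhost:6379")); // placeholder endpoint

    public static async Task<string> GetValueAsync(string key)
    {
        IDatabase db = Connection.Value.GetDatabase();

        // Awaiting releases the thread back to the pool while waiting on Redis,
        // instead of blocking a worker thread via .Result / .Wait() (sync-over-async).
        RedisValue value = await db.StringGetAsync(key);
        return value.HasValue ? (string)value : null;
    }
}
```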

@DarthSonic
Author

We do all Redis tasks async. The settings are ignored in App Service, so there is no other way to overcome those timeouts that happen from time to time... :-(

@NickCraver
Collaborator

What makes you think the settings are ignored in app service? I assure you that's not the case, but I'm curious what makes you think they are.

@DarthSonic
Author

DarthSonic commented Nov 21, 2022

I can read the settings at runtime with GetMinThreads: locally and on Windows Server the values I set are returned (250), but on App Service the default values are returned (4).
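
For anyone wanting to reproduce this check, a sketch of the kind of read-back being described (the ThreadPoolDiagnostics helper name is made up):

```csharp
using System.Threading;

public static class ThreadPoolDiagnostics
{
    public static string Describe()
    {
        // Read back what the thread pool actually reports at runtime;
        // the Min values correspond to the 250-vs-4 observation above.
        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIocp);
        ThreadPool.GetAvailableThreads(out int freeWorker, out int freeIocp);

        return $"WORKER: Min={minWorker}, Max={maxWorker}, Free={freeWorker}; " +
               $"IOCP: Min={minIocp}, Max={maxIocp}, Free={freeIocp}";
    }
}
```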

@NickCraver
Collaborator

Where are you calling the set method from?

@DarthSonic
Author

Global.asax Application_Start(). It works everywhere but Azure App Service.

@DarthSonic
Author

Interesting find: the main slot of my Azure App Service instance does not apply the min/max worker thread settings after I set them in Application_Start(). All other slots of the instance do! I don't know why this is the case, but this is what I can see currently.

@NickCraver
Collaborator

@DarthSonic I'd be questioning if you've successfully deployed the version you think you've deployed (e.g. have a route that returns the version info). There's no reason for App Services to not respect these settings...it has to. For context here: I'm both a maintainer of this library and happen to be on the App Services team these days.

@DarthSonic
Author

I am sure I deployed it correctly; I added the config change weeks ago. Now I have found out that it is not respected on the default slot. I have opened an Azure support ticket for that.

@DarthSonic
Author

DarthSonic commented Nov 25, 2022

@NickCraver you can also see that the setting is ignored in the StackExchange.Redis exception message:

[2022-11-25 04:01:40.572][Error]SetAndReleaseItemExclusive => StackExchange.Redis.RedisTimeoutException: Timeout performing EVAL (5000ms), next: EVAL, inst: 1, qu: 0, qs: 1, aw: False, bw: SpinningDown, rs: ReadAsync, ws: Idle, in: 0, in-pipe: 0, out-pipe: 25339, serverEndpoint: clevermatch.redis.cache.windows.net:6379, mc: 1/1/0, mgr: 10 of 10 available, clientName: CM_WebSessions, IOCP: (Busy=1,Free=999,Min=4,Max=1000), WORKER: (Busy=14,Free=32753,Min=4,Max=32767), v: 2.6.70.49541 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)

P.S.: 14 busy workers at night, outside business hours. So many more would be needed during business hours.

@NickCraver
Collaborator

NickCraver commented Nov 25, 2022

@DarthSonic I agree that for some reason you're not seeing what you expect here, but I wouldn't agree it's ignoring the setting. That just doesn't happen; it's fundamental to how the entire runtime works. It's likely that for some reason it's not actually being set (e.g. the deploy didn't do what you think), or something set it back to normal levels. My guess would be the deploy: given you're seeing this on one machine and not all the others, that's a far more likely explanation.

@DarthSonic
Author

I managed to get ThreadPool.SetMinThreads working. I don't know exactly why, but it started working.

Although you do not recommend these settings, the requests to and responses from the Redis server are much faster and we no longer have any timeouts at all.

Async was not the solution for the timeout issues.

@NickCraver
Collaborator

For anyone finding this later: async is the correct solution here. Changing thread pool minimums only moves the threshold of failure while increasing overhead for all operations, meaning you get less performance on the same hardware and still have a ceiling to hit. It moved, it didn't disappear. The approach also doesn't scale appropriately across the various core counts the code may run on. It's pretty much assured you'll still see elevated response times, just not quite reaching the timeout levels that throw an error. The spikes don't go away with this method.

As an example: making operations take 4900ms so they don't hit timeouts isn't solving the core problem, it just stops timeout errors that trigger at 5000ms. Slow operations and bottlenecking as a result of thread usage are still doing the bad things and causing overall slowness.

If you don't care about any of this: carry on. But at least be aware of what is actually happening. Changing the setting is a band-aid that doesn't fix the actual problem, and users should be clear about that.

@richardsjoberg

According to this Microsoft article, it's recommended to change the thread pool settings:

https://learn.microsoft.com/en-us/azure/azure-cache-for-redis/cache-management-faq#important-details-about-threadpool-growth

@gabbsmo

gabbsmo commented Nov 15, 2024

Is it possible that the thread pool starvation experienced by Azure users was fixed in #2664? Should Microsoft's guidance be updated? ping @NickCraver
