Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize calling GetTotalWorkerThreadPoolCompletionCount #1079

Closed
billwert opened this issue Dec 20, 2019 · 4 comments · Fixed by #1103
Closed

Optimize calling GetTotalWorkerThreadPoolCompletionCount #1079

billwert opened this issue Dec 20, 2019 · 4 comments · Fixed by #1103
Labels
area-System.Threading untriaged New issue has not been triaged by the area owner

Comments

@billwert
Copy link
Member

@ninedan reported that calling GetTotalThreadPoolCompletionCount is costing ~0.62% of all CPU usage in a large server scenario. It looks like we can easily move the following lines inside the check to for the thread adjustment interval, only paying this cost when we might use the answer.

@kouvel what do you think?

DWORD currentTicks = GetTickCount();
LONG totalNumCompletions = (LONG)Thread::GetTotalWorkerThreadPoolCompletionCount();
LONG numCompletions = totalNumCompletions - VolatileLoad(&PriorCompletedWorkRequests);

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Threading untriaged New issue has not been triaged by the area owner labels Dec 20, 2019
@kouvel
Copy link
Member

kouvel commented Dec 20, 2019

Yes these could be moved, but the method called here (GetTotalWorkerThreadPoolCompletionCount) is different from the one you mentioned (GetTotalThreadPoolCompletionCount), which one is showing up in the profile? The latter function is called as part of ThreadPool.CompletedWorkItemCount, if that's the one showing up in the profile then it could be polled less frequently.

@billwert
Copy link
Member Author

Ah, interesting. The .NET Framework code is slightly different and I didn't pay close attention when I was linking the source from this repo. Thanks for calling it out. It does look like it has changed here: dotnet/coreclr/pull/24113.

As this isn't a .NET Core scenario in the customer case I don't have a great barometer for how the new implementation will fare. Looking through the old and new code I suspect the real cost was ::FlushProcessWriteBuffers, which we don't seem to do now. The new code path does still do a fair bit of work which we should avoid if we aren't going to use the answer. I'll send a PR.

@billwert billwert changed the title Optimize calling GetTotalThreadPoolCompletionCount Optimize calling GetTotalWorkerThreadPoolCompletionCount Dec 21, 2019
@billwert
Copy link
Member Author

I got updated data which shows the actual cost under GetTotalThreadPoolCompletionCount was ThreadStore::GetAllThreadList, which is still in the code path here. (Presumably this is a scenario with very many threads.)

@kouvel
Copy link
Member

kouvel commented Dec 21, 2019

I see, makes sense, thanks

billwert added a commit to billwert/runtime that referenced this issue Dec 21, 2019
@ninedan reported this function as expensive in one of the server scenarios
he monitors. The actual time is spent in `ThreadStore::GetAllThreadList`.
Moving the call to `GetTotalWorkerThreadPoolCompletionCount` should reduce this
cost.

Fixes dotnet#1079
jkotas pushed a commit that referenced this issue Jan 6, 2020
@ninedan reported this function as expensive in one of the server scenarios
he monitors. The actual time is spent in `ThreadStore::GetAllThreadList`.
Moving the call to `GetTotalWorkerThreadPoolCompletionCount` should reduce this
cost.

Fixes #1079
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Threading untriaged New issue has not been triaged by the area owner
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants