-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
List of performance regressions caused by switching to ICU #40942
Comments
System.Memory.ReadOnlySpan.IndexOfString 5 regressions, 5 improvements
Comment: |
Thank you @adamsitnik good idea I'm glad you created this. |
I have this PR opened for addressing all ordinal operations #40910 |
@adamsitnik are you going to close all other bugs complaining about ICU perf against this one? |
That's my plan. I will try to do my best to do it for as many as I can. |
Also, I want to be clear about the expectation here. I don't think we can fix all perf here as we are limited by calling ICU. we can look at how we can improve it but I am not expecting to get the perf to the point where we used to call NLS. So, it will be good to decide which items in this list is a blocker. The only one was the Ordinal cases which I am addressing in the attached PR. I am not aware of any other blocking scenario. We'll look more of course on other scenarios anyway but I am not sure how much we can do before 5.0 release. |
System.Globalization.Tests.StringHash 2 regressions, 4 improvements
Comment: there are two regressions for |
System.Globalization.Tests.StringEquality: 8 regressions, 14 improvements
Comment: ICU seems to be faster when inputs are the same or the first character is different but slows down when we are comparing lowercase and uppercase versions of the same words |
System.Globalization.Tests.StringSearch: 30 regressions, 32 improvements
|
System.Globalization.Tests.Perf_DateTimeCultureInfo.Parse: 1 big regression for
Comment: this is a known ICU issue: #31273 |
@tarekgh I've gone through all |
@Symbai I see you've thumbs-down. Could you share your concerns? |
@danmosemsft Its not about adamsitnik work of collecting the regressions which I find very helpful. Its about the fact that switching to ICU has introduced MAJOR regressions (+4,500ns, ~ 4x slower) while the "improvements" are usually around ~2ns which I bet is only some noise and running them a couple of times will show there are no improvements at all. I'm not into ICU and I dont know why this change was made at all but seeing all the performance improvements in .NET Core the last versions and now such a big regression on hot code path that is literally being used anywhere, makes me really wonder what in god's name can be that much useful about ICU that it worth THIS regression. I've thumbs-down because I disagree on this ICU change and I disagree on statements which say that "we might not fix some of these regressions". Especially without telling people that switching to .NET 5 will make their code much more slower without a benefit (there might be a benefit in ICU, but I bet it won't affect most people... unlike this performance regression). |
@Symbai you have the option to switch back to use NLS if you want to. ICU is the future direction in general for the .NET and Windows too. ICU will give the opportunity to have a consistency between OS's and OS versions. The benefit of using ICU is really worth it. ICU will give opportunity to the apps to customize the globalization behavior too. |
Also note that this is already what is used on Linux and Mac, and increasingly used within Windows OS itself. So we are aligning with the industry here. If and where it is slow - we all benefit from making it faster. The .NET team have contributed bug reports and performance improvements to libicu in the past, and I expect we will do so again. |
As a side note, ICU is open source and we can contribute to make hot paths faster |
also cc @lewing (although I guess his focus is just size) |
For information only. PowerShell 7.1 was released on .Net 5.0 and we already got performance feedback PowerShell/PowerShell#14087 |
@iSazonov did you have chance to investigate more and know where is the regression is coming from? Also, could you please measure the same scenario on .NET 6.0? we have some more optimization there it may help? |
@danmosemsft we care about performance a lot too, we just had disproportionately large (small?) size needs here. @tarekgh great work on all this. PGO of the unmanaged code looks complicated in our case because we'd need a way for our compiler/toolchain to consume the the profile. |
@tarekgh I believe it is an expected regression after moving to ICU. I shared the feedback to demonstrate what users can see in real scenarios.
PowerShell repository is still on .Net 5.0. I guess PowerShell MSFT team will move to .Net 6.0 after .Net 6 Preview1 will be available to them. Then we could compare. If users see such a noticeable regression, I would expect your latest improvements to be ported to .Net 5.0. As for PGO, I think this could be a good step towards reaping benefits quickly with minimal cost. I assume you already have these statistics from MSFT services. But in general it is obvious which functions are the most used. More important is how this performance is achieved. I see that there is a desire to make Runtime more compact. This is important for scenarios like WASM. But in other cases, this is not critical. If hardware intrinsics use double and triple implementations, why not do the same for strings? - a compact but slower implementation for WASM and etc. and a faster but more capacious implementation for other scenarios - it is not problem for PowerShell script execution to have additional 10 buffers by 20 Kb or even by 1 Mb. |
@iSazonov I am still unclear what exactly regressed in the user scenario? did you investigate PowerShell and looked where the regression is happening? what calls to .NET is regressed? |
the referenced issue (PowerShell/PowerShell#14087) mentions:
So it could be
It should take less than a minute to handle 85 MB text file. We might be able to provide some recommendations as well. |
I've looked into the PowerShell trace files and it looks like the regression is not related to ICU, but We should expect a new issue to be created soon by PowerShell contributors (PowerShell/PowerShell#14087 (comment)) |
To tell, @jefgen thankfully is in doing some optimization work from the ICU side unicode-org/icu#1471 You can see the perf numbers in the ticket https://unicode-org.atlassian.net/browse/ICU-21388 |
@jefgen very nice improvements! did Windows Team consider using PGO (Profile Guided Optimization) to improve ICU perf? |
Thanks!
Yes the version of ICU that ships as part of the OS with Windows ( That said, the pre-built Windows binary downloads for the public ICU don't use PGO though. TBH, I'm not super familiar with PGO though, so I wonder if it might be good to sync up to discuss more sometime perhaps?. |
I measured PowerShell startup scenario. Command line is Update: I found another slow ICU initialization in the scenario - we mistakenly called string.StartsWith() with a default StringComparison.CurrentCulture - it calls slow |
To update the current status I have done the latest measurement comparing the 3.1 (the Base) against 6.0 (the diff). Usually I experience some instability in the results of IsPrefix and IsSuffix on my machine so maybe the numbers of these tests are not accurate. Considering 6.0 faster in some scenarios and slower in other scenarios, we may need to look more at the slower scenarios. But in general, I think we are in much better state even we can do more.
|
@adamsitnik @tarekgh Friendly ping to see if there's any progress with resolving these ICU related regressions, since it's now the default library. Seeing this on my end, for example:
Thanks |
@L2 We are not actively looking at this at the current time for other high-priority work. Using options like IgnoreSymbols is expected to have more cost because it is done with the slow path. Could you please talk more about your scenario to try to see if there is any workaround for now? |
Thanks, @tarekgh . It was mainly due to the difference seen for these benchmarks when comparing the previous NLS library to the now default ICU library. I've tried setting DOTNET_SYSTEM_GLOBALIZATION_USENLS and can confirm this gives back the original perf. Currently on my system (win10 x64 with latest updates), the Do you know if there's any plans to roll out an updated Thanks |
I think Windows 10 should have a higher version of ICU than what you have listed. @jefgen could you please comment on @L2 question #40942 (comment)? |
The story is unfortunately a bit complicated... What this means is that Windows 10 is still on ICU version 64.2, while Windows 11 is on ICU version 68.2.
I don't think there are currently any plans to push an updated version of ICU via Windows update at this time.
AFAIK, the OS binaries aren't built using the public compiler. However, I'm not sure off-hand what exact version of the |
Thanks @jefgen for the details. @L2 you still have the option to use the ICU app-local feature to use the latest ICU version if you want. <ItemGroup>
<RuntimeHostConfigurationOption Include="System.Globalization.AppLocalIcu" Value="68.2.0.6" />
<PackageReference Include="Microsoft.ICU.ICU4C.Runtime" Version="68.2.0.6" />
</ItemGroup> |
Before 5.0, we were using ICU only on Unix systems. In 5.0 we have decided to use it on Windows by default as well.
This is something that we have done in order to have the same behavior of string-related globalization APIs on every OS.
However, this particular change has affected the performance characteristics of many frequently used methods. Some of them have regressed, some have improved.
Recently we have reported a lot of 5.0 regressions related to that. Since we have done this on purpose and we are most probably not going to revert the switch, I am opening this issue to track the list of all known regressions. When the list becomes complete, we are most probably going to update the 5.0 release docs and close the issue and label it as
wont fix
.Please feel free to edit the list.
Known changes:
cc @danmosemsft @tarekgh @billwert @DrewScoggins @GrabYourPitchforks @jkotas @safern
The text was updated successfully, but these errors were encountered: