add specialization for 24MHz QueryPerformanceFrequency #3832

Merged
merged 9 commits into from
Jul 14, 2023

Conversation

fsb4000

@fsb4000 fsb4000 commented Jun 25, 2023

Fixes #3828

Partially fixes #3834

If I do

#define _LIKELY_ARM [[likely]]
#define _LIKELY_X86 [[unlikely]]
...
if (_Freq == _TenMHz) _LIKELY_X86 {
    static_assert(period::den % _TenMHz == 0, "It should never fail.");
    constexpr long long _Multiplier = period::den / _TenMHz;
    return time_point(duration(_Ctr * _Multiplier));
}
...

then clang-format formats the code this way:

#define _LIKELY_ARM [[likely]]
#define _LIKELY_X86 [[unlikely]]
...

if (_Freq == _TenMHz) {
    _LIKELY_X86 {
        static_assert(period::den % _TenMHz == 0, "It should never fail.");
        constexpr long long _Multiplier = period::den / _TenMHz;
        return time_point(duration(_Ctr * _Multiplier));
    }
}

I'm not sure how well the Microsoft Visual C++ compiler's optimizer handles this, but clang-cl does a great job of inlining and of simplifying the integer divisions into shifts and multiplies.

A repro showing the MSVC compiler simplifying the integer divisions into shifts and multiplies: https://godbolt.org/z/zb5fob6xc
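For readers who want the idea in isolation, here is a minimal sketch of why comparing against a compile-time frequency helps. It is not the code in `stl/inc/__msvc_chrono.hpp`: the names `ticks_to_ns`, `ns_per_sec`, `ten_mhz` and `twenty_four_mhz` are hypothetical, and the whole/part split in the 24 MHz branch is my assumption, modeled on the usual overflow-safe general path. Once the divisor is a constant, the optimizer can replace the division and remainder with multiplies and shifts.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not the identifiers used in the STL.
constexpr std::int64_t ns_per_sec      = 1'000'000'000;
constexpr std::int64_t ten_mhz         = 10'000'000;
constexpr std::int64_t twenty_four_mhz = 24'000'000;

// Convert a performance-counter reading to nanoseconds.
constexpr std::int64_t ticks_to_ns(std::int64_t ctr, std::int64_t freq) {
    if (freq == ten_mhz) {
        // 1e9 is an exact multiple of 10 MHz, so one multiply is enough.
        static_assert(ns_per_sec % ten_mhz == 0, "10 MHz must divide 1e9");
        return ctr * (ns_per_sec / ten_mhz);
    }
    if (freq == twenty_four_mhz) {
        // 1e9 is not a multiple of 24 MHz, so split into whole seconds and
        // a remainder. The divisor is a constant here, which lets the
        // compiler avoid a runtime division instruction.
        const std::int64_t whole = (ctr / twenty_four_mhz) * ns_per_sec;
        const std::int64_t part  = (ctr % twenty_four_mhz) * ns_per_sec / twenty_four_mhz;
        return whole + part;
    }
    // General path: same arithmetic, but with a runtime divisor.
    const std::int64_t whole = (ctr / freq) * ns_per_sec;
    const std::int64_t part  = (ctr % freq) * ns_per_sec / freq;
    return whole + part;
}
```

The split keeps the intermediate product below 24,000,000 * 1e9, so it stays well inside a 64-bit signed integer even for large counter values.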

@fsb4000 fsb4000 requested a review from a team as a code owner June 25, 2023 05:49
@StephanTLavavej StephanTLavavej added the performance Must go faster label Jun 30, 2023
@StephanTLavavej StephanTLavavej self-assigned this Jul 4, 2023
@StephanTLavavej StephanTLavavej removed their assignment Jul 6, 2023
Contributor

Do we need to do the push_macro/undef/pop_macro magic incantation here for likely and unlikely?

Member

likely and unlikely are commonly defined as function-like macros, but not object-like macros. <xkeycheck.h> avoids rejecting them for that reason.

Unlike msvc etc., users are technically not supposed to macroize likely and unlikely, so we technically don't need to defend against them. We could but I don't think it's necessary at the moment.
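For context, a minimal sketch of what the push_macro/undef/pop_macro guard would look like if it were applied to `likely`. Everything here is hypothetical: the object-like `#define likely 1` stands in for misbehaving user code, `sign_hint` is an invented function, and C++20 is assumed for the attribute. A function-like macro such as `likely(x)` would not need this guard, because `[[likely]]` is not followed by `(` and so is never expanded.

```cpp
// Hypothetical user code that defines an object-like macro named `likely`.
#define likely 1

// Save the user's definition, then remove it for the guarded region.
#pragma push_macro("likely")
#undef likely

// Inside the guarded region `likely` is a plain identifier again,
// so the C++20 attribute is spelled as intended.
inline int sign_hint(int v) {
    if (v >= 0) [[likely]] {
        return 1;
    }
    return -1;
}

// Restore the user's definition.
#pragma pop_macro("likely")

// The macro is visible again after the pop.
static_assert(likely == 1, "user macro restored");
```

Without the guard, `[[likely]]` would be rewritten to `[[1]]` by the preprocessor and fail to compile.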

@StephanTLavavej StephanTLavavej self-assigned this Jul 13, 2023
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit ab57910 into microsoft:main Jul 14, 2023
@StephanTLavavej
Member

Thanks for implementing this performance improvement! 🚀 🐇 🐆

@fsb4000 fsb4000 deleted the fix3828 branch July 14, 2023 02:26
Successfully merging this pull request may close these issues.

steady_clock: specialization for 24MHz QueryPerformanceFrequency?