Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intrinsic for the rdpmc instruction #10512

Open
mjsabby opened this issue Jun 13, 2018 · 10 comments
Open

Intrinsic for the rdpmc instruction #10512

mjsabby opened this issue Jun 13, 2018 · 10 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@mjsabby
Copy link
Contributor

mjsabby commented Jun 13, 2018

rdpmc is used to read hardware counter data. It is supported by most modern cpus from Intel and AMD.

MSVC++ has a __readpmc intrinsic and other compilers where such an intrinsic doesn't exist use inline assembly.

The signature of such an intrinsic looks like unsigned __int64 __readpmc(uint counterIndex);

where the counterIndex is the configured to the counter of interest.

To use this functionality from any privilege level (i.e. outside ring0), a kernel driver (or loaded module) needs to set the PCE bit on the CR4 register of each core where this functionality is desired. Once that is accomplished the MSRs for counters need to be configured to count the events of interest.

Obviously the preceding paragraph is outside the scope of CoreCLR and must be enabled or managed somehow by the user.

The proposal is to teach the rdpmc intrinsic to the JIT so that it can be accessed in an efficient way. To access this functionality today the cheapest way would be to have a native function that wraps the intrinsic and pinvoke into that, which for certain measurements perturbs what is being measured. But more annoyingly a new native dependency is required to achieve this.

This feature would be very valuable in getting crisp performance data from managed code directly.

category:design
theme:hardware-intrinsics
skill-level:intermediate
cost:large

@RussKeldorph
Copy link
Contributor

@mjsabby Can you close this and make a proposal via the API Review Process in the dotnet/corefx repo?

@RussKeldorph
Copy link
Contributor

@mjsabby
Copy link
Contributor Author

mjsabby commented Jun 13, 2018

@RussKeldorph the API needs to be added, so yes I can do that also, but this will require changes inside the runtime, or are you saying there is already a plan there?

@RussKeldorph
Copy link
Contributor

RussKeldorph commented Jun 13, 2018

@mjsabby The platform-instrinsics document I linked describes the process for adding this type of intrinsic. Ideally, you start with the API review process in corefx. If the proposal survives that process, then appropriate issue(s) are opened in other repos--most likely this one as you note. It's not the end of the world to leave this open until the API's fate is decided...just perhaps a little presumptive.

@tannergooding
Copy link
Member

@mjsabby, the generall process I've followed for cross-cutting changes is:

  1. File an API request on CoreFX, noting that it will require runtime changes and the API will need to live in S.P.Corelib
  2. Wait for approval
  3. Add the API to S.P.Corelib, referencing the CoreFX issue
  4. Wait for the change to be picked up in CoreFX
  5. Update the reference assembly in CoreFX
  6. Wait for the change to be picked up by CoreCLR
  7. Implement the CoreCLR side changes, with tests
  8. Close the issue

@mjsabby
Copy link
Contributor Author

mjsabby commented Jun 14, 2018

Thanks @RussKeldorph. I suppose I am being optimistic :) I'll follow what @tannergooding has outlined.

One discussion that I wanted to have was how we're going to test this. My thoughts were that the basic CI test would generate this instruction and get a General Protection Fault and say yup at least the instruction is being exercised and the platform not supported case. The harder part will be to test the success case because the default OS configurations don't allow accessing this instruction in user-mode and as far as I know we run these CI tests on VMs which don't have this emulated.

@tannergooding
Copy link
Member

@mjsabby, it would also be good to recommend the behavior of the instruction, with regards to serialization. Since it isn't guaranteed to be monotonic if a serialization instruction isn't inserted before/after.

That is, will we expect the user to insert serialization instructions themselves, or will it be part of the implementation...

@tannergooding
Copy link
Member

Also, determining how these would work on Linux/Mac would be good as well 😄

@morganbr
Copy link
Contributor

It would also be good to see a proposal for usage so we can understand what from rdpmc is useful. Perhaps a higher-level, architecture independent API is appropriate.

@mjsabby
Copy link
Contributor Author

mjsabby commented Jun 18, 2019

@tannergooding I think we'd have both a serialization and non-serialization variant.

@morganbr I think that could be a separate issue. I'd hate to lose RDPMC's full capability to make it architecture independent.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet
Development

No branches or pull requests

5 participants