Run cpu instruction calibration on a variety of hardware #1020
I talked to @anupsdf about this and we concluded two points:
So … I'm going to take this and just run calibration on the x86-64 machine I have here. It doesn't matter what its clock frequency is; we're only talking about instruction counts of the cost centers.
Some investigation and results here (I meant to discuss this with @jayz22, but I'll make a note here for future reference too):
Posting my calibration results on m1 and x86 (`-` m1, `+` x86, full outputs attached below):

```diff
- cost_type cpu_model_const_param cpu_model_lin_param mem_model_const_param mem_model_lin_param
- HostMemAlloc 1123 1 16 128
- HostMemCpy 32 24 0 0
- HostMemCmp 24 64 0 0
- DispatchHostFunction 262 0 0 0
- VisitObject 158 0 0 0
- ValSer 646 66 18 384
- ValDeser 1127 34 16 128
- ComputeSha256Hash 2877 4125 40 0
- ComputeEd25519PubKey 25640 0 0 0
- MapEntry 84 0 0 0
- VecEntry 35 0 0 0
- VerifyEd25519Sig 400983 2685 0 0
- VmMemRead 182 24 0 0
- VmMemWrite 178 25 0 0
- VmInstantiation 916377 68226 129471 5080
- InvokeVmFunction 1128 0 14 0
- ComputeKeccak256Hash 2882 3561 40 0
- ComputeEcdsaSecp256k1Key 37899 0 0 0
- ComputeEcdsaSecp256k1Sig 224 0 0 0
- RecoverEcdsaSecp256k1Key 1667731 0 201 0
- Int256AddSub 1714 0 119 0
- Int256Mul 2226 0 119 0
- Int256Div 2332 0 119 0
- Int256Pow 5223 0 119 0
- Int256Shift 415 0 119 0
- ChaCha20DrawBytes 4857 2461 0 0
+ cost_type cpu_model_const_param cpu_model_lin_param mem_model_const_param mem_model_lin_param
+ HostMemAlloc 310 0 16 128
+ HostMemCpy 52 0 0 0
+ HostMemCmp 55 36 0 0
+ DispatchHostFunction 239 0 0 0
+ VisitObject 34 0 0 0
+ ValSer 564 0 18 384
+ ValDeser 1104 0 16 128
+ ComputeSha256Hash 3943 6812 40 0
+ ComputeEd25519PubKey 40356 0 0 0
+ MapEntry 55 0 0 0
+ VecEntry 0 0 0 0
+ VerifyEd25519Sig 654651 4288 0 0
+ VmMemRead 210 0 0 0
+ VmMemWrite 209 0 0 0
+ VmInstantiation 459816 49469 129471 5080
+ InvokeVmFunction 1189 0 14 0
+ ComputeKeccak256Hash 4076 5962 40 0
+ ComputeEcdsaSecp256k1Key 58314 0 0 0
+ ComputeEcdsaSecp256k1Sig 249 0 0 0
+ RecoverEcdsaSecp256k1Key 2323402 0 181 0
+ Int256AddSub 1620 0 99 0
+ Int256Mul 2209 0 99 0
+ Int256Div 2150 0 99 0
+ Int256Pow 3925 0 99 0
+ Int256Shift 379 0 99 0
+ ChaCha20DrawBytes 2155 1051 0 0
```

The main differences are, as @graydon pointed out, that the memory-related operations appear to have constant costs (with a larger const factor) on x86. I believe this is what you are talking about? I think the analytical approach makes sense. I've noticed some of those memory-related calibration results are pretty sensitive to the size of the sample (e.g. …).

Re: cost type consolidation, I think it makes sense to consolidate some of those types, especially the {host, vm} mem-cmp/cpy/read/write ones. I will look into it further.

(A bit of extra information: my x86 CPU is an Intel 2012Q2 model, with AVX (not AVX2) extensions.)
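The `cpu_model_const_param`/`cpu_model_lin_param` pairs above come from fitting a linear model `measured_insns ≈ const + lin * input_size` to measurement samples. As a rough illustration of why small samples make the fit noisy, here is a minimal ordinary-least-squares sketch; `fit_linear` is a hypothetical helper, not the repo's actual calibration code:

```rust
// Ordinary least squares fit of a linear cost model
//     measured_insns ≈ const_param + lin_param * input_size
// over (input_size, measured_instructions) samples. Hypothetical sketch,
// not the actual calibration code in this repo.
fn fit_linear(samples: &[(f64, f64)]) -> (f64, f64) {
    let n = samples.len() as f64;
    let sx: f64 = samples.iter().map(|(x, _)| x).sum();
    let sy: f64 = samples.iter().map(|(_, y)| y).sum();
    let sxx: f64 = samples.iter().map(|(x, _)| x * x).sum();
    let sxy: f64 = samples.iter().map(|(x, y)| x * y).sum();
    // Slope (lin param) and intercept (const param) of the best-fit line.
    let lin = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    let konst = (sy - lin * sx) / n;
    (konst, lin)
}
```

The slope term divides by the sample variance of the input sizes, so with few samples (or a narrow range of input sizes) the lin estimate swings widely, which is consistent with the sample-size sensitivity noted above.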
Re: cost type consolidation and using analytical model
These are very crude analyses and a bit of a stretch for my low-level knowledge. @graydon, let me know what you think.
Re: cost type consolidation
Just had a conversation with @MonsieurNicolas. He expressed concerns about the calibration numbers not being accurate and reproducible due to advanced instruction sets (e.g. AVX, AVX2).
I will give it a try.
hmm. AVX2 is 10 years old; there's nothing in the field that doesn't speak AVX2. I am not sure this is really related to the constant-factor-ness of our measurements on those machines -- if we really want to correct that, I think we should figure out why it's happening rather than just fiddling with codegen options (which none of our users will fiddle with anyway).
A bunch of exploration of minor issues discovered in budget calibration, ostensibly about #1020 but also involving some tracy markup, some internal cleanup, machinery to allow excluding the overwhelming VM instantiation cost center, and some attempts at setting budget costs more from first principles.
### What

- Update to the latest XDR, consolidating a few memory-related `ContractCostType`s -- resolves #1020
- Resolve #1087
- Move some no-longer-used cost type calibrations (VecEntry, MapEntry) to an `experimental` directory; they are currently not used anywhere but would be useful for experimental purposes. Will do a follow-up to make them usable.
- Fixed a bug in the `memory_grow` function, where we were checking the limit against the wrong input
- Add test helpers to make wasm memory alloc accessible
- Various test fixes and clarifications

### Why

[TODO: Why this change is being made. Include any context required to understand the why.]

### Known limitations

Follow-ups:
- Refactor the analytical models
- Make the `experimental` directory usable
- Recalibrate numbers on an x86 machine and update the parameters
- Write a more complex test than the current `complex`, and use that for metering benchmarks

---------

Co-authored-by: Graydon Hoare <[email protected]>
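The consolidation of the `{host, vm}` mem-cpy/cmp/read/write cost types amounts to charging all byte-wise memory traffic through one linear-in-bytes cost center. The struct below is a hypothetical sketch of that idea, not the actual XDR `ContractCostType` definition; the parameter values in the comment are the x86 `HostMemCpy` row from the calibration output above:

```rust
// Hypothetical sketch of a single consolidated memory-copy cost center,
// charged as const overhead per call plus a per-byte linear term.
// Not the actual XDR ContractCostType definition.
struct MemCpyModel {
    const_param: u64, // fixed instruction overhead per call
    lin_param: u64,   // instructions per byte copied
}

impl MemCpyModel {
    /// Cost charged for copying `bytes` bytes (host or VM memory alike).
    fn charge(&self, bytes: u64) -> u64 {
        self.const_param
            .saturating_add(self.lin_param.saturating_mul(bytes))
    }
}
```

With the x86-measured parameters (const 52, lin 0), a copy of any size charges the same 52 instructions, illustrating the constant-cost behavior discussed earlier in the thread.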
What
Calibrate the CPU instruction costs on the variety of hardware that validators run on.
Why
The metering model is deterministic across all nodes. The model is currently calibrated on a single machine (an M1), which may differ from the actual hardware that validators use. This can make the actual compute time vary for the same number of CPU instructions, which could affect ledger close time. The network resource limits need to be set conservatively w.r.t. the worst case. We need to calibrate on various hardware architectures in order to figure out the correct bounds.
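One simple way to set parameters "conservatively w.r.t. the worst case" is to take, for each cost type, the maximum across the per-machine calibrations. The helper and data layout below are hypothetical, but the numbers are the `VerifyEd25519Sig` cpu const params measured on m1 and x86 in this thread:

```rust
use std::collections::HashMap;

// Sketch: derive a conservative parameter set by taking, per cost type,
// the worst case (maximum) across per-machine calibration results.
// Hypothetical helper; not the repo's actual parameter-setting code.
fn worst_case(machines: &[HashMap<&'static str, u64>]) -> HashMap<&'static str, u64> {
    let mut out: HashMap<&'static str, u64> = HashMap::new();
    for m in machines {
        for (&cost_type, &v) in m {
            let e = out.entry(cost_type).or_insert(0);
            if v > *e {
                *e = v;
            }
        }
    }
    out
}
```

For `VerifyEd25519Sig` (m1: 400983, x86: 654651) this picks the x86 value, so the budget never undercharges on the slower architecture.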