perf: replace `std::pow(x, 0.25)` with std::sqrt(std::sqrt(x)) #1150

paulgessinger · 2022-02-08T17:35:56Z

powf64 is relatively slow. This change improves the performance of ActsBenchmarkEigenStepper by about 10%. The impact on the more real-world propagation with navigation (as in the propagation example with the generic detector) seems negligible.

Overall I think it's probably still worth adding.
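
For readers who want to see the substitution in isolation, here is a minimal sketch. The function names `fourthRootPow` and `fourthRootSqrt` are hypothetical and are not taken from the ACTS code; the sketch only illustrates the identity x^0.25 = sqrt(sqrt(x)) for non-negative x.

```cpp
#include <cassert>
#include <cmath>

// Fourth root via the general-purpose power function.
// (Hypothetical helper name, for illustration only.)
inline double fourthRootPow(double x) {
  return std::pow(x, 0.25);
}

// Fourth root via two nested square roots.
// Mathematically identical for x >= 0, but the result may differ from
// std::pow in the last bits because the rounding steps are different.
inline double fourthRootSqrt(double x) {
  return std::sqrt(std::sqrt(x));
}
```

Both helpers agree to within floating-point rounding, e.g. `fourthRootSqrt(16.0)` yields exactly `2.0`, since both intermediate square roots are exactly representable.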

codecov · 2022-02-08T18:12:40Z

Codecov Report

Merging #1150 (c716eee) into main (f9dbc02) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1150   +/-   ##
=======================================
  Coverage   47.89%   47.89%           
=======================================
  Files         359      359           
  Lines       18502    18504    +2     
  Branches     8730     8730           
=======================================
+ Hits         8861     8863    +2     
  Misses       3605     3605           
  Partials     6036     6036

Impacted Files	Coverage Δ
Core/include/Acts/Propagator/EigenStepper.ipp	`50.00% <100.00%> (+0.75%)`	⬆️


HadrienG2

It's a bit puzzling to me that we have two almost but not quite identical versions of the step size adaptation formula. But I'm happy to see this optimization land.

paulgessinger · 2022-02-09T07:46:00Z

Copy paste error?

Also, I'm a bit surprised that the root file output changes. I'd have thought that a slight numerical change in the step size scaling would have a negligible effect on the output...

asalzburger · 2022-02-09T07:51:48Z

> It's a bit puzzling to me that we have two almost but not quite identical versions of the step size adaptation formula. But I'm happy to see this optimization land.

This doesn't surprise me. I think this is really a numerical difference arising from a slight change in step size.
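
To illustrate why a tiny numerical change can alter hashed output, here is a small sketch. It assumes the output hashes are computed over the stored bytes, which the thread does not state explicitly; the helper names `bitwiseEqual` and `relativeDifference` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>

// True only if the two doubles have identical byte representations.
// A byte-wise hash of stored values would change whenever this is false.
inline bool bitwiseEqual(double a, double b) {
  return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Relative difference between two values, with b assumed non-zero.
inline double relativeDifference(double a, double b) {
  return std::abs(a - b) / std::abs(b);
}
```

Two values that differ by a single unit in the last place, such as `1.0` and `std::nextafter(1.0, 2.0)`, have a relative difference of about 2e-16 and yet are not bitwise equal. A difference of that size in the step size scaling can also shift where later steps land, so the stored values need not match bit for bit.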

benjaminhuth · 2022-02-09T15:09:55Z

Out of interest, I've played around a bit with different versions of this, and I found that

std::sqrt(std::sqrt(static_cast<float>(x)));

is about twice as fast as the double version. As I understand it, we do not need high precision here, so this could speed things up even more?

(I measured these values by simply computing many sqrt(sqrt(...)) results in a row, without vectorization and with -O2.)
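
A measurement along these lines could be sketched as follows. This is not the benchmark actually used above; the function name `secondsForNestedSqrt` and the loop structure are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>

// Times `iterations` evaluations of sqrt(sqrt(x)) in precision T and
// returns the elapsed wall-clock time in seconds.
// (Hypothetical helper, not part of ACTS.)
template <typename T>
double secondsForNestedSqrt(std::size_t iterations) {
  // Reading from a volatile variable keeps the compiler from folding the
  // computation into a constant or hoisting it out of the loop.
  volatile T input = static_cast<T>(1.2345);
  T sum = 0;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    const T x = input;
    sum += std::sqrt(std::sqrt(x));
  }
  const auto stop = std::chrono::steady_clock::now();
  // Writing the result to a volatile sink keeps the loop from being removed.
  volatile T sink = sum;
  (void)sink;
  return std::chrono::duration<double>(stop - start).count();
}
```

Comparing `secondsForNestedSqrt<float>(n)` with `secondsForNestedSqrt<double>(n)` for a large `n` gives a rough ratio. The result depends on the CPU, compiler, and flags, so it should be read as indicative only.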

HadrienG2 · 2022-02-09T15:29:13Z

@benjaminhuth This throughput difference is indeed expected: https://www.agner.org/optimize/instruction_tables.pdf states that on modern CPUs the underlying SQRTSS and SQRTSD hardware instructions exhibit a 2x throughput difference in the worst-case scenario. I also agree that some computations here could likely be safely moved to single precision.

paulgessinger · 2022-02-10T15:49:51Z

I'd probably say using this static cast to float is a good idea. I don't think we care about numerical precision in these cases.

paulgessinger · 2022-02-10T15:57:41Z

Switched to sqrt(sqrt(static_cast<float>(x))). The CI should still fail because of the output hashes; I'll fix those from the CI values.

stephenswat · 2022-02-13T23:41:51Z

Depending on the frequency at which the clamping condition between 0.25 and 4.0 occurs, it might also be beneficial to rewrite the clamp explicitly:

float r = state.options.tolerance / std::abs(2. * error_estimate);

if (r <= 0.00390625f) { // 0.25^4
    stepSizeScaling = 0.25;
} else if (r >= 256.f) { // 4.0^4
    stepSizeScaling = 4.0;
} else {
    stepSizeScaling = std::sqrt(std::sqrt(r));
}

Hard to say if you'll benefit without knowing how often this fires, though. 🤷‍♂️
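
As a self-contained sketch of this idea, the two variants can be written side by side. It assumes the scaling is clamped to [0.25, 4.0] as described above; the function names are hypothetical, and `std::clamp` requires C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Reference variant: always take the fourth root, then clamp to [0.25, 4].
inline float scalingClampAfterRoot(float r) {
  return std::clamp(std::sqrt(std::sqrt(r)), 0.25f, 4.0f);
}

// Explicit variant: compare r against the fourth powers of the bounds
// (0.25^4 = 0.00390625 and 4^4 = 256) and only compute the roots when
// the result would fall strictly inside the clamping interval.
inline float scalingClampBeforeRoot(float r) {
  if (r <= 0.00390625f) {
    return 0.25f;
  }
  if (r >= 256.0f) {
    return 4.0f;
  }
  return std::sqrt(std::sqrt(r));
}
```

Both variants return the same value for the same input; the explicit one merely skips the two square roots when the input is already outside the range. Whether that pays off depends on how often the bounds are hit and on how well the branches predict.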

paulgessinger · 2022-02-14T13:58:33Z

@stephenswat that could be an improvement, but might indeed be overkill.

asalzburger · 2022-02-18T07:49:25Z

This has conflicts now because of the updated file I suppose ...

paulgessinger · 2022-02-18T10:47:01Z

Updated the hashes again. Let's see.

paulgessinger · 2022-02-21T08:34:20Z

This is green now. Do we merge (/ can you approve)? @benjaminhuth @HadrienG2 @stephenswat ?

replace std::pow (powf64) with std::sqrt(std::sqrt)

4d9bb16

paulgessinger added this to the next milestone Feb 8, 2022

HadrienG2 approved these changes Feb 9, 2022

Merge branch 'main' into perf/stepsize-scaling

41e3410

switch to floats

2140735

update hashes

8423d75

paulgessinger and others added 2 commits February 14, 2022 11:07

update another hash

732755a

something is wonky with the vertexing test output

2e63569

paulgessinger and others added 2 commits February 18, 2022 09:43

Merge branch 'main' into perf/stepsize-scaling

33464f5

Update root_file_hashes.txt

610dc46

stephenswat approved these changes Feb 21, 2022

paulgessinger added automerge Improvement Changes to an existing feature labels Feb 21, 2022

Merge branch 'main' into perf/stepsize-scaling

c716eee

kodiakhq bot merged commit de6af39 into acts-project:main Feb 21, 2022

paulgessinger deleted the perf/stepsize-scaling branch February 22, 2022 08:13

paulgessinger modified the milestones: next, v17.1.0 Mar 1, 2022

paulgessinger added a commit to paulgessinger/acts that referenced this pull request Mar 10, 2022

Revert "perf: replace std::pow(x, 0.25) with std::sqrt(std::sqrt(x)) (acts-project#1150)"

This reverts commit de6af39.

d41c0b3

AJPfleger mentioned this pull request Apr 22, 2024

refactor: Accumulated EigenStepper brush-over #3130

Merged
