freeze sometimes performs better than unsafeFreeze #409
Comments
I can't exactly reproduce this problem on my machine; much will depend on the llvm and ghc versions (please report yours). When compiled with llvm I see that
However, when compiled natively the result is even better than with llvm, and the unsafe version has almost double the performance:
I have added version information to my original post. I tried the native code generator and I get the following:
And with LLVM:
So the native code generator's I am now even more puzzled. What could be causing such behavior and what can I do about it? I need LLVM as it performs much better than the native code generator on other parts of the software.
The difference between You may want to check alignment of the loop. One thing that can help here is marking
I'd suggest reporting this issue on the ghc issue tracker, however. There is nothing that can be done in vector about this issue. FYI, just for my own sanity, I checked performance with
@julianbrunner Can you try this implementation instead? It fixed the issue for me:
go :: Int -> Vector Double
go n = runST $ do
  pv <- unsafeNew n
  let w 0 !i = return ()
      w n !i = unsafeWrite pv i 42.0 >> w (n - 1) (i + 1)
  w n 0
  unsafeFreeze pv
I can reproduce this with GHC 8.10 and LLVM 9. Results are summarized in the table below.
Performance with a strict accumulator is essentially the same for both NCG and LLVM. With a lazy accumulator, the LLVM backend generates something bad. I think this is a bug in the LLVM code generator, and the reason for the poor performance can only be found by inspecting the assembly.
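To make the strict versus lazy distinction concrete, here is a minimal sketch of the two loop shapes side by side. The names fillLazy and fillStrict are hypothetical, and the lazy variant is only an assumption about what such a loop looks like, not the code that was benchmarked in this thread. Both functions return the same vector; they differ only in whether the index argument carries a bang pattern.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as MV

-- Hypothetical lazy variant: the base case never inspects the index,
-- so the loop function is not syntactically strict in it.
fillLazy :: Int -> V.Vector Double
fillLazy n = runST $ do
  pv <- MV.unsafeNew n
  let w 0 _ = return ()
      w k i = MV.unsafeWrite pv i 42.0 >> w (k - 1) (i + 1)
  w n (0 :: Int)
  V.unsafeFreeze pv

-- Strict variant: the bang patterns force the index on every call,
-- mirroring the fix suggested earlier in the thread.
fillStrict :: Int -> V.Vector Double
fillStrict n = runST $ do
  pv <- MV.unsafeNew n
  let w 0 !_ = return ()
      w k !i = MV.unsafeWrite pv i 42.0 >> w (k - 1) (i + 1)
  w n (0 :: Int)
  V.unsafeFreeze pv
```

Whether the two compile to different machine code depends on the GHC version, the optimization flags, and the backend, so timing both under NCG and under -fllvm is the only way to know for a given setup.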
Yes, this fixes the issue. I am still quite puzzled as to what is going on though. I might investigate some more when I get the chance.
@julianbrunner I suggest you open an issue on the ghc issue tracker about this. It looks like ghc's strictness analyzer is not interacting well with llvm, which I am more than sure is not Vector/ByteArray specific and can pop up in other scenarios. I am closing this issue, as there is nothing we can do in vector to alleviate this besides suggesting that users be more strict.
I have encountered a situation where freeze is actually faster than unsafeFreeze. Minimal example:
Compiled with ghc -O2 -fllvm freeze.hs, this gives me a mean running time of 985.8μs when using unsafeFreeze and 642.8μs when using freeze. How can this be? Shouldn't unsafeFreeze always be faster since it performs strictly less work?
I am using ghc 8.10.5 and llvm 12.0.1.