-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow the result of unsafeCreate to be unboxed #580
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to write a test for Core using tasty-inspection-testing
?
I don't have any experience working with it yet, but I'd guess so. The expectations we have about unboxing will depend on the ghc version used. |
https://github.com/Bodigrim/tasty-inspection-testing#readme contains some examples how to write inspection tests. It's fine to limit them to GHC 9.0+ only. |
Cabal isn't happy when I add |
Ah, right, I had a good experience with Can you describe what kind of manual test can check that result of |
@clyring could you please rebase to get a clean CI run? |
Data/ByteString/Internal/Type.hs
Outdated
(# s1, addr1# #) -> (# s1, ForeignPtr addr1# guts #) | ||
|
||
unsafeDupablePerformIO :: IO a -> a | ||
-- Why does this exist? As of base-4.18.0.0, the version of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change first appear in base-4.18
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There has never been good unboxing through unsafeDupablePerformIO
:
- Before 9.0.1,
runRW#
is too opaque for ghc to ever unbox the result ofunsafeDupablePerformIO
. - In
ghc-9.0.1
/base-4.15.0.0
, ghc starts pushing strict contexts intorunRW#
and can therefore sometimes unbox the result of anunsafeDupablePerformIO
call if it gets inlined into a strict context that discards the box. But, the CPR analysis used for worker/wrapper does not know aboutrunRW#
so this unboxing happens only through inlining, and not through worker/wrapper. - In all existing later versions (including 9.0.2 and 9.2.1 through 9.6.2),
- CPR analysis knows that the result of a
runRW#
call can be unboxed, so worker/wrapper improves in the presence ofrunRW#
, but - the definition of
unsafeDupablePerformIO
inbase
gets alazy
which prevents unboxing from happening for this function.
- CPR analysis knows that the result of a
Benchmarks of strict
|
@Bodigrim I have tried and failed to reproduce any scanl/scanr regressions on my machine; I've tried with all of ghc 9.2.8, 9.4.5, and 9.6.2. Are these consistent for you? |
The measurements above are for GHC 9.6.1 + aarch64. Indeed there are no regressions for GHC 9.2.7 or GHC 9.4.5. I tried GHC 9.6.2 and it seems to fix the issue: I still see some changes for short bytestrings, but no changes for longer ones, so it's likely to be just noise. LGTM! |
* Allow the result of unsafeCreate to be unboxed * Fix build with old versions of ghc * Add hackage source link for referenced Note * Improvement documentation for the new functions * Publicly export deferForeignPtrAvailability * Add convenience function `mkDeferredByteString` * remove extra '
By removing the
lazy
from the implementation ofunsafeDupablePerformIO
, this patch allows the simplifier to unbox intermediateStrictByteString
s in many more situations.This improved freedom for the simplifier comes at a price: It creates opportunities for the simplifier to perform reads at the newly-unboxed addresses before the buffer has actually been initialized. To prevent this from causing problems, the
deferForeignPtrAvailability
function is introduced. (I'm not entirely sure this provides enough protection in a multi-threaded context on all architectures. Maybe @bgamari can comment? But it's at least no more unsafe than theShortByteString
stuff already is.)Supersedes #466.