Boxed Vectors can crash during compaction #220
Comments
So it sounds like fromList should do a proper cleanup after the initial
construction?
On Mon, Aug 27, 2018 at 12:13 PM Ben Gamari wrote:
Consider the program
import Data.Compact
import Data.Vector as V
main :: IO ()
main = do
  compactVec <- compact myVector
  pure ()
myVector :: Vector Int
myVector = fromList [1..10]
Currently it crashes with
compact-vector: Data.Vector.Mutable: uninitialised element
since the (over-sized) accumulator used by fromList is initially filled
with bottoms but never shrunk. The compactor consequently traces these
bottoms, blowing up the program.
We should try harder to shrink the accumulator down to the proper size
after we have finished building the vector.
|
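The mechanism described in the report can be reproduced without vector. The following is a minimal sketch, assuming only base and the ghc-compact package that ships with GHC. compactList is a hypothetical helper, and a list holding an error thunk stands in for the unwritten tail of the accumulator. Because compact fully evaluates whatever it copies, the hidden bottom surfaces as an ordinary exception:

```haskell
import Control.Exception (ErrorCall (..), try)
import GHC.Compact (compact, getCompact)

-- | Hypothetical helper: try to compact a list and report the message of
-- any 'error' call that compaction forces along the way.
compactList :: [Int] -> IO (Either String [Int])
compactList xs = do
  result <- try (compact xs)
  pure $ case result of
    Left (ErrorCall msg) -> Left msg                    -- a bottom was forced
    Right region         -> Right (getCompact region)   -- everything was defined
```

A fully defined list such as [1, 2, 3] is expected to come back as Right, while a list like [1, error "uninitialised element"] is expected to come back as Left with that message, even though nothing ever asked for the second element directly.
|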
Since the … Can we prioritize fixing this defect? |
See #221 for a pull request addressing the traverse function. |
thanks! i'll take a look
|
…oxed vectors. (gh pr #221, fixes bug #220) previously traverse used unknown size fromList rather fromListN for constructing Boxed vectors. In the presence of compact regions the implementation strategy for fromList results in program crashes. Now traverse on Boxed vectors uses the input vector size for constructing the result vector.
this should be fixed now |
The commit ccf2260 doesn't fix the problem with compaction. The crash is still happening.
For example, this program (ghc-8.4.4, vector-0.12.0.2):
module Main where
import qualified Data.Vector as V
import GHC.Compact
main :: IO ()
main = do
  c <- compact $ V.fromList [1, 2, 3]
  -- c <- compact $ V.fromListN 3 [1,2,3] -- OK
  -- c <- compact $ V.force $ V.fromList [1,2,3] -- OK
  -- c <- compact $ V.fromList $ replicate (2^3) 1 -- OK
  -- c <- compact $ V.fromList $ replicate (2^5) 1 -- OK
  -- c <- compact $ V.fromList $ replicate (2^10) 1 -- OK
  -- c <- compact $ V.fromList $ replicate (2^3+1) 1 -- crashes
  -- c <- compact $ V.fromList $ replicate (2^5+1) 1 -- crashes
  -- c <- compact $ V.fromList $ replicate (2^10+1) 1 -- crashes
  pure ()
crashes with an uninitialised element exception:
*** Exception: Data.Vector.Mutable: uninitialised element
CallStack (from HasCallStack):
  error, called at ./Data/Vector/Mutable.hs:188:17 in vector-0.12.0.2-4IpdnxtqTfNJ9xEZNSAM2c:Data.Vector.Mutable
It doesn't crash if
- the vector is created using the V.fromListN function
- the V.force function is called
- the length of the vector is a power of 2
The problem is that compacting a Vector causes the underlying boxed Array to be fully evaluated. This Array may contain bottoms.
The V.fromList function assumes that the maximum size of the stream is unknown:
https://github.com/haskell/vector/blob/65cfe828f5bddf59fd5aaaead96c5ad45ecd7a8d/Data/Vector/Fusion/Bundle/Monadic.hs#L1003-L1005
As a result, the underlying Array grows exponentially (2x):
https://github.com/haskell/vector/blob/cc06420eaa597b85ffd8ded9f82aac1a3fc02c18/Data/Vector/Generic/Mutable.hs#L385-L393
and new elements are filled with error "Data.Vector.Mutable: uninitialised element":
https://github.com/haskell/vector/blob/cc06420eaa597b85ffd8ded9f82aac1a3fc02c18/Data/Vector/Mutable.hs#L107-L110
|
Darn. Hrmm. There’s not really a good way to fix fromList that doesn’t hurt
some users. I suppose I just fixed the issue with traverse!
At some level, I guess this is also an issue with compact heaps
themselves. It seems like a deep copy is needed in general to prepare for
compact heaps when there are any hidden undefineds.
|
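The power-of-two pattern in the report follows from the doubling growth described above. The sketch below is a small model of that behaviour, not vector's actual code: capacityAfter and slack are hypothetical names, and the starting capacity of zero and the "grow by at least one slot" rule are assumptions chosen to match the reported OK and crashing cases.

```haskell
-- | Capacity of a buffer that starts empty and doubles whenever it is full
-- (growing by at least one slot), once @n@ elements have been appended.
-- This is a model of the described growth strategy, not vector's own code.
capacityAfter :: Int -> Int
capacityAfter n = go 0
  where
    go cap
      | cap >= n  = cap                      -- enough room for all n elements
      | otherwise = go (cap + max cap 1)     -- double (0 grows to 1)

-- | Number of trailing slots that were allocated but never written; in the
-- scenario above these are the slots still holding the error placeholder.
slack :: Int -> Int
slack n = capacityAfter n - n
```

Under this model, map slack [3, 8, 9, 32, 33] evaluates to [1, 0, 7, 0, 31]: lengths that are exact powers of two leave no unwritten slots, which lines up with the cases marked OK, while every other length leaves at least one placeholder for the compactor to force.
|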
There’s certainly going to be ways this can be improved. But Compact heaps
have a lot of other gotchas too. Like no closures. And vector certainly
can’t help with that.
|
I think the biggest problem here is an unhelpful error message: "Data.Vector.Mutable: uninitialised element" (the only way to understand it is to inspect the code of the vector library).
For example, when I try to compact a function, the error message is pretty clear:
λ compact (\x -> x)
*** Exception: compaction failed: cannot compact functions
The error message from the vector library could be more descriptive, something like this: "Data.Vector.Mutable: uninitialised element. Direct access to the underlying Array is an unsafe operation. Use the force function to remove uninitialised elements from the Array."
|
Oh. That’s a good idea. I’m more than happy to do a bug fix for that.
One concern I have is: is the error coming from the compaction code or
from the error closures in vector? It sounds like it might be the former?
Your example error message makes me think it’s coming from the compaction
code.
If so, it’s a GHC issue, and the fix would be improved error reporting
on the GHC side for compaction failures.
|
The error "Data.Vector.Mutable: uninitialised element" is coming from the error closure in vector (from an element in an … The …
|
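One way to tell the two sources apart at runtime is to look at the type of the exception that escapes compact. This is a rough sketch under the assumption that a forced error closure arrives as an ErrorCall while anything else was raised elsewhere; diagnose and classify are hypothetical helper names, and only base and ghc-compact are used.

```haskell
import Control.Exception (ErrorCall (..), SomeException, fromException, try)
import GHC.Compact (compact)

-- | Hypothetical helper: run 'compact' on a value and describe the outcome.
diagnose :: a -> IO String
diagnose x = do
  result <- try (compact x)
  pure $ case result of
    Right _ -> "compacted"
    Left e  -> classify e

-- | An 'ErrorCall' means some 'error' thunk inside the data was forced;
-- any other exception type is reported as coming from elsewhere.
classify :: SomeException -> String
classify e = case fromException e of
  Just (ErrorCall msg) -> "error closure in the data: " ++ msg
  Nothing              -> "raised elsewhere: " ++ show e
```

With this split, the "uninitialised element" message would be expected in the first branch, and a failure such as the "cannot compact functions" example quoted earlier would be expected in the second.
|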
…ement' error message Compaction of an immutable vector may crash with the confusing 'uninitialised element' error message (haskell#220). The workaround is to use the 'force' function.
This issue is not fixed!
We hit this issue again. It took a week of developer time to track down the source. It was a call to fromList that was later used as an argument to compact.
Is it possible for fromList to correctly clean up after the dynamic allocation?
Is it possible for a more descriptive error message to be raised when this occurs?
Is it possible for the documentation on Hackage to draw explicit attention to this defect and discourage the use of fromList in code where you might invoke compact or the consumer of your library might invoke compact?
|
I worry that any cleanup pass for fromList might mess with how the vector
fusion framework works. I’ll have to think about it, though, and it might be
totally fine.
For now: don’t use fromList! Use fromListN! :)
And/or do a deep copy before you compact.
I worry that any fix for compact-heap-specific issues will in general
penalize all other code (the double allocation issue) or have unforeseen
complications elsewhere in the fusion framework of vector.
I’m open to design approaches that make this easier to resolve, but the
moment it complicates vector or regresses performance for normal vector
usage in general, I regard it as a GHC issue with using compact heaps.
At a certain level this problem is about compact heaps not supporting
anything that’s not first order. So perhaps we should also think about how
to improve compact heap tooling in GHC as another path forward to push
along?
Did you try using debug symbols on Linux to track down the issue, and/or
profiling build stack traces?
|
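The "deep copy before you compact" workaround can be wrapped up so that call sites cannot forget it. A minimal sketch, assuming the vector and ghc-compact packages; compactVector is a hypothetical helper, not part of either library. It relies on the thread's observation that applying V.force first avoids the crash.

```haskell
import qualified Data.Vector as V
import GHC.Compact (Compact, compact, getCompact)

-- | Hypothetical helper: copy the vector into a freshly allocated,
-- exactly-sized array with 'V.force' before handing it to 'compact', so no
-- unwritten slots from an over-sized accumulator are reachable.
compactVector :: V.Vector a -> IO (Compact (V.Vector a))
compactVector = compact . V.force
```

For instance, compactVector (V.fromList (replicate 9 1)) corresponds to one of the crashing cases listed earlier with V.force applied first. The cost is one extra copy of the vector, which is the double allocation concern raised above, paid only at the compaction site.
|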
I thought about it some more: I’m open to changing fromList to do a
length calculation and then call fromListN k
... probably.
I’m not 100 percent confident that this actually is a good idea. I’ll need
to check whether this could change certain vector code from running in
constant space to having a huge space blowup.
If my choices are creating huge space
blowups for some users vs suggesting that heavy compact heap users sanitize
their data structures better, I’m going to suggest the latter.
If no such examples exist, we’re in business. But I think that’s
gonna be the blocker for addressing your problem.
|
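The "length calculation, then fromListN" idea can be tried from user code without changing vector. The following is a sketch only, assuming the vector package; fromListExact is a hypothetical name and this is not how vector implements fromList.

```haskell
import qualified Data.Vector as V

-- | Hypothetical alternative to 'V.fromList': measure the list first, then
-- build a vector of exactly that size with 'V.fromListN'.
fromListExact :: [a] -> V.Vector a
fromListExact xs = V.fromListN (length xs) xs
```

The trade-off is the one raised in the comment above: length walks the whole list before any element is written, so the entire list is held in memory at once and a lazily produced list can no longer be consumed incrementally.
|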
One important aspect to this issue is that building with profiling does not help! In particular this error is only hit when building without profiling. I think this is due to the interaction between compact-regions, the garbage collector and stack traces but @bgamari probably has a much better understanding of why this happens. In any case this makes the need for better error messages quite important since it is not possible to get a stack trace to see where the error message was created. |
The best fix for this would be for someone to implement https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0025-resize-boxed.rst. This proposal was accepted a while ago, but no one has added the new primop for shrinking arrays yet. If vector used this, there would be no performance penalty, no stream-fusion breakage, and compact regions would work correctly since they would only copy the "live" part of the array. |
@cartazio @andrewthad, I'm also concerned about good performance and stream fusion. I don't want correcting this interaction with compact regions to negatively impact other users if the … I have a bandaid solution in place using a custom HLint rule to warn about using … I don't know enough about the stream fusion internals of …
I think that step 6 is what @andrewthad is suggesting as an option made available from implementing the linked GHC proposal. That is certainly the best, most efficient option if my understanding is correct. However, getting the new functionality into … |
Also, can we reopen this issue? |
Ohyah, an in-place shrink would be enough. We don’t need resize for that.
I think that could be done very evilly via m
|
I will look later this week to see if a cleanup is safely possible for
fromList and how it uses the vector fusion system.
|
The real issue is that the compact heap code in the GHC RTS and base libs
does not handle errors correctly. That it straight up crashes is a bug and
should be treated as such on the GHC side.
|
@cartazio given that the … |
Yes, it's both? I think the compact package just wraps up the stuff in GHC proper, so it's an error handling/debugging support issue. Crashing is fine for Haskell stuff, but if it's HQ-owned stuff, it shouldn't break everyone's debugging. I dug into what fromList does in vector, and I think the fix (if applicable) would be pretty tricky at the vector layer:
-- Bundle.Monadic
-- | Convert the first @n@ elements of a list to a 'Bundle'
fromListN :: Monad m => Int -> [a] -> Bundle m v a
{-# INLINE_FUSED fromListN #-}
fromListN n xs = fromStream (S.fromListN n xs) (Max (delay_inline max n 0))
fromList :: Monad m => [a] -> Bundle m v a
{-# INLINE fromList #-}
fromList xs = unsafeFromList Unknown xs
-- | Convert a list to a 'Bundle' with the given 'Size' hint.
unsafeFromList :: Monad m => Size -> [a] -> Bundle m v a
{-# INLINE_FUSED unsafeFromList #-}
unsafeFromList sz xs = fromStream (S.fromList xs) sz
-- which is then used in Bundle
-- | Create a 'Bundle' from a list
fromList :: [a] -> Bundle v a
{-# INLINE fromList #-}
fromList = M.fromList
-- which is then used in Vector.Generic
-- | /O(n)/ Convert a list to a vector
fromList :: Vector v a => [a] -> v a
{-# INLINE fromList #-}
fromList = unstream . Bundle.fromList
-- which is then used in Vector
-- | /O(n)/ Convert a list to a vector
fromList :: [a] -> Vector a
{-# INLINE fromList #-}
fromList = G.fromList
As you can see, unstream is defined in terms of …
It seems that messing with unstream interacts pretty fundamentally with some of the fusion rules, so I'd rather we improve debugging support first. |
We've just been bitten by this as well and could only figure out what's going on after many many hours of debugging. The fact that something like this can happen means compaction and … |