Segfaults and "internal errors" when vector slice overflows #257
Looks to me like the bug is here: vector/Data/Vector/Internal/Check.hs Lines 147 to 151 in 46110b9
Specifically, |
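A minimal sketch of how a slice bounds check can overflow, assuming the check has the shape `i >= 0 && m >= 0 && i + m <= n` (start `i`, size `m`, vector length `n`). The names `naiveSliceOk` and `safeSliceOk` are hypothetical and not part of vector. With a huge `m`, `i + m` wraps around to a negative `Int` in GHC and the naive check passes; comparing `m <= n - i` avoids the addition:

```haskell
module Main where

-- Naive check (assumed shape): 'i + m' can wrap around for a large 'm'.
naiveSliceOk :: Int -> Int -> Int -> Bool
naiveSliceOk i m n = i >= 0 && m >= 0 && i + m <= n

-- Overflow-safe variant: with i >= 0 and n >= 0, 'n - i' cannot overflow.
safeSliceOk :: Int -> Int -> Int -> Bool
safeSliceOk i m n = i >= 0 && m >= 0 && m <= n - i

main :: IO ()
main = do
  print (naiveSliceOk 1 maxBound 5) -- True: the invalid slice is accepted
  print (safeSliceOk 1 maxBound 5)  -- False: the invalid slice is rejected
  print (safeSliceOk 1 4 5)         -- True: a valid slice still passes
```

The safe variant relies on the length `n` being non-negative, which holds for any real vector length.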
This is great, how many ways can we get the slice wrong? :) Surprisingly, the bug reported here is pretty hard to hit in a … The reason for this is that if a vector is constructed from a stream, or any other way really, this rewrite rule will get triggered and …
So, if you put something fun like this instead into

main = do
  let xs = [1, 2, 3, 4, 5] :: [Int]
      v = V.slice 1 (maxBound `div` 8) (V.fromList xs)
  print $ V.length v

You'll get another very nice issue:
But an even greater consequence of this is that too large of a size parameter to

main = do
  let xs = [1, 2, 3, 4, 5] :: [Int]
      mySlice = V.slice 1 8
  v <- do
    mv <- MV.new $ List.length xs
    M.mapM_ (uncurry (MV.write mv)) $ List.zip [0..] xs
    V.freeze mv
  print $ mySlice $ V.fromList xs
  print $ mySlice v

This results in:

$ stack exec -- ghc slice.hs -O1 && ./slice
[1 of 1] Compiling Main ( slice.hs, slice.o ) [Optimisation flags changed]
Linking slice ...
[2,3,4,5]
slice: ./Data/Vector/Generic.hs:396 (slice): invalid slice (1,8,5)
CallStack (from HasCallStack):
error, called at ./Data/Vector/Internal/Check.hs:87:5 in vector-0.12.0.3-LfvlcMFJAcY18uD1Y2O5Ig:Data.Vector.Internal.Check

I got a fix partially implemented for the original issue: WIP in https://github.com/haskell/vector/tree/lehins/257-fix-slice-overflow

The follow-up problem I described has to do with
|
I guess a serious question we collectively should answer is: which semantics should we keep?
slice 10 k [0..k] == [10..k]
or
slice 10 k [0..k] == error
The reason I bring it up is that the former seems to be more prevalent in the wild, and some people might even rely on it. I am slightly leaning towards it, since it reduces the number of partial functions and sort of follows the drop and take philosophy, see vector/Data/Vector/Fusion/Stream/Monadic.hs Lines 313 to 318 in 1bb6b5d
|
Perhaps I’m slow, but could you explain what you mean in terms of start and end coordinates and arrays?
|
Do you mean: should we return an empty slice vs. throw an error?
|
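To make the two candidate semantics concrete, here is a small sketch over plain lists rather than vectors. The names `sliceClamped` and `sliceChecked` are hypothetical, and `Either` stands in for the `error` call so that both outcomes can be printed:

```haskell
module Main where

-- Total semantics: out-of-range arguments are clamped, as with take and drop.
sliceClamped :: Int -> Int -> [a] -> [a]
sliceClamped i n = take n . drop i

-- Partial semantics, made explicit with Either instead of calling 'error'.
sliceChecked :: Int -> Int -> [a] -> Either String [a]
sliceChecked i n xs
  | i >= 0 && n >= 0 && n <= len - i = Right (take n (drop i xs))
  | otherwise = Left ("invalid slice " ++ show (i, n, len))
  where
    len = length xs

main :: IO ()
main = do
  let k = 15 :: Int
  print (sliceClamped 10 k [0 .. k]) -- [10,11,12,13,14,15]
  print (sliceChecked 10 k [0 .. k]) -- Left "invalid slice (10,15,16)"
```

With the first semantics `slice 10 k [0..k]` returns the tail `[10..k]`; with the second it is rejected because fewer than `k` elements remain after index 10.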
For the slow here is what I suggest. (just kidding ;) Make
Code to reproduce above output:

{-# LANGUAGE LambdaCase #-}
module Main where

import Control.Exception
import Control.Monad as M
import Data.List as List
import qualified Data.Vector as V
import qualified Data.Vector.Generic as VG
import qualified Data.Vector.Primitive as VP

-- Reference semantics on lists: total, clamps out-of-range arguments.
sliceList :: Int -> Int -> [a] -> [a]
sliceList i n xs = List.take n (List.drop i xs)

-- Proposed vector slice with the same total semantics.
sliceVectorProposed :: VG.Vector v a => Int -> Int -> v a -> v a
sliceVectorProposed i n xs = VG.take n (VG.drop i xs)

-- Force the result to WHNF and print either the value or the error message.
tryErrorPrint :: Show a => String -> a -> IO ()
tryErrorPrint prefix doSlice = do
  putStr prefix
  try (pure $! doSlice) >>= \case
    Left (ErrorCall err) -> putStrLn ('\n':err)
    Right result -> print result

-- Run one slicing function through valid and invalid argument combinations.
printSlices :: (Show a, Show b) => String -> (Int -> Int -> b -> a) -> b -> IO ()
printSlices name sliceWith xs = do
  putStrLn $ replicate 50 '='
  putStrLn $ "slice (" ++ name ++ "): " ++ show xs
  putStrLn $ " normal: " ++ show (sliceWith 1 3 xs)
  tryErrorPrint " negative ix: " (sliceWith (-2) 2 xs)
  tryErrorPrint " negative size: " (sliceWith 2 (-2) xs)
  tryErrorPrint " negative ix and size: " (sliceWith (-2) (-1) xs)
  tryErrorPrint " too large ix: " (sliceWith 6 2 xs)
  tryErrorPrint " too large size: " (sliceWith 2 6 xs)
  tryErrorPrint " too large ix size: " (sliceWith 6 6 xs)

main :: IO ()
main = do
  let xs = [1, 2, 3, 4, 5] :: [Int]
  printSlices "List" sliceList xs
  printSlices "Vector proposed" sliceVectorProposed $ V.fromList xs
  printSlices "Vector.Boxed current - fused" (\i n -> V.slice i n . V.fromList) xs
  printSlices "Vector.Primitive current - fused" (\i n -> VP.slice i n . VP.fromList) xs
  printSlices "Vector current - unfused" V.slice (V.fromList xs)
|
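One way to read the `take n . drop i` proposal is as a normalisation of the slice arguments before any unchecked slicing happens. The sketch below works on plain `Int`s only; `clampSlice` is a hypothetical helper, not an existing vector function. It maps any `(start, size)` pair to one that is always in bounds for a given length, without adding the two numbers together:

```haskell
module Main where

-- Normalise (start, size) against a length so that
-- 0 <= start' <= len and 0 <= size' <= len - start'.
clampSlice :: Int -> Int -> Int -> (Int, Int)
clampSlice len i n = (i', n')
  where
    i' = max 0 (min len i)        -- same effect as 'drop i'
    n' = max 0 (min (len - i') n) -- same effect as 'take n' on the remainder

main :: IO ()
main = do
  let len = 5
  -- Print each requested (start, size) next to its clamped version.
  mapM_ (\(i, n) -> print ((i, n), clampSlice len i n))
    [(1, 3), (-2, 2), (2, -2), (6, 2), (2, 6), (1, maxBound)]
```

Because `len - i'` always lies between 0 and `len`, even a size of `maxBound` cannot overflow here.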
Of course, on the other end of the spectrum is to always throw an error on invalid indices, but I suspect that in order to achieve that we'll have to break fusion on slicing. The benefit would be that we would get the behavior that a lot of folks expect (but don't always get), namely this one:

==================================================
sliceList(Vector current - unfused): [1,2,3,4,5]
normal: [2,3,4]
negative ix:
./Data/Vector/Generic.hs:396 (slice): invalid slice (-2,2,5)
negative size:
./Data/Vector/Generic.hs:396 (slice): invalid slice (2,-2,5)
negative ix and size:
./Data/Vector/Generic.hs:396 (slice): invalid slice (-2,-1,5)
too large ix:
./Data/Vector/Generic.hs:396 (slice): invalid slice (6,2,5)
too large size:
./Data/Vector/Generic.hs:396 (slice): invalid slice (2,6,5)
too large ix size:
./Data/Vector/Generic.hs:396 (slice): invalid slice (6,6,5)

I am rarely in favor of partial functions, therefore I lean towards fixing the arguments to slice functions so they don't fail. All this conversation applies to mutable variants as well, but naturally not for the streaming ones.
|
There are two legs to this conversation around the semantics of a fix:
1) what’s the best possible array slice API. I have some strong opinions on this that I’ll write down in a day or so. Juggling a few other commitments.
2) making sure users benefit from great optimization as much as possible.
My stance is that if we find fusion tools hindering our intent, then we should think about ways to change GHC so we can make a better vector. I started some rudimentary experiments in December. But the technical barriers we hit should be turned into optimization tech challenges that we collectively change our tools to address.
|
for now / cutting a release: |
Sounds good. I'll then turn off fusion of the |
i mean, it'd be great if we could have fusion AND nice things, but that might be more involved? |
looking at the underlying code,
MVector codes never fuse, that fusion rule doesn't preclude
working.... if we something something? |
by which i mean, we want slice to check correctly whether or not it gets rewritten away? |
hrmm, i need to read through this all more |
If it gets rewritten it can't be checked. In case you wanna have more fun with it like I did last night here are my notes:
|
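A toy sketch of why a rewritten slice cannot be checked. Everything here (`Vec`, `fromListV`, `sliceV`, and the rule name) is a hypothetical stand-in, not vector's actual definitions or rewrite rules. The rule replaces a checked slice of a freshly built vector with an unchecked take/drop, so whether the bounds check runs depends on whether GHC fires the rule, which normally requires optimisation to be enabled:

```haskell
module Main where

import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical stand-in for a vector type.
newtype Vec a = Vec [a] deriving (Eq, Show)

fromListV :: [a] -> Vec a
fromListV = Vec
{-# NOINLINE fromListV #-}

-- Checked slice: fails on out-of-range arguments.
sliceV :: Int -> Int -> Vec a -> Vec a
sliceV i n (Vec xs)
  | i >= 0 && n >= 0 && n <= length xs - i = Vec (take n (drop i xs))
  | otherwise = error ("invalid slice " ++ show (i, n, length xs))
{-# NOINLINE sliceV #-}

-- Fusion-style rule: when it fires, the check in 'sliceV' never runs.
{-# RULES
"sliceV/fromListV" forall i n xs.
  sliceV i n (fromListV xs) = fromListV (take n (drop i xs))
  #-}

main :: IO ()
main = do
  r <- try (evaluate (sliceV 1 8 (fromListV [1, 2, 3, 4, 5 :: Int])))
  case r of
    Left (ErrorCall msg) -> putStrLn ("checked: " ++ msg)
    Right v -> putStrLn ("rewritten: " ++ show v)
```

Compiling this with and without optimisation may print different lines; which one you get depends on the GHC version and flags, which is the same kind of divergence described in this thread.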
* Streaming causes bounds not to be checked, so slicing does not fail, which differs from the semantics of `slice` when compiled with -O0
* Out-of-memory explosion when the size supplied to `slice` is too high
I reported this for GHC first. See https://gitlab.haskell.org/ghc/ghc/issues/17233 for full details (could be copied here if clearly a bug in vector). May be related to #188?