byteArrayToByteString, and byteStringToByteArray? #186
Isn't this basically the conversion to/from ShortByteString? If I understand it correctly,
(Slightly related: in the past I also used http://hackage.haskell.org/package/bytestring-plain-0.1.0.2/docs/Data-ByteString-Plain.html, which had a different cost model, before there was ShortByteString. Since ShortByteString I almost always use that one, probably in the very use cases where you seem to prefer ByteArray.)
I think @hvr is right. One can use:
import Data.ByteString.Short.Internal (ShortByteString(..), fromShort, toShort)
byteArrayToByteString :: ByteArray -> ByteString
byteArrayToByteString (ByteArray b#) = fromShort (SBS b#)
byteStringToByteArray :: ByteString -> ByteArray
byteStringToByteArray bs = let SBS b# = toShort bs in ByteArray b#
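For anyone who wants to try the two conversions above as a standalone module, here is a minimal sketch. It assumes the `bytestring` and `primitive` packages are available; `roundTrips` is a hypothetical helper added only for illustration. Both directions copy the payload, since `toShort` and `fromShort` each allocate a fresh buffer.

```haskell
{-# LANGUAGE MagicHash #-}

import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.ByteString.Short.Internal (ShortByteString(..), fromShort, toShort)
import Data.Primitive.ByteArray (ByteArray(..), sizeofByteArray)

-- Copying conversion: wrap the ByteArray# as a ShortByteString, then copy
-- it into a freshly allocated ByteString.
byteArrayToByteString :: ByteArray -> ByteString
byteArrayToByteString (ByteArray b#) = fromShort (SBS b#)

-- Copying conversion: copy the ByteString into a ShortByteString, then
-- rewrap its ByteArray#.
byteStringToByteArray :: ByteString -> ByteArray
byteStringToByteArray bs = case toShort bs of
  SBS b# -> ByteArray b#

-- Hypothetical helper (not from the thread): checks that the length and the
-- contents survive a trip through ByteArray and back.
roundTrips :: ByteString -> Bool
roundTrips bs =
  let arr = byteStringToByteArray bs
   in sizeofByteArray arr == BS.length bs
        && byteArrayToByteString arr == bs
```

In GHCi, `roundTrips (BS.pack [0 .. 255])` exercises both functions on every byte value.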
Revisiting this, I believe it is possible to do this without copying, if the ForeignPtr allows it. Some example code:
{-# language MagicHash, TypeApplications, UnboxedTuples #-}
import Data.ByteString.Internal (ByteString(..))
import Data.ByteString.Short (ShortByteString(..), toShort)
import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(..))
import GHC.Exts
  ( MutableByteArray#, ByteArray#, Addr#, Int(..), RealWorld,
    mutableByteArrayContents#, eqAddr#, isTrue#, runRW#, unsafeFreezeByteArray#,
    getSizeofMutableByteArray#, (==#)
  )
import Data.Primitive.ByteArray(ByteArray(..))
import qualified Data.ByteString as BS
import qualified Data.List as List
import qualified Data.Char as Char
import Data.Word (Word8)
import GHC.Exts (toList)
byteStringToByteArray :: ByteString -> ByteArray
byteStringToByteArray b@(BS (ForeignPtr addr# contents) (I# len#)) = case contents of
    MallocPtr marr# _ -> ByteArray (go marr# addr#)
    PlainPtr marr# -> ByteArray (go marr# addr#)
    _ -> ByteArray (byteStringToByteArrayCopy b)
  where
    go :: MutableByteArray# RealWorld -> Addr# -> ByteArray#
    go marr# addr# =
      let marrAddr# = mutableByteArrayContents# marr#
       in if isTrue# (eqAddr# addr# marrAddr#)
            then runRW#
              (\s0 -> case getSizeofMutableByteArray# marr# s0 of
                (# s1, marrLen# #) ->
                  if isTrue# (marrLen# ==# len#)
                    then case unsafeFreezeByteArray# marr# s1 of
                      (# _, r #) -> r
                    else byteStringToByteArrayCopy b
              )
            else byteStringToByteArrayCopy b
byteStringToByteArrayCopy :: ByteString -> ByteArray#
byteStringToByteArrayCopy s = case toShort s of { SBS arr -> arr; }
main :: IO ()
main = do
  let b = BS.pack [0x20..0x7e]
  let r = byteStringToByteArray b
  let byteToChar byte = Char.chr (fromIntegral @Word8 @Int byte)
  let bChars = List.map byteToChar (BS.unpack b)
  let rChars = List.map byteToChar (toList r)
  print (bChars == rChars)
This can just return a ShortByteString instead if ByteString doesn't want to concern itself with lifted ByteArray.
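The example above only covers ByteString to ByteArray. For the other direction, a similar trick seems possible when the ByteArray is pinned: its address is stable, so a ForeignPtr can point at the payload directly while keeping the array alive. The following is a rough sketch, assuming `bytestring` >= 0.11 (for the `BS` constructor) and a recent `primitive`. The names `pinnedToByteString`, `byteArrayToByteStringShared`, and `pinnedFromList` are hypothetical and not part of either library.

```haskell
{-# LANGUAGE MagicHash #-}

import Control.Monad.ST (runST)
import qualified Data.ByteString as BS
import Data.ByteString.Internal (ByteString(..))
import Data.Maybe (fromMaybe)
import Data.Primitive.ByteArray
  ( ByteArray(..), foldrByteArray, isByteArrayPinned, newPinnedByteArray
  , sizeofByteArray, unsafeFreezeByteArray, writeByteArray )
import Data.Word (Word8)
import GHC.Exts (byteArrayContents#, unsafeCoerce#)
import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(..))

-- Hypothetical helper: share the payload when the array is pinned.
-- PlainPtr keeps a reference to the array, so the payload stays alive for
-- as long as the ByteString does. The unsafeCoerce# only changes the static
-- type from ByteArray# to MutableByteArray#; nothing writes through it here.
pinnedToByteString :: ByteArray -> Maybe ByteString
pinnedToByteString arr@(ByteArray arr#)
  | isByteArrayPinned arr =
      Just (BS (ForeignPtr (byteArrayContents# arr#) (PlainPtr (unsafeCoerce# arr#)))
               (sizeofByteArray arr))
  | otherwise = Nothing

-- Hypothetical helper: total conversion that shares when it can and
-- otherwise falls back to a plain element-by-element copy.
byteArrayToByteStringShared :: ByteArray -> ByteString
byteArrayToByteStringShared arr =
  fromMaybe (BS.pack (foldrByteArray (:) [] arr)) (pinnedToByteString arr)

-- Hypothetical helper: build a small pinned array for trying the above out.
pinnedFromList :: [Word8] -> ByteArray
pinnedFromList ws = runST $ do
  marr <- newPinnedByteArray (length ws)
  mapM_ (\(i, w) -> writeByteArray marr i w) (zip [0 ..] ws)
  unsafeFreezeByteArray marr
```

Small arrays built with the ordinary allocation functions are normally unpinned, so in practice the shared path only applies to arrays that were explicitly allocated as pinned or are large enough for the runtime to pin them.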
#547 does some similar stuff specifically to better support IO operations. As discussed there, I have serious reservations about the idea because of its surprising interaction with compact regions. But there are several tricky things to get right in the implementation, and I suppose it is a rather fundamental operation. Perhaps a version of it can live in
i find myself using the following more often now:
is it possible that these could be added somewhere? i work with ByteArray a lot more than i do bytestring, but there are places where ByteString is necessary because of other apis (aeson, binary, etc.)