-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use qsort to sort short ByteString #267
Conversation
I am curious why the performance of this function is worthy of attention? What is the use-case for sorting the characters of short bytestrings? |
Co-authored-by: Viktor Dukhovni <[email protected]>
I actually struggle to find an example of sorting long strings, but for short ones a common use case is checking that two words are anagrams of each other: areAnagrams xs ys = sort xs == sort ys Words are usually short, and for the range of 5-10 symbols |
Sure, when do we care about the performance of such checks? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for the performance improvements. #123 indicated that at least one person cared about this.
I can't really comment on the implementation – it has been a long time since I last used C.
There's no meaningful C code here, just a call to |
Data/ByteString/Internal.hs
Outdated
|
||
foreign import ccall unsafe "static fpstring.h fps_count" c_count | ||
:: Ptr Word8 -> CULong -> Word8 -> IO CULong | ||
:: Ptr Word8 -> CSize -> Word8 -> IO CULong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also return a CSize
:: Ptr Word8 -> CSize -> Word8 -> IO CULong | |
:: Ptr Word8 -> CSize -> Word8 -> IO CSize |
cbits/fpstring.c
Outdated
@@ -73,7 +73,7 @@ unsigned char fps_minimum(unsigned char *p, unsigned long len) { | |||
} | |||
|
|||
/* count the number of occurences of a char in a string */ | |||
unsigned long fps_count(unsigned char *p, unsigned long len, unsigned char w) { | |||
unsigned long fps_count(unsigned char *p, size_t len, unsigned char w) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned long fps_count(unsigned char *p, size_t len, unsigned char w) { | |
size_t fps_count(unsigned char *p, size_t len, unsigned char w) { |
And also in the declartion of c
below.
include/fpstring.h
Outdated
void fps_intersperse(unsigned char *dest, unsigned char *from, size_t len, unsigned char c); | ||
unsigned char fps_maximum(unsigned char *p, size_t len); | ||
unsigned char fps_minimum(unsigned char *p, size_t len); | ||
unsigned long fps_count(unsigned char *p, size_t len, unsigned char w); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned long fps_count(unsigned char *p, size_t len, unsigned char w); | |
size_t fps_count(unsigned char *p, size_t len, unsigned char w); |
Data/ByteString/Internal.hs
Outdated
c_intersperse, -- :: Ptr Word8 -> Ptr Word8 -> CSize -> Word8 -> IO () | ||
c_maximum, -- :: Ptr Word8 -> CSize -> IO Word8 | ||
c_minimum, -- :: Ptr Word8 -> CSize -> IO Word8 | ||
c_count, -- :: Ptr Word8 -> CSize -> Word8 -> IO CInt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c_count, -- :: Ptr Word8 -> CSize -> Word8 -> IO CInt | |
c_count, -- :: Ptr Word8 -> CSize -> Word8 -> IO CSize |
This could affect all sites that don't use fromIntegral
, but that's probably already handled, because CInt
isn't a priori any particular one of the normal Haskell integral types, so the fromIntegral
should already be there.
@vdukhovni thanks, very much appreciated. It's nice to have someone knowledgeable in C :)
I do not have any production application in mind. Honestly, I just do not like #123 lingering around any more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for "bonus points" you could also update:
diff --git a/Data/ByteString/Internal.hs b/Data/ByteString/Internal.hs
index 3687603..dc98976 100644
--- a/Data/ByteString/Internal.hs
+++ b/Data/ByteString/Internal.hs
@@ -102,7 +101,7 @@ import Foreign.Storable (Storable(..))
#if MIN_VERSION_base(4,5,0) || __GLASGOW_HASKELL__ >= 703
-import Foreign.C.Types (CInt(..), CSize(..), CULong(..))
+import Foreign.C.Types (CInt(..), CSize(..))
#else
-import Foreign.C.Types (CInt, CSize, CULong)
+import Foreign.C.Types (CInt, CSize)
#endif
diff --git a/Data/ByteString/Short/Internal.hs b/Data/ByteString/Short/Internal.hs
index 3c945c7..79f1533 100644
--- a/Data/ByteString/Short/Internal.hs
+++ b/Data/ByteString/Short/Internal.hs
@@ -588,5 +588,5 @@ copyAddrToByteArray0 src dst len s =
foreign import ccall unsafe "fpstring.h fps_memcpy_offsets"
- memcpy_AddrToByteArray :: MutableByteArray# s -> CLong -> Addr# -> CLong -> CSize -> IO ()
+ memcpy_AddrToByteArray :: MutableByteArray# s -> CSize -> Addr# -> CSize -> CSize -> IO ()
foreign import ccall unsafe "string.h memcpy"
@@ -609,5 +609,5 @@ copyByteArrayToAddr0 src dst len s =
foreign import ccall unsafe "fpstring.h fps_memcpy_offsets"
- memcpy_ByteArrayToAddr :: Addr# -> CLong -> ByteArray# -> CLong -> CSize -> IO ()
+ memcpy_ByteArrayToAddr :: Addr# -> CSize -> ByteArray# -> CSize -> CSize -> IO ()
foreign import ccall unsafe "string.h memcpy"
@@ -636,6 +636,6 @@ copyByteArray# src src_off dst dst_off len s =
foreign import ccall unsafe "fpstring.h fps_memcpy_offsets"
- memcpy_ByteArray :: MutableByteArray# s -> CLong
- -> ByteArray# -> CLong -> CSize -> IO ()
+ memcpy_ByteArray :: MutableByteArray# s -> CSize
+ -> ByteArray# -> CSize -> CSize -> IO ()
#endif
diff --git a/cbits/fpstring.c b/cbits/fpstring.c
index 71dc750..165d296 100644
--- a/cbits/fpstring.c
+++ b/cbits/fpstring.c
@@ -85,6 +85,6 @@ size_t fps_count(unsigned char *p, size_t len, unsigned char w) {
We cannot construct a pointer to the interior of an unpinned ByteArray#,
except by doing an unsafe ffi call, and adjusting the pointer C-side. */
-void * fps_memcpy_offsets(void *dst, unsigned long dst_off,
- const void *src, unsigned long src_off, size_t n) {
+void * fps_memcpy_offsets(void *dst, size_t dst_off,
+ const void *src, size_t src_off, size_t n) {
return memcpy(dst + dst_off, src + src_off, n);
}
Mind you, some of that code goes away with my "drop 7.x" patch, which also drops |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But the builds are now failing. There's some sort of clong
function that produced explicit CLong
for some of these, rather than just use fromIntegral
.
Data/ByteString/Short/Internal.hs:576:38:
Couldn't match expected type `CSize' with actual type `CLong'
In the return type of a call of `clong'
In the second argument of `memcpy_AddrToByteArray', namely
`(clong dst_off)'
In the first argument of `unIO_', namely
`(memcpy_AddrToByteArray dst (clong dst_off) src 0 (csize len))'
Data/ByteString/Short/Internal.hs:597:44:
Couldn't match expected type `CSize' with actual type `CLong'
In the return type of a call of `clong'
In the fourth argument of `memcpy_ByteArrayToAddr', namely
`(clong src_off)'
In the first argument of `unIO_', namely
`(memcpy_ByteArrayToAddr dst 0 src (clong src_off) (csize len))'
Data/ByteString/Short/Internal.hs:632:30:
Couldn't match expected type `CSize' with actual type `CLong'
In the return type of a call of `clong'
In the second argument of `memcpy_ByteArray', namely
`(clong dst_off)'
In the first argument of `unsafeIOToST', namely
`(memcpy_ByteArray
dst (clong dst_off) src (clong src_off) (csize len))'
So also need: diff --git a/Data/ByteString/Short/Internal.hs b/Data/ByteString/Short/Internal.hs
index 54f77d9..47525fd 100644
--- a/Data/ByteString/Short/Internal.hs
+++ b/Data/ByteString/Short/Internal.hs
@@ -575,3 +575,3 @@ copyByteArrayToAddr# = GHC.Exts.copyByteArrayToAddr#
copyAddrToByteArray# src dst dst_off len s =
- unIO_ (memcpy_AddrToByteArray dst (clong dst_off) src 0 (csize len)) s
+ unIO_ (memcpy_AddrToByteArray dst (csize dst_off) src 0 (csize len)) s
@@ -596,3 +596,3 @@ foreign import ccall unsafe "string.h memcpy"
copyByteArrayToAddr# src src_off dst len s =
- unIO_ (memcpy_ByteArrayToAddr dst 0 src (clong src_off) (csize len)) s
+ unIO_ (memcpy_ByteArrayToAddr dst 0 src (csize src_off) (csize len)) s
@@ -619,4 +619,4 @@ unIO_ io s = case unIO io s of (# s, _ #) -> s
-clong :: Int# -> CLong
-clong i# = fromIntegral (I# i#)
+csize :: Int# -> CSize
+csize i# = fromIntegral (I# i#)
@@ -631,3 +631,3 @@ copyByteArray# src src_off dst dst_off len s =
unST_ (unsafeIOToST
- (memcpy_ByteArray dst (clong dst_off) src (clong src_off) (csize len))) s
+ (memcpy_ByteArray dst (csize dst_off) src (csize src_off) (csize len))) s
where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. By the way, this assumes that the 'source_offset' used with the bytearray copy functions is always non-negative. I think that's a correct assumption, and the function I think is always in fact used with an offset of 0. If we really wanted negative offsets, we'd need an ssize_t
and Haskell equivalent, but I do not think this is needed. Instead we could perhaps "assert >=0" in the input to "csize", and run an unoptimized build to test.
@vdukhovni thanks for thorough review! Given that |
P.S. We seem to have neglected one of the changes I suggested, but failed to notice did not get included:
Not critical, but worth doing, just to be consistent all around. |
This is similar in spirit to #123, but includes benchmarks to reproduce results. I also decided to use
qsort
fromstdlib
instead of implementing insertion sort manually, just to minimise changes.(I have no idea about C, suggestions would be highly appreciated, CC @vdukhovni)