-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added upper case hex builder support #117
Conversation
primMapByteStringBounded would refuse to write to a buffer that had just enough space left to fit an item, incorrectly thinking it had a buffer overflow. Then asking for a new buffer of at least the size to cover the item could result in getting a buffer of that size, causing an infinite loop.
cbits/itoa.c
Outdated
}; | ||
|
||
// unsigned long ints (64 bit words) | ||
char* _hs_bytestring_long_long_uint_hex_upper (long long unsigned int x, char* buf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be safer to use uint64_t
from <stdint.h>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint64_t
means that it requires C99 (or later) and a byte oriented target that has a type that is exactly 64 bits, otherwise it may not exist.
Then there is the issue that it should match the Haskell side, and while there is a CLLong
type (which represents a C long long
type), there seems to be nothing in Foreign.C.Types
that corresponds to a uint64_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth GHC itself requires C99. I believe the appropriate type to use on the Haskell side would be Word64
. IFAIK, the Foreign.C.Types
types exist simply to mirror the rather vague ANSI C types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the Haskell wiki and it seems you are right, it should be able to marshal a Word64
. I will update the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very clean although some of it goes well over my head.
Since this PR includes a bug fix, it's a bit unfortunate that it has been lying around for such a long time. It would be great to release this fix soon!
Regarding the build failures with 7.0 and 7.2, are these fundamental incompatibilities or does this need just a bit more CPP or the like?
@bgamari Are you happy with the C code in itoa.c
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I too could use the combinators with the user supplied widths, so that e.g. I could get the low 16 bits of a 32-bit word without making a copy (though aggressive inlining often eliminates the copies, it can be tricky to get right). To that end I'd also like to a see shift count, so that I could get the upper-16 bits of a 32-bit word, or similar. If we're generalizing bit-range extraction, why just the low bits?
Otherwise, and overall, I think the PR is useful. I just think that perhaps the Int
width is likely better a Word8
, and perhaps <stdint.h>
is still not sufficiently portable...
>$< pairF word32HexFixed word32HexFixed | ||
{-# INLINE encodeWordHexFixedWidth #-} | ||
encodeWordHexFixedWidth :: forall a. (Storable a, Integral a) => Int -> FixedPrim a | ||
encodeWordHexFixedWidth width = fixedPrim width $ c_uint32_fixed_hex (CInt (fromIntegral width)) . fromIntegral |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the width here be unsigned? (e.g. Word8
rather than Int
). Nasty surprises likely await anyone who passes a negative value, and also fixed widths above 255 bytes don't seem to have much utility. So I think there's a case for the width being a Word8
, but a Word
would also be OK(ish).
@@ -4,13 +4,15 @@ | |||
// inspired by: http://www.jb.man.ac.uk/~slowe/cpp/itoa.html // | |||
/////////////////////////////////////////////////////////////// | |||
|
|||
#include <stdio.h> | |||
#include <stdint.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, sufficiently old, but still extant versions of Windows (really Visual Studio) don't have stdint.h. This may warrant confirmation wrt. to the supported Windows releases and toolchains for Haskell.
Just a couple of comments. The reason why I needed these upper case hex characters is that I emit intel-hex and Motorola S-records. These are old formats. At least intel-hex demands upper case hex characters, some readers will not accept lower case hex digits. Now that I also generate code for PDP-11, I get to think that maybe I want octal numbers as well. Those old people (even older than me) probably want to see list files in octal, LOL So my question is, if I add another PR which handles octal numbers, will it get accepted or warehoused? |
Well, you can of course just use either of fixedPrim :: Int -> (a -> Ptr Word8 -> IO ()) -> FixedPrim a
boudedPrim :: Int -> (a -> Ptr Word8 -> IO (Ptr Word8)) -> BoundedPrim a -- [sic] to craft octal encoders for your own needs, without adding them to this library. And while I'm not in principle opposed to seeing octal encodings merged, perhaps the benefit (few use-cases) does not justify the cost (more code to maintain and larger library code size). As for the |
Can you fix the CI issues? It appears that If this is to be merged at this stage, it would probably have to go into in without new features (e.g. octal). |
As I wrote in #112, I am not keen to add 25 new public functions for a feature with a narrow audience. If this is still wanted, I encourage to come up with a more concise API, which I'll be happy to review in a timely manner. |
Closing as stalled and outdated. Feel free to reopen, if you come up with a solution, adding less new public functions. |
I fixed a bug in
Data.ByteString.Builder.Prim.primMapByteStringBounded
which caused an infinite loop due to incorrect test for buffer overflow. There are two other locations in this module that also make this kind of test, but they already do it right (using<=
).Support for upper-case hexadecimal builders is the main thing here. I went for doing it at C level and retired the Base16 module, which makes it more consistent.
There are also some new builders for writing a specified number of hexadecimal digits (suffix
Width
). These allows for writing arbitrary number of hex digits based on values either in 32 or 64 bit types. I needed them myself, as I have odd formatted hex data that does not follow the built-in Haskell types.The test suite has been updated with more tests for the new hex builders.