fix reinterpret(Char, ::UInt32) for "unnatural" values (fix #29181) #29192

Merged: StefanKarpinski merged 1 commit into master from sk/char-reinterpret on Sep 17, 2018

Conversation

StefanKarpinski
Member

No description provided.

src/datatype.c Outdated
@@ -689,8 +689,9 @@ static jl_value_t *boxed_char_cache[128];
JL_DLLEXPORT jl_value_t *jl_box_char(uint32_t x)
{
    jl_ptls_t ptls = jl_get_ptls_states();
    if (0 < (int32_t)x)
        return boxed_char_cache[x >> 24];
    uint32_t u = __builtin_bswap32(x);
Member

We have bswap_32 defined for this for a few platforms.

Member Author

"for a few platforms" sounds unlikely to work everywhere. @yuyichao, @Keno, @vtjnash, do any of you have advice on how to spell this operation correctly and portably?

Member

bswap_32 is the correct version since it takes care of the difference between clang, ICC, MSVC, or not having the builtin available at all.

see

#if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || __GNUC_MINOR__ >= 8))
#define bswap_16(x) __builtin_bswap16(x)
#define bswap_32(x) __builtin_bswap32(x)
#define bswap_64(x) __builtin_bswap64(x)
#elif defined(_MSC_VER)
#define bswap_16(x) _byteswap_ushort(x)
#define bswap_32(x) _byteswap_ulong(x)
#define bswap_64(x) _byteswap_uint64(x)
#elif defined(__INTEL_COMPILER)
#define bswap_16(x) _bswap16(x)
#define bswap_32(x) _bswap(x)
#define bswap_64(x) _bswap64(x)
#else
#define bswap_16(x) (((x) & 0x00ff) << 8 | ((x) & 0xff00) >> 8)
#define bswap_32(x) \
    ((((x) & 0xff000000) >> 24) | (((x) & 0x00ff0000) >> 8) | \
     (((x) & 0x0000ff00) << 8) | (((x) & 0x000000ff) << 24))
STATIC_INLINE uint64_t ByteSwap64(uint64_t x)
{
    uint32_t high = (uint32_t) (x >> 32);
    uint32_t low = (uint32_t) x;
    return ((uint64_t) bswap_32 (high)) |
           (((uint64_t) bswap_32 (low)) << 32);
}
#define bswap_64(x) ByteSwap64(x)
#endif
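
For readers who want to see what the fallback branch of `bswap_32` computes, here is a minimal standalone sketch of the same shift-and-mask logic written as a function. The names `portable_bswap32` and `check_portable_bswap32` are hypothetical and do not appear in the Julia sources; this is an illustration, not the project's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the Julia sources): reverses the four
 * bytes of a 32-bit value using the same masks and shifts as the
 * fallback bswap_32 macro quoted above. */
static inline uint32_t portable_bswap32(uint32_t x)
{
    return ((x & 0xff000000u) >> 24) | ((x & 0x00ff0000u) >> 8) |
           ((x & 0x0000ff00u) << 8)  | ((x & 0x000000ffu) << 24);
}

/* A few sanity checks on the byte reversal. */
static void check_portable_bswap32(void)
{
    /* Bytes 11 22 33 44 come back as 44 33 22 11. */
    assert(portable_bswap32(0x11223344u) == 0x44332211u);
    /* Swapping twice returns the original value. */
    assert(portable_bswap32(portable_bswap32(0xdeadbeefu)) == 0xdeadbeefu);
    /* A value with only its top byte set ends up in the low byte. */
    assert(portable_bswap32(0x41000000u) == 0x00000041u);
}
```

One difference from the macro form: a function evaluates its argument once, whereas the fallback macro expands `x` four times, which matters if the argument has side effects.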

JeffBezanson added the bugfix (This change fixes an existing bug) and strings ("Strings!") labels on Sep 14, 2018
This code was assuming that character values only have bit-patterns
that decoding a string can produce, but of course `reinterpret` can
produce any bit pattern in a `Char` whatsoever. The fix doesn't use
that assumption and only uses the cache for actual ASCII characters.
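
The difference between the two checks can be sketched in isolation. The old condition and index below are copied from the outdated diff above. The new condition (`bswap(x) < 128`) is an assumption inferred from the commit message and the `bswap` line in the diff, not a quote of the merged code. All function names are hypothetical. The sketch also assumes a `Char` stores its UTF-8 bytes left-aligned in a 32-bit word, so `'A'` is `0x41000000`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for bswap_32: reverses the bytes of x. */
static uint32_t swap32(uint32_t x)
{
    return ((x & 0xff000000u) >> 24) | ((x & 0x00ff0000u) >> 8) |
           ((x & 0x0000ff00u) << 8)  | ((x & 0x000000ffu) << 24);
}

/* Old check from the outdated diff: any positive bit pattern hits the
 * cache, indexed by the top byte alone. */
static int old_uses_cache(uint32_t x) { return 0 < (int32_t)x; }
static uint32_t old_cache_index(uint32_t x) { return x >> 24; }

/* ASSUMED shape of the fix: after byte-swapping, only values below 128
 * (a single ASCII byte, all other bytes zero) use the cache. */
static int new_uses_cache(uint32_t x) { return swap32(x) < 128; }
static uint32_t new_cache_index(uint32_t x) { return swap32(x); }

static void check_cache_conditions(void)
{
    uint32_t ascii_a   = 0x41000000u; /* 'A' as a left-aligned UTF-8 byte */
    uint32_t unnatural = 0x41000001u; /* only reachable via reinterpret */

    /* Old check: both values map to cache slot 0x41, so the unnatural
     * pattern would come back as the boxed 'A'. */
    assert(old_uses_cache(ascii_a) && old_cache_index(ascii_a) == 0x41);
    assert(old_uses_cache(unnatural) && old_cache_index(unnatural) == 0x41);

    /* Assumed new check: only the real ASCII character uses the cache. */
    assert(new_uses_cache(ascii_a) && new_cache_index(ascii_a) == 0x41);
    assert(!new_uses_cache(unnatural));
}
```

Under these assumptions, the "unnatural" value falls through to the ordinary boxing path and keeps its full bit pattern.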
@StefanKarpinski
Member Author

We're seeing a lot of httpbin flakiness again. Time to switch, @staticfloat?

@staticfloat
Member

@StefanKarpinski your wish is my command: #29228

StefanKarpinski merged commit 88f74b7 into master on Sep 17, 2018
StefanKarpinski deleted the sk/char-reinterpret branch on September 17, 2018 20:28
KristofferC pushed a commit that referenced this pull request Oct 6, 2018
…29192)

(cherry picked from commit 88f74b7)
KristofferC pushed a commit that referenced this pull request Oct 10, 2018
…29192)

(cherry picked from commit 88f74b7)
KristofferC pushed a commit that referenced this pull request Feb 11, 2019
…29192)

(cherry picked from commit 88f74b7)
KristofferC pushed a commit that referenced this pull request Feb 20, 2020
…29192)

(cherry picked from commit 88f74b7)