STFL Parser error near '' with musl-libc #364
Comments
Why not at runtime, see if iconv_open fails when //TRANSLIT is used and then retry without it? Also, always omit //TRANSLIT when dest charset is UTF-8 since it's meaningless in that case (UTF-8 can represent everything).
The reason musl does not support //TRANSLIT, btw, is that it's contrary to the requirements of the standard. POSIX specifies that an argument to iconv_open containing / is to be treated as a pathname to a charmap file. Implementations are not required to support charmap files (musl doesn't) but in this case the open needs to fail to indicate that they aren't supported. glibc is also wrong to produce an error when //TRANSLIT is omitted. POSIX requires iconv to perform an implementation-defined conversion of characters that are not present in the dest charset. Replacing them with ?'s or transliterating them would be conforming options.
Because the behaviour doesn't depend on information that's only available at runtime. Implementing this as a runtime check would be just bad taste, IMHO. Thanks for the notes on how different libc implementations should and do behave; it's the first time I've dealt with this stuff. I'll edit the issue to say "check if
My view is that runtime checks/fallbacks are not bad/dirty/wrong for behavior that can't be statically determined without execution. The only way to do build-time checks for them would be to hard-code assumptions about specific targets (bad/dirty/wrong) or execute a test at build time (precluding cross compiling). The runtime fallback is simple and inexpensive.
Janko also adds: "if musl or something else decides to implement TRANSLIT, you wouldn't have to recompile newsbeuter", which is a very good argument.
Yes, it is. And if that happens, if you did a test at build-time with a new version of libc and found support for //TRANSLIT, the resulting binary would run (because there's no missing symbol) but fail to work right with older libc. This is another reason why, in general, it's best to do runtime checks for things you can't test just by compile/link tests. Another big class to which that applies is checking for behavior that depends on the kernel; someone who compiled a program on a newer kernel might still end up running it on an older one.
Okay, you persuaded me. I'm going to rewrite the plan to implement a runtime check instead.
No, I'd rather keep the original text as context for the discussion. Let's have the new plan here instead. Implementation might proceed as follows:
Some iconv() implementations don't support transliteration. When iconv with //TRANSLIT is first attempted and fails, try again without it, and remember the result. Fixes akrennmair#364
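The retry-and-remember idea from the commit message above could look roughly like the sketch below. It is only an illustration under stated assumptions: it calls iconv_open directly, whereas in Newsbeuter the conversion happens inside STFL, and the names open_with_fallback and translit_supported are hypothetical, not taken from either code base.

```c
#include <assert.h>
#include <iconv.h>
#include <stdio.h>

/* Hypothetical cache of the probe result (not from Newsbeuter or STFL):
 * -1 = not probed yet, 0 = //TRANSLIT was rejected, 1 = //TRANSLIT works. */
static int translit_supported = -1;

/* Open a conversion descriptor from `from` to `to`, preferring //TRANSLIT.
 * Returns (iconv_t)-1 on failure, exactly like iconv_open does. */
iconv_t open_with_fallback(const char *to, const char *from)
{
    char target[128];
    iconv_t cd;

    assert(to != NULL && from != NULL);

    /* Skip the //TRANSLIT attempt once we know the libc rejects it. */
    if (translit_supported != 0) {
        int n = snprintf(target, sizeof target, "%s//TRANSLIT", to);
        if (n > 0 && (size_t)n < sizeof target) {
            cd = iconv_open(target, from);
            if (cd != (iconv_t)-1) {
                translit_supported = 1;
                return cd;
            }
        }
    }

    /* Retry with the plain charset name. Only record "unsupported" when the
     * plain name works, so an unknown charset is not mistaken for a missing
     * //TRANSLIT feature. */
    cd = iconv_open(to, from);
    if (cd != (iconv_t)-1 && translit_supported == -1)
        translit_supported = 0;
    return cd;
}
```

A caller would use open_with_fallback wherever it currently passes a "charset//TRANSLIT" string; after the first call, the cached value avoids repeating the failed attempt.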
Janko on IRC reported this.
When converting forms to STFL's internal widechar representation, we specify //TRANSLIT to ensure that as many users as possible see something meaningful in place of characters that their locale encoding can't represent. However, musl doesn't implement transliteration, causing Newsbeuter to crash with "STFL Parser error near ''". (The call to stfl_ipool_towc returns an empty string.)
Three solutions were proposed:
1. Drop //TRANSLIT altogether. The world moved on, new systems use UTF-8 (but we didn't find any statistics on that).
2. Use #ifdefs to figure out what libc we're dealing with, and decide whether to include //TRANSLIT based on that. There's a slight problem there: musl doesn't define any macro. We'll have to do it autoconf-style: compile a testing binary, run it, look at the exit code and figure out things from there.
3. Check at runtime whether //TRANSLIT results in stfl_ipool_towc returning an empty string. If so, don't use //TRANSLIT anymore.
Since the problem depends on libc, i.e. something that doesn't change at runtime, the third option is out.
Out of the first two, the first is the best maintainability-wise, but will break things for people who run Newsbeuter on older systems with non-Unicode locales. Thus it looks like we'll have to go with option 2.
Implementation might proceed as follows:
1. Write a test program that checks whether //TRANSLIT is necessary (with glibc it's used to avoid errors on unmappable chars). The result should be returned via exit code.
2. Modify config.sh to compile and run the test program. Depending on the result, some flag shall be put into config.mk (see how DEFINES is used in check_ssl_implementation()).
3. Change the code where //TRANSLIT is used so that //TRANSLIT is excluded if there's no support from libraries.
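For context, the core of the build-time test program from the plan above might look something like this sketch. It is one possible interpretation, not the project's actual test: it checks whether //TRANSLIT is usable rather than whether it is strictly necessary, and the function name probe_translit is hypothetical. A main() that simply returns the function's result would turn it into the exit-code probe the plan describes.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical probe (name and shape are assumptions, not Newsbeuter code).
 * Returns 0 if the C library accepts //TRANSLIT and converts a character
 * that ASCII cannot represent without reporting an error, 1 otherwise. */
int probe_translit(void)
{
    iconv_t cd = iconv_open("ASCII//TRANSLIT", "UTF-8");
    if (cd == (iconv_t)-1)
        return 1;                    /* suffix (or charset) not accepted */

    char input[] = "\xc3\xa9";       /* U+00E9 encoded as UTF-8 */
    char output[16];
    char *in = input;
    char *out = output;
    size_t inleft = sizeof input - 1; /* exclude the terminating NUL */
    size_t outleft = sizeof output;

    /* iconv returns (size_t)-1 on error; otherwise the whole input
     * should have been consumed. */
    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    iconv_close(cd);

    return (rc == (size_t)-1 || inleft != 0) ? 1 : 0;
}
```

As the discussion points out, a probe like this has to be executed on the build machine, so it does not work when cross compiling, and its answer can become stale if the binary later runs against a different libc.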