Handling ISO-8859-1 characters #157
Comments
You can do most of this using the Buffer builtin: http://nodejs.org/api/buffer.html#buffer_new_buffer_str_encoding You'll need to determine somehow whether the incoming text isn't UTF-8. That'll have to be up to you.
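As a rough illustration of the Buffer suggestion, here is a minimal sketch that decodes raw bytes with an explicit encoding. The byte values are an assumption (what an ISO-8859-1 client would plausibly send for "Zöe"), and it uses `Buffer.from`, since the `new Buffer(str, encoding)` constructor linked above has since been deprecated. Whether node-irc hands the raw bytes to a listener is not established in this thread.

```javascript
// Assumed raw bytes for "Zöe" in ISO-8859-1: 'Z' = 0x5A, 'ö' = 0xF6, 'e' = 0x65.
const received = Buffer.from([0x5a, 0xf6, 0x65]);

// 'latin1' maps every byte directly to the code point with the same value.
const text = received.toString('latin1');
console.log(text === 'Zöe'); // true

// Once it is a proper JavaScript string, it can be re-encoded as UTF-8.
const asUtf8Bytes = Buffer.from(text, 'utf8');
console.log(asUtf8Bytes); // <Buffer 5a c3 b6 65>
```

This only helps if the bytes are still available; it cannot repair a string that was already decoded as UTF-8.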
Well, the earliest moment that I have access to the string (inside an addListener callback), it's already in the "garbled" state.
@ossiangrr Is there an actual difference when looking at the buffer's state directly? Check this. use
Yeah, those still come out as the "same character" using console.dir, so it would have to be something inside node-irc.
I've found references in node-irc's forums to "encoding" patches, but I don't understand Node and/or GitHub well enough to figure out whether I can use this patch: #113 I have also found this: https://github.com/bnoordhuis/node-iconv Maybe the core node-irc team could work with these links better than I can?
Did anyone figure out a solution, in node-irc or outside? I have both ISO-8859-1 users and UTF-8 users.
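For a channel with both kinds of users, one common heuristic (not something node-irc is confirmed to do) is to try a strict UTF-8 decode first and fall back to ISO-8859-1 when the bytes are not valid UTF-8. The sketch below assumes the raw bytes of each line are available; `decodeIrcBytes` is a hypothetical helper name.

```javascript
// Strict UTF-8 decoder: throws a TypeError on invalid input instead of
// silently substituting U+FFFD.
const strictUtf8 = new TextDecoder('utf-8', { fatal: true });

// Hypothetical helper, not part of node-irc.
function decodeIrcBytes(bytes) {
  try {
    return strictUtf8.decode(bytes);
  } catch (err) {
    // Not valid UTF-8: assume ISO-8859-1, where each byte is one code point.
    return Buffer.from(bytes).toString('latin1');
  }
}

// The same query as sent by a UTF-8 client and by an ISO-8859-1 client.
const utf8Line = Buffer.from('whois Monçada', 'utf8');
const latin1Line = Buffer.from('whois Monçada', 'latin1');

console.log(decodeIrcBytes(utf8Line));   // whois Monçada
console.log(decodeIrcBytes(latin1Line)); // whois Monçada
```

The guess can be wrong: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so this is a best-effort approach, not a guarantee.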
I'm not sure if this is a problem with IRC in general, or with JavaScript, or Node.
I have been writing a simple bot that works as a search engine for a card game (VTES). Some cards have names with foreign characters, and I'd like them to be searchable by literal character.
I am listening with addListener("message#", callback) and addListener("pm", callback).
If someone sends a UTF-8 character -- say, ö or ç -- it works great!
But if their encoding is ISO-8859-1, my bot sees all of the "special" characters as the same character sequence: �
Not even a different sequence of bytes that I could brute-force translate.
How can I get my bot to see these as different characters?
Or is this just a limitation of JavaScript/Node that I'll have to suck up and deal with?
(I do have an option for users to search by "ascii-ized" versions of the name, so there's a workaround, but it would be nice if I could handle more literally typed or copy-pasted strings.)
Here is a real-world excerpt.
In the first of each of these cases, the "foreign" character is UTF-8. In the second case, it is ISO-8859-1.
-> gramle whois Zöe
Gramle Zöe. Clan: Malkavian Group: 2 Capacity: 3 cel obf AUS
Gramle Camarilla: Zöe does not get the usual +1 stealth when hunting.
-> gramle whois Zöe
Gramle No results found for 'whois Z�e'.
-> gramle whois Monçada
Gramle Ambrosio Luis Monçada, Plenipotentiary. Clan: Lasombra Group: 2 Capacity: 10 aus for DOM OBT POT PRE
Gramle Sabbat cardinal: Monçada cannot block. Other Methuselahs' actions targeting Monçada cost an additional pool. If Monçada is ready during your discard phase, he can untap another ready Lasombra.
-> gramle whois Monçada
Gramle No results found for 'whois Mon�ada'.
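The excerpt is consistent with the ISO-8859-1 bytes having been decoded as UTF-8 before the listener runs. A small sketch of why that is lossy, assuming the byte values below are what the client sent:

```javascript
// Assumed ISO-8859-1 bytes for the two failing names in the excerpt.
const zoe = Buffer.from([0x5a, 0xf6, 0x65]); // "Zöe"
const moncada = Buffer.from([0x4d, 0x6f, 0x6e, 0xe7, 0x61, 0x64, 0x61]); // "Monçada"

// Decoding as UTF-8 replaces each invalid byte with U+FFFD.
const garbledZoe = zoe.toString('utf8');
const garbledMoncada = moncada.toString('utf8');

// Both special characters are now the same code point.
console.log(garbledZoe.codePointAt(1).toString(16));     // fffd
console.log(garbledMoncada.codePointAt(3).toString(16)); // fffd

// Re-encoding yields EF BF BD, not the original 0xF6 byte.
const reencoded = Buffer.from(garbledZoe, 'utf8');
console.log(reencoded); // <Buffer 5a ef bf bd 65>
```

Because 0xF6 and 0xE7 both collapse to U+FFFD, the original character cannot be recovered from the string; any fix has to happen while the data is still bytes.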