Implement missing encodings #18

Open
8 of 15 tasks
lifthrasiir opened this issue Nov 1, 2013 · 3 comments

@lifthrasiir
Owner

This is a master list for important missing encodings. What is considered "important" is a delicate question, but for now I have the following list:


WHATWG multibyte encodings

  • gbk
  • gb18030
  • hz-gb-2312
  • big5
  • iso-2022-jp
  • utf-16be
  • utf-16le
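
For reference, the two UTF-16 variants differ only in byte order, so a rough sketch needs nothing beyond the standard library. The function names below are hypothetical and are not this crate's API; BOM handling and error recovery are deliberately left out.

```rust
/// Encode a string as UTF-16 in the requested byte order, without a BOM.
/// (Hypothetical helper for illustration, not part of any crate.)
fn encode_utf16_bytes(s: &str, big_endian: bool) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len() * 2);
    for unit in s.encode_utf16() {
        let bytes = if big_endian {
            unit.to_be_bytes()
        } else {
            unit.to_le_bytes()
        };
        out.extend_from_slice(&bytes);
    }
    out
}

/// Decode UTF-16 bytes in the requested byte order.
/// Returns None on an odd byte count or an unpaired surrogate.
fn decode_utf16_bytes(bytes: &[u8], big_endian: bool) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units = bytes.chunks_exact(2).map(|pair| {
        let pair = [pair[0], pair[1]];
        if big_endian {
            u16::from_be_bytes(pair)
        } else {
            u16::from_le_bytes(pair)
        }
    });
    std::char::decode_utf16(units)
        .collect::<Result<String, _>>()
        .ok()
}
```

A real decoder would probably replace invalid sequences with U+FFFD or report their position instead of returning `None` for the whole input.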

Non-WHATWG multibyte encodings of special interest

  • JIS X 0213 encodings: euc-jis-2004, iso-2022-jp-2004, shift-jis-2004
  • Other encodings based on Shift_JIS (WHATWG's shift_jis is actually windows-31j)
  • cesu-8, required for compatibility
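
To make the cesu-8 item concrete: CESU-8 matches UTF-8 for code points up to U+FFFF, but encodes each supplementary character as its two UTF-16 surrogates, each written as a three-byte sequence. A minimal encoder sketch follows; `encode_cesu8` is a hypothetical name, not an existing function in this crate.

```rust
/// Encode a string as CESU-8 (hypothetical helper for illustration).
/// BMP characters are emitted as ordinary UTF-8; supplementary characters
/// are split into a surrogate pair, and each surrogate is written in the
/// three-byte UTF-8 pattern.
fn encode_cesu8(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    let mut utf8 = [0u8; 4];
    let mut utf16 = [0u16; 2];
    for c in s.chars() {
        if (c as u32) < 0x10000 {
            out.extend_from_slice(c.encode_utf8(&mut utf8).as_bytes());
        } else {
            for &unit in c.encode_utf16(&mut utf16).iter() {
                out.push(0xE0 | ((unit >> 12) as u8));
                out.push(0x80 | (((unit >> 6) & 0x3F) as u8));
                out.push(0x80 | ((unit & 0x3F) as u8));
            }
        }
    }
    out
}
```

For example, U+1F600 takes six bytes (`ED A0 BD ED B8 80`) here, versus four in UTF-8.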

Required for completeness

  • iso-8859-1 (compare with windows-1252)
  • euc-kr (compare with windows-949)
  • euc-jp with and without JIS X 0212 compatibility (WHATWG's euc-jp is asymmetric: it encodes without 0212 and decodes with 0212)
  • utf-32 and friends
  • utf-7?
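
As an illustration of the iso-8859-1 versus windows-1252 comparison: the two agree everywhere except bytes 0x80 to 0x9F, where iso-8859-1 has C1 control codes and windows-1252 has printable characters. The sketch below uses hypothetical function names and follows the WHATWG convention of mapping the five bytes that windows-1252 leaves unassigned to the matching C1 controls.

```rust
/// windows-1252 code points for bytes 0x80..=0x9F. The five unassigned
/// bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to the C1 controls, as the
/// WHATWG index does.
const WINDOWS_1252_HIGH: [u16; 32] = [
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178,
];

/// Strict iso-8859-1: every byte is the code point with the same value.
fn decode_iso_8859_1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// windows-1252: same as iso-8859-1 outside 0x80..=0x9F.
fn decode_windows_1252(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| match b {
            0x80..=0x9F => {
                let cp = WINDOWS_1252_HIGH[(b - 0x80) as usize] as u32;
                std::char::from_u32(cp).unwrap_or('\u{FFFD}')
            }
            _ => b as char,
        })
        .collect()
}
```

Byte 0x80 therefore decodes to U+0080 in the first function and to U+20AC (the euro sign) in the second.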

@lifthrasiir
Owner Author

As of b5bdc62 all WHATWG encodings have been implemented. Not all of them have been fully verified, though.

@zeld

zeld commented Sep 7, 2016

Do you have any interest in supporting EBCDIC encodings, maybe behind a Cargo feature since they are not the most commonly used? I know that Java provides built-in support for a limited set, while iconv and ICU offer more extensive support.
If so, I could help, for example by providing tables in the same format as https://encoding.spec.whatwg.org/index-iso-8859-15.txt
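
For context, the WHATWG index files are plain text: `#` starts a comment line, and each data line holds a decimal pointer, a `0x`-prefixed code point, and a description. Below is a minimal sketch of reading such a table. The `parse_index` name and the (pointer, code point) output shape are assumptions for illustration, not this crate's actual generator.

```rust
/// Parse a WHATWG-style index file into (pointer, code point) pairs.
/// Comment lines, blank lines, and malformed lines are skipped.
/// (Hypothetical helper for illustration only.)
fn parse_index(text: &str) -> Vec<(usize, u32)> {
    let mut entries = Vec::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let mut fields = line.split_whitespace();
        // First field: decimal pointer into the index.
        let pointer = fields.next().and_then(|f| f.parse::<usize>().ok());
        // Second field: code point written as 0xHHHH.
        let code_point = fields
            .next()
            .and_then(|f| f.strip_prefix("0x"))
            .and_then(|hex| u32::from_str_radix(hex, 16).ok());
        if let (Some(p), Some(cp)) = (pointer, code_point) {
            entries.push((p, cp));
        }
    }
    entries
}
```

One thing to keep in mind: the WHATWG single-byte indexes cover only bytes 0x80 to 0xFF (pointer = byte - 0x80) because the lower half is ASCII. EBCDIC is not ASCII-compatible, so a table for it would presumably need all 256 entries.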

@ssokolow

ssokolow commented Feb 6, 2017

Would you be willing to consider supporting IBM code page 437? (The original DOS/IBM PC codepage)

So far, the only Rust support I've been able to find is a read-only crate named cp437 which doesn't inspire confidence. (The heading correctly says "cp437", but then it's typo'd as "cp537" in the very first non-heading line of the README and there's no unit test badge.)

I ask because I have a Python script for generating batch file menus for DOSBox and my retro PC (which means I need to encode box-drawing characters to cp437) and it'd be nice if I could port it to Rust to get more compile-time correctness enforcement and easier distribution to others.
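
To show the kind of thing I mean, here is a rough encode-only sketch covering ASCII plus a handful of box-drawing characters. The function names are made up, and the table is only a small subset of cp437, so treat it as an illustration and not a complete codec.

```rust
/// Map one character to its cp437 byte, for ASCII and a small subset of
/// box-drawing characters. (Hypothetical helper; the full code page has
/// 256 entries.)
fn cp437_byte(c: char) -> Option<u8> {
    match c {
        // ASCII passes through unchanged (control codes included, so
        // CR/LF in a batch file survive).
        '\u{00}'..='\u{7F}' => Some(c as u8),
        '\u{2500}' => Some(0xC4), // light horizontal
        '\u{2502}' => Some(0xB3), // light vertical
        '\u{250C}' => Some(0xDA), // light down and right
        '\u{2510}' => Some(0xBF), // light down and left
        '\u{2514}' => Some(0xC0), // light up and right
        '\u{2518}' => Some(0xD9), // light up and left
        '\u{2550}' => Some(0xCD), // double horizontal
        '\u{2551}' => Some(0xBA), // double vertical
        '\u{2554}' => Some(0xC9), // double down and right
        '\u{2557}' => Some(0xBB), // double down and left
        '\u{255A}' => Some(0xC8), // double up and right
        '\u{255D}' => Some(0xBC), // double up and left
        _ => None,
    }
}

/// Encode a string to cp437 bytes, or return the first character that
/// has no mapping in the subset above.
fn encode_cp437(s: &str) -> Result<Vec<u8>, char> {
    s.chars().map(|c| cp437_byte(c).ok_or(c)).collect()
}
```

With this, the top edge of a three-cell light box encodes to the bytes `DA C4 BF`, and anything outside the subset comes back as an `Err` naming the offending character.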
