SI standard should be default #137

Closed
getsnoopy opened this issue Jul 19, 2021 · 11 comments · Fixed by #138

Comments

@getsnoopy
Contributor

I noticed that the project defaults to base-2 calculations using the JEDEC "standard", which is arguably the worst of both worlds.

Firstly and most importantly, 96% of the world (and science within the US) uses the metric system (SI) regularly, where the prefixes kilo-, mega-, etc. all have universal and precise meanings: thousand, million, etc. As such, most of the world understands what these prefixes mean and would expect them to mean the same thing when it comes to bits and bytes, and rightly so: because that is indeed what they mean. JEDEC, however, calls for the unit names and symbols to be represented as if they are SI-based, but actually defines them differently (e.g., kilobyte and KB mean 1024 bytes, etc.). This is deceptive, and has been the basis for much confusion and even lawsuits.

To fix this once and for all, the IEC has come up with different names and symbols for the base-2 units. Most operating systems, wisely, switched everything over to SI units. In fact, about 70% of the devices of the world use SI prefixes for everything but RAM by default; Windows is the only unfortunate holdout. Since most people in the world are familiar with the SI and are not in the esoteric field of memory in computer science, they expect SI units. So it's a bit ironic to refer to the package as a way "to get a human readable file size" when it only caters to 30% of the humans of the world.
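As a rough illustration of the three conventions discussed above, here is a minimal sketch that formats the same byte count under each one. `formatSize` and the `CONVENTIONS` table are hypothetical names made up for this example; this is not the package's implementation or API.

```javascript
// Hypothetical formatter for illustration only; not the filesize.js implementation.
// SI:    powers of 1000 with SI symbols (kB, MB, ...)
// IEC:   powers of 1024 with binary symbols (KiB, MiB, ...)
// JEDEC: powers of 1024, but with SI-looking symbols (KB, MB, ...)
const CONVENTIONS = {
  si: { base: 1000, units: ["B", "kB", "MB", "GB", "TB"] },
  iec: { base: 1024, units: ["B", "KiB", "MiB", "GiB", "TiB"] },
  jedec: { base: 1024, units: ["B", "KB", "MB", "GB", "TB"] },
};

function formatSize(bytes, convention) {
  const { base, units } = CONVENTIONS[convention];
  let value = bytes;
  let index = 0;
  // Divide by the base until the value fits the current unit.
  while (value >= base && index < units.length - 1) {
    value /= base;
    index += 1;
  }
  // Round to two decimals and drop trailing zeros.
  return `${Number(value.toFixed(2))} ${units[index]}`;
}

const size = 1500000;
console.log(formatSize(size, "si"));    // 1.5 MB
console.log(formatSize(size, "iec"));   // 1.43 MiB
console.log(formatSize(size, "jedec")); // 1.43 MB
```

The SI and JEDEC outputs both end in "MB" but differ in value, which is the ambiguity at issue; the IEC output carries the same number as JEDEC with an unambiguous symbol.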

To add to the confusion, regardless of JEDEC or SI, most contexts where bits are used relate to networking, where the prefixes almost always carry the SI (base 10) definition anyway! The package currently calculates bits using base 2 by default as well, which can be considered a bug, since that is inappropriate in most use cases.
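To make the networking point concrete, here is a small sketch of how the choice of prefix table changes a transfer-time estimate. `transferSeconds`, `SI_PREFIX`, and `BINARY_PREFIX` are hypothetical names for this example, and it assumes the link rate is quoted in SI-prefixed bits per second, as is usual for network links.

```javascript
// Hypothetical helper for illustration; assumes link rates use SI prefixes.
const SI_PREFIX = { k: 1e3, M: 1e6, G: 1e9 };
const BINARY_PREFIX = { k: 2 ** 10, M: 2 ** 20, G: 2 ** 30 };

// Seconds needed to move `fileBytes` over a link of `rate` <prefix>bit/s.
function transferSeconds(fileBytes, rate, prefix, table = SI_PREFIX) {
  const bitsPerSecond = rate * table[prefix];
  return (fileBytes * 8) / bitsPerSecond;
}

const fileBytes = 125000000; // 125 MB in SI terms

// 100 Mbit/s read as 100 000 000 bit/s (SI).
console.log(transferSeconds(fileBytes, 100, "M")); // 10

// The same rate misread as 100 * 2^20 bit/s (base 2).
console.log(transferSeconds(fileBytes, 100, "M", BINARY_PREFIX).toFixed(2)); // 9.54
```

Reading the rate as base 2 overstates the link speed by about 4.86% and so understates the transfer time.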

Please change the default to base 10 and remove JEDEC completely; if people want to use base-2, they have the option to do so with the IEC standard, but it at least wouldn't lie to them about what they're seeing. I understand that this would be backwards-incompatible, so bumping to another major version is to be expected, but I think it's important and worth it.

@avoidwork
Owner

2.0.0 was SI in 2013.

@avoidwork
Owner

My primary use case is memory, so it works just great for me.

@getsnoopy
Contributor Author

Sure, but why not default to IEC at least? JEDEC is just misleading.

@avoidwork
Owner

avoidwork commented Jul 21, 2021

JEDEC is not misleading; memory and files on disk are still base 2. Marketing is base 10, to make it easier for humans to communicate general statements about "things".

IEC came later in the code than the JEDEC units; that's the only reason.

@getsnoopy
Contributor Author

Files on disk are, but most people do not read or report file sizes on disk; they report actual file sizes, which are base 10.

Regardless, my point is not that base 2 is not common, but that JEDEC uses base 10 names and symbols for base 2, which is misleading. "MB" means megabyte to anyone reading it, but defining it as 1024² bytes doesn't serve anyone. "MiB", on the other hand, makes it clear what is meant. So if you prefer base 2 units as the default, why not prefer ones that aren't misleading?

@avoidwork
Owner

JEDEC units are not base10 units, see https://en.wikipedia.org/wiki/JEDEC_memory_standards#Units_of_information

Base 10 (decimal) is expressed with a lowercase 'k' for kilo; anything that's base 2 uses an uppercase 'K'. See https://en.wikipedia.org/wiki/Kilobit

This is why I have no interest in changing the default, virtually no one knows what's what.

@getsnoopy
Contributor Author

> JEDEC units are not base10 units

That's exactly my point. JEDEC uses base 10 (i.e. SI) unit names and symbols to refer to base 2 quantities, which is misleading. The one exception is the symbol for kilo-, for which JEDEC uses an uppercase K. But very few people know that the SI symbol for kilo- is lowercase only, so very few would read an uppercase K as signalling a binary interpretation of the prefix. You can see this when people write "Kg" or "KG" for kilogram, for example. For all other prefixes, JEDEC uses the same symbols as SI.

This doesn't change the fact that JEDEC keeps the unit names the same: KB for "kilobyte" (actually 1024 bytes, which is not a kilobyte), MB for "megabyte" (actually 1 048 576 bytes, which is not a megabyte), etc. Hence, my point about JEDEC simply being misleading.
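The size of the gap between the two readings of each prefix is simple arithmetic, sketched below. `binaryExcessPercent` is a hypothetical helper written for this example only.

```javascript
// Hypothetical helper for illustration: how much larger the base-2 reading
// of a prefix is than its SI (base 10) meaning, as a percentage.
function binaryExcessPercent(power) {
  return (1024 ** power / 1000 ** power - 1) * 100;
}

const prefixes = ["kilo", "mega", "giga", "tera"];
prefixes.forEach((name, i) => {
  console.log(`${name}: ${binaryExcessPercent(i + 1).toFixed(2)}%`);
});
// kilo: 2.40%
// mega: 4.86%
// giga: 7.37%
// tera: 9.95%
```

The discrepancy grows with each prefix, so the mismatch matters more as sizes get larger.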

> virtually no one knows what's what

This is why I linked to the stats showing that 70% of the world's devices run operating systems that present proper SI units to their users. That is on top of the fact that almost all non-technical people in the world consistently expect the SI (read: proper) interpretation of the unit names.

avoidwork reopened this Jul 24, 2021

@avoidwork
Owner

avoidwork commented Jul 24, 2021

Now I understand where you're coming from; I have the same response when I hear "decimate" used incorrectly due to MCU.

I reopened the ticket; if you want, make a PR to change it to IEC as default, or SI. I don't have a strong opinion, I'm interested in the least issue drama from such a change. I'd wager IEC is the change that people will accept.

@getsnoopy
Contributor Author

Sounds good. I will create one that will change the default to SI, since that's the most common (Node.js is not used for many RAM applications AFAIK). But that PR would depend on my other bit PR being merged, so could you please reopen that?

@avoidwork
Owner

Re-opened; on sabbatical atm so I'll be slow to respond.

getsnoopy added a commit to getsnoopy/filesize.js that referenced this issue Aug 8, 2021
This commit makes the SI standard (base 10) the generic default, and the
IEC standard the default for base-2 formatting in order to make it as
intuitive as possible, to reduce any ambiguity, and to mirror the
industry standard.

Fixes avoidwork#137
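
One way to read the defaulting rule in the commit message is sketched below. This is an assumption about the behaviour described, not the actual code from the commit: `resolveConvention` and the shape of its options object are hypothetical, and the handling of an explicit standard without a base is a guess for the sake of a complete example.

```javascript
// Hypothetical sketch of the defaulting rule described in the commit message:
// SI (base 10) is the generic default, IEC is the default when base 2 is requested.
function resolveConvention({ base, standard } = {}) {
  let resolvedStandard = standard;
  let resolvedBase = base;

  if (resolvedStandard === undefined) {
    // No standard given: base 2 implies IEC, anything else implies SI.
    resolvedStandard = resolvedBase === 2 ? "iec" : "si";
  }
  if (resolvedBase === undefined) {
    // Assumption: an explicit non-SI standard without a base implies base 2.
    resolvedBase = resolvedStandard === "si" ? 10 : 2;
  }
  return { base: resolvedBase, standard: resolvedStandard };
}

console.log(resolveConvention());                      // { base: 10, standard: 'si' }
console.log(resolveConvention({ base: 2 }));           // { base: 2, standard: 'iec' }
console.log(resolveConvention({ standard: "jedec" })); // { base: 2, standard: 'jedec' }
```

Under this reading, callers who pass nothing get SI output, and callers who ask for base 2 get unambiguous IEC symbols unless they opt into another standard explicitly.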
@avoidwork
Owner

Released as 8.0.0
