[feature request] customize units for file sizes #280

Open
jameslamb opened this issue Nov 14, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@jameslamb
Owner

What change would you like to see?

There are several places in log messages and --inspect output where pydistcheck refers to file sizes.

For example, in --inspect:

----- package inspection summary -----
file size
  * compressed size: 0.8G
  * uncompressed size: 1.7G

It uses a heuristic to decide which unit to display (e.g., whether 50 million bytes should be shown as 0.05G, 50M, 50000K, or 50000000B).

from typing import Tuple


def _recommend_size_str(num_bytes: int) -> Tuple[float, str]:
    # Use a unit only once the size exceeds about 0.1 of it;
    # otherwise fall back to the next smaller unit.
    if num_bytes < int(0.1 * 1024):
        return float(num_bytes), "B"
    if num_bytes <= (0.1 * 1024**2):
        return float(num_bytes) / 1024.0, "K"
    if num_bytes <= (0.1 * 1024**3):
        return float(num_bytes) / (1024**2), "M"
    return float(num_bytes) / (1024**3), "G"

pydistcheck should support overriding that heuristic and saying "print all sizes in exactly this unit".
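One possible shape for that override, as a rough sketch only: `size_in_unit()` and its `unit` parameter are hypothetical names, not existing pydistcheck API. With `unit=None` it falls back to the current heuristic, and otherwise it divides by a fixed 1024-based factor.

```python
from typing import Optional, Tuple

# Bytes per unit, using the same 1024-based units as the existing heuristic.
_UNIT_DIVISORS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def _recommend_size_str(num_bytes: int) -> Tuple[float, str]:
    """Existing heuristic, reproduced from the snippet above."""
    if num_bytes < int(0.1 * 1024):
        return float(num_bytes), "B"
    if num_bytes <= (0.1 * 1024**2):
        return float(num_bytes) / 1024.0, "K"
    if num_bytes <= (0.1 * 1024**3):
        return float(num_bytes) / (1024**2), "M"
    return float(num_bytes) / (1024**3), "G"


def size_in_unit(num_bytes: int, unit: Optional[str] = None) -> Tuple[float, str]:
    """Hypothetical helper: force a unit, or fall back to the heuristic."""
    if unit is None:
        return _recommend_size_str(num_bytes)
    if unit not in _UNIT_DIVISORS:
        raise ValueError(
            f"unsupported unit {unit!r}; expected one of {sorted(_UNIT_DIVISORS)}"
        )
    return num_bytes / _UNIT_DIVISORS[unit], unit
```

With something like this, `size_in_unit(num_bytes, "B")` would always return the exact byte count, while callers that pass no unit would keep today's behavior.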

How would implementing this improve pydistcheck?

pydistcheck rounds when it converts a size in bytes to a human-readable string, so some precision is lost. If it were possible to force pydistcheck to report every size in bytes, no rounding would happen and you could get exact sizes.
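To illustrate the precision loss, here is a small sketch. It assumes sizes are rendered with one decimal place (which matches the look of `0.8G` in the summary above, but is an assumption about the formatting); `render_gb()` is a made-up helper for this example.

```python
GIB = 1024**3


def render_gb(num_bytes: int) -> str:
    # Assumption: one decimal place, like "0.8G" in the --inspect summary.
    return f"{num_bytes / GIB:.1f}G"


size_a = 858_993_459
size_b = 880_000_000

# Both sizes render as the same string even though they differ by ~21 million bytes.
print(render_gb(size_a), render_gb(size_b))
print(size_b - size_a)
```

Printing both sizes in bytes instead would make the difference visible.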

Forcing the same units is also helpful for quick at-a-glance comparisons (e.g. it's easy to tell that 800M is 8x as large as 100M... not so when the larger size is shown as 0.8G).

Might also be useful to simplify things for programmatic consumers of this data... e.g. code processing a bunch of pydistcheck summaries created from a collection of many packages (related: #116).
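For context, this is roughly what a programmatic consumer has to do today to compare sizes reported in mixed units. It's only a sketch: `approx_bytes()` is a hypothetical helper, and it assumes size strings look like the `0.8G` / `1.7G` values in the summary above, with 1024-based units.

```python
import re

_UNIT_MULTIPLIERS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}
_SIZE_PATTERN = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)([BKMG])\s*$")


def approx_bytes(size_str: str) -> int:
    """Convert a string like '0.8G' back to an approximate byte count."""
    match = _SIZE_PATTERN.match(size_str)
    if match is None:
        raise ValueError(f"unrecognized size string: {size_str!r}")
    number, unit = match.groups()
    # The result is only as precise as the rounded number in the string.
    return round(float(number) * _UNIT_MULTIPLIERS[unit])
```

If every size were already printed in one fixed unit, this unit-handling step (and the approximation it introduces) would go away.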

Notes

For inspiration, see how du handles this (docs)

-b, --bytes: equivalent to '--apparent-size --block-size=1'
-k: like --block-size=1K
-m: like --block-size=1M

-B, --block-size=SIZE
scale sizes by SIZE before printing them; e.g., '-BM'
prints sizes in units of 1,048,576 bytes; see SIZE format
below

This should affect all the following:

@jameslamb added the enhancement label on Nov 14, 2024