thousands separator for the --size=bytes option would be very useful #533

peter-joo · 2021-06-28T17:30:19Z

OS: Linux 5.12.9-1-MANJARO x86_64 GNU/Linux
lsd --version: lsd 0.20.1
echo $TERM: xterm-256color
echo $LS_COLORS: rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.zst=01;31:.tzst=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.wim=01;31:.swm=01;31:.dwm=01;31:.esd=01;31:.jpg=01;35:.jpeg=01;35:.mjpg=01;35:.mjpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.webp=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:

Expected behavior:

It is very hard to quickly interpret/recognize the real file/directory sizes when the --size=bytes option is given:

>lsd --size=bytes --sort=size --reverse
.rw-r--r-- p p          1  Mon Jun 28 19:12:04 2021  file_1.dat
.rw-r--r-- p p         12  Mon Jun 28 19:12:04 2021  file_2.dat
.rw-r--r-- p p        123  Mon Jun 28 19:12:04 2021  file_3.dat
.rw-r--r-- p p       1234  Mon Jun 28 19:12:04 2021  file_4.dat
.rw-r--r-- p p      12345  Mon Jun 28 19:12:04 2021  file_5.dat
.rw-r--r-- p p     123456  Mon Jun 28 19:12:04 2021  file_6.dat
.rw-r--r-- p p    1234567  Mon Jun 28 19:12:04 2021  file_7.dat
.rw-r--r-- p p   12345678  Mon Jun 28 19:12:04 2021  file_8.dat
.rw-r--r-- p p  123456789  Mon Jun 28 19:12:04 2021  file_9.dat
.rw-r--r-- p p 1234567890  Mon Jun 28 19:12:06 2021  file_10.dat

However, a similar tool called exa ( https://github.com/ogham/exa ) includes the thousands separator by default:

>exa --bytes --long --sort=size
.rw-r--r--             1 p 28 Jun 19:12 file_1.dat
.rw-r--r--            12 p 28 Jun 19:12 file_2.dat
.rw-r--r--           123 p 28 Jun 19:12 file_3.dat
.rw-r--r--         1,234 p 28 Jun 19:12 file_4.dat
.rw-r--r--        12,345 p 28 Jun 19:12 file_5.dat
.rw-r--r--       123,456 p 28 Jun 19:12 file_6.dat
.rw-r--r--     1,234,567 p 28 Jun 19:12 file_7.dat
.rw-r--r--    12,345,678 p 28 Jun 19:12 file_8.dat
.rw-r--r--   123,456,789 p 28 Jun 19:12 file_9.dat
.rw-r--r-- 1,234,567,890 p 28 Jun 19:12 file_10.dat

Actual behavior

Extra cognitive load without those thousands separators :(
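
To make the request concrete, here is a minimal sketch of the digit grouping that exa's output above appears to use. The function name `group_thousands` is hypothetical; this is neither lsd's nor exa's actual code, only an illustration using the Rust standard library.

```rust
/// Hypothetical helper: insert `sep` between every group of three digits,
/// counting from the right (1234567 becomes "1,234,567" with ',').
fn group_thousands(n: u64, sep: char) -> String {
    let digits = n.to_string();
    let mut out = String::with_capacity(digits.len() + digits.len() / 3);
    for (i, ch) in digits.chars().enumerate() {
        // A separator goes before this digit when the number of digits
        // still to be written (including this one) is a multiple of three.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(sep);
        }
        out.push(ch);
    }
    out
}
```

With this sketch, `group_thousands(1234567890, ',')` gives the same size column text as the last exa row above.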

meain · 2021-06-29T04:31:46Z

This might not be a good idea. It would cause issues for people who use lsd in a script and grep for the size part. I don't think breaking compatibility with GNU ls here would be a good idea.

peter-joo · 2021-06-29T05:34:40Z

Well, I really wanted to describe what to achieve, not how to achieve it.

I also agree: in a previous ticket, someone used awk to parse lsd's output, and the parsing failed because of spaces (or other separators): #254 (comment)

But there is a very easy way out that solves all aspects of the problem:
- never add a thousands separator when the --size=bytes option is used
- only add a thousands separator when a new suboption is used, for example --size=bytes_with_thousands_separator

I hope that clears it up :)
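
The opt-in idea above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `SizeMode` enum, the function names, and the `bytes_with_thousands_separator` value are hypothetical and do not reflect how lsd parses `--size` today. The point is only that plain `bytes` output stays untouched, so existing scripts keep working.

```rust
/// Hypothetical size modes; only the two discussed in this thread are modelled.
#[derive(Debug, PartialEq, Clone, Copy)]
enum SizeMode {
    Bytes,              // existing behaviour: plain digits
    BytesWithSeparator, // proposed opt-in behaviour
}

/// Map a `--size` value to a mode. The second value name is hypothetical.
fn parse_size_mode(value: &str) -> Option<SizeMode> {
    match value {
        "bytes" => Some(SizeMode::Bytes),
        "bytes_with_thousands_separator" => Some(SizeMode::BytesWithSeparator),
        _ => None,
    }
}

/// Group digits in threes with a comma, counting from the right.
fn with_commas(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}

/// Render a size; separators appear only when explicitly requested.
fn render_size(bytes: u64, mode: SizeMode) -> String {
    match mode {
        SizeMode::Bytes => bytes.to_string(),
        SizeMode::BytesWithSeparator => with_commas(bytes),
    }
}
```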

meain · 2021-06-29T12:48:59Z

Just wondering what a good option name would be? 🤔 bytes_with_thousands_separator is a bit too long. Or maybe even a separate option like --num-separators, which someone can set to on, off, or auto, where auto disables separators if we detect a pipe?
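
A rough sketch of how the on/off/auto idea might be resolved, assuming "detect a pipe" means "stdout is not a terminal". The `NumSeparators` enum and the function names are hypothetical; the only real API used is `std::io::IsTerminal`, which needs Rust 1.70 or newer.

```rust
use std::io::IsTerminal;

/// Hypothetical values for a `--num-separators` option.
#[derive(Debug, PartialEq, Clone, Copy)]
enum NumSeparators {
    On,
    Off,
    Auto,
}

/// Parse the option value; unknown values are rejected.
fn parse_num_separators(value: &str) -> Option<NumSeparators> {
    match value {
        "on" => Some(NumSeparators::On),
        "off" => Some(NumSeparators::Off),
        "auto" => Some(NumSeparators::Auto),
        _ => None,
    }
}

/// Pure decision logic, kept separate from the terminal check so it is
/// easy to test: `auto` follows whether stdout is a terminal.
fn separators_enabled(mode: NumSeparators, stdout_is_terminal: bool) -> bool {
    match mode {
        NumSeparators::On => true,
        NumSeparators::Off => false,
        NumSeparators::Auto => stdout_is_terminal,
    }
}

/// Convenience wrapper that queries the real stdout.
fn separators_enabled_for_stdout(mode: NumSeparators) -> bool {
    separators_enabled(mode, std::io::stdout().is_terminal())
}
```

With `auto`, output piped into grep or awk would stay free of separators, while interactive use would get them.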

peter-joo · 2021-06-29T12:53:35Z

It is perfectly up to you and up to the project owners, other contributors, etc. how to do it.

For me, even --size=fancy_bytes would work :)

zwpaper · 2021-06-30T03:35:30Z

I would vote for a separate flag, --num-separators, since we could apply the separator to B, MB, and GB sizes, and even UNIX timestamps might be an option.

meain · 2021-07-01T03:50:45Z

Not sure it would be useful for MB/GB etc., as those roll over to the next unit at around a thousand. As for UNIX timestamps, I don't think a comma in a timestamp looks natural. Nobody really reads a timestamp.

zwpaper · 2021-07-01T06:39:25Z

Oh, my bad, I did not notice that there is no MB or GB option for size.

Also, it feels a little awkward to be the only one reading timestamps 😅.

But since the --num-separators option would only affect the byte size, an option for --size seems reasonable.

merkrafter · 2021-07-26T18:32:37Z

Localization might have to be considered here as well, as some countries use dots for separating thousands. Not sure if that's a real problem though.
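
To illustrate the localization concern, here is a small sketch in which the separator character depends on a locale tag. The lookup table is hypothetical and deliberately tiny (comma for en_US, dot for de_DE); a real implementation would more likely ask the system, for example through the `thousands_sep` field that POSIX `localeconv` exposes, rather than hard-code values.

```rust
/// Hypothetical lookup: which character a locale uses between digit groups.
/// Only two well-known cases are listed; anything else returns None.
fn separator_for_locale(tag: &str) -> Option<char> {
    match tag {
        "en_US" => Some(','),
        "de_DE" => Some('.'),
        _ => None,
    }
}

/// Format a byte count for a locale tag, falling back to plain digits
/// when the locale is not in the (hypothetical) table.
fn format_bytes_for_locale(n: u64, tag: &str) -> String {
    let digits = n.to_string();
    let sep = match separator_for_locale(tag) {
        Some(sep) => sep,
        None => return digits,
    };
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        // Separator before every complete group of three remaining digits.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(sep);
        }
        out.push(ch);
    }
    out
}
```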

arkadiuszbielewicz · 2021-10-15T06:46:56Z

Hi, I was thinking about this issue and I have two questions:

- System-specific localization: the num_format library could provide system-specific formatting. Unfortunately, on Windows it requires Clang. Is that a problem? Could the Windows build be adjusted to deal with that?
- Flags discussion: personally, I'm more in favor of adding an option to the --size flag, named bytes_with_separators. Are there any objections?

meain · 2021-10-15T08:27:47Z

The solution you bring up actually sounds pretty good. Also, the word "thousands" does not make sense anyway: I forgot that in my country we actually separate by hundreds after the first set 😂. bytes-with-separator seems to be a good flag.

That said, I am not a big fan of adding clang as a dependency, especially just for Windows. As far as I know, none of the maintainers use Windows, and adding more brittleness to that platform is probably gonna make things worse.
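
The "hundreds after the first set" convention mentioned above (a group of three digits on the right, then groups of two) can be handled by making the group sizes parameters. This is only a sketch: `group_digits` and its parameters are hypothetical names, not part of lsd or num_format.

```rust
/// Hypothetical helper: group digits from the right, using `first` digits
/// for the rightmost group and `rest` digits for every group after it.
/// (3, 3) gives 1,234,567 and (3, 2) gives 12,34,567.
fn group_digits(n: u64, sep: char, first: usize, rest: usize) -> String {
    assert!(first > 0 && rest > 0, "group sizes must be non-zero");
    let digits: Vec<char> = n.to_string().chars().collect();
    let mut reversed = String::new();
    let mut group_len = first; // size of the group currently being filled
    let mut in_group = 0; // digits already placed in that group
    for &ch in digits.iter().rev() {
        if in_group == group_len {
            // Current group is full: close it and switch to the `rest` size.
            reversed.push(sep);
            group_len = rest;
            in_group = 0;
        }
        reversed.push(ch);
        in_group += 1;
    }
    // The string was built right to left, so flip it back.
    reversed.chars().rev().collect()
}
```

A name like bytes-with-separator stays accurate for both groupings, which is one reason to avoid "thousands" in the option name.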

zwpaper added Hacktoberfest kind/enhancement Enhancement on current feature labels Oct 7, 2021

arkadiuszbielewicz pushed a commit to arkadiuszbielewicz/lsd that referenced this issue Oct 15, 2021

lsd-rs#533 thousands separator for the --size=bytes option would be v…

932d7eb

…ery useful | Added support for thousand separated bytes

arkadiuszbielewicz mentioned this issue Oct 15, 2021

Support for bytes separated by thousand with comma #569

Open

arkadiuszbielewicz pushed a commit to arkadiuszbielewicz/lsd that referenced this issue Nov 4, 2021

lsd-rs#533 thousands separator for the --size=bytes option would be v…

7ec07b6

…ery useful | Use system setting to determine formatting - only on unix

arkadiuszbielewicz pushed a commit to arkadiuszbielewicz/lsd that referenced this issue Nov 4, 2021

lsd-rs#533 thousands separator for the --size=bytes option would be v…

6f1b202

…ery useful | Potential fix for unix musl

meain removed the Hacktoberfest label Feb 9, 2022
