cut: fix -d= (#2424) #2428

Merged
merged 1 commit into from
Jun 19, 2021

Conversation

Contributor

@jhscheer jhscheer commented Jun 18, 2021

fix for #2424

GNU's cut supports -d= to set the delimiter to =.
Clap's parsing is limited in this situation.
Since clap handles -d= as a delimiter explicitly set to "", we can use this as the basis for a simple workaround.

The downside is that uu_cut loses the ability to handle empty delimiters.
I don't know how often this is used, or if it is useful at all, but I consider this an acceptable tradeoff.
GNU's behaviour for an empty delimiter is to echo the input.
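
As a rough illustration of the workaround described above, the mapping could look like the sketch below. It is not the code from this PR: `resolve_delimiter` is a hypothetical helper, and it assumes the argument parser hands over `Some("")` when the user typed -d=.

```rust
/// Hypothetical helper (not the PR's code): turn the raw `-d` value reported
/// by the argument parser into the delimiter that is actually used.
fn resolve_delimiter(raw: Option<&str>) -> String {
    match raw {
        // Assumption: `-d=` arrives as an explicitly empty value, so treat it as "=".
        // This is the tradeoff mentioned above: a genuinely empty delimiter
        // can no longer be expressed.
        Some("") => "=".to_owned(),
        // Any other explicit delimiter is used unchanged.
        Some(d) => d.to_owned(),
        // No -d given: cut's default delimiter is TAB.
        None => "\t".to_owned(),
    }
}
```

With this mapping, `resolve_delimiter(Some(""))` yields "=", while `resolve_delimiter(Some(":"))` keeps ":".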

Other alternatives for a workaround to fix #2424 could be:

  1. parse/regex args before calling App::get_matches()
  2. set Arg::with_name(options::DELIMITER).empty_values(false), use App::get_matches_safe(), handle the error manually, and set the delimiter to =
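
Alternative 1 from the list above could be sketched roughly as follows. `normalize_args` is a hypothetical name, and the sketch assumes that clap accepts = as a value when it is passed as a separate argument after -d. It also naively rewrites every literal -d= argument, including ones that appear after -- or as the value of another option.

```rust
/// Hypothetical pre-processing step (not part of this PR): split a bare `-d=`
/// into `-d` followed by `=` before the arguments are handed to the parser.
fn normalize_args<I>(args: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut out = Vec::new();
    for arg in args {
        if arg == "-d=" {
            // Pass the delimiter as a separate value so it is not read as empty.
            out.push("-d".to_owned());
            out.push("=".to_owned());
        } else {
            // Everything else is forwarded untouched.
            out.push(arg);
        }
    }
    out
}
```

The result would then be passed to the parser in place of the raw argument list.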

@ankurdhama

I think you can just replace "\0".to_owned() with "=".to_owned()

@jhscheer
Contributor Author

> I think you can just replace "\0".to_owned() with "=".to_owned()

Which line are you referring to?
Also I'm confused because this PR doesn't have a line containing the change you propose!?

@ankurdhama

> > I think you can just replace "\0".to_owned() with "=".to_owned()
>
> Which line are you referring to?
> Also I'm confused because this PR doesn't have a line containing the change you propose!?

Sorry about that, line number 544 in original file (line number 553 in the PR file).

@jhscheer
Contributor Author

> Sorry about that, line number 544 in original file (line number 553 in the PR file).

Oh thanks for mentioning this, I didn't notice that.

@jhscheer
Contributor Author

I don't know why

    let delim = if delim.is_empty() {
        "\0".to_owned()
    } else {
        delim.to_owned()
    };

is there. This doesn't seem to be the behaviour of GNU's cut:

$ echo "--libdir=./out/lib" | gcut -f2 -d "" | hexdump -C
00000000  2d 2d 6c 69 62 64 69 72  3d 2e 2f 6f 75 74 2f 6c  |--libdir=./out/l|
00000010  69 62 0a                                          |ib.|
00000013

$ echo "--libdir=./out/lib" | hexdump -C
00000000  2d 2d 6c 69 62 64 69 72  3d 2e 2f 6f 75 74 2f 6c  |--libdir=./out/l|
00000010  69 62 0a                                          |ib.|
00000013

@tertsdiepraam
Member

Nice! The only difference with GNU I can still find is:

$ # GNU
$ echo "a=b" | cut -f2 --delimiter=
a=b
$ # uutils
$ echo "a=b" | cut -f2 --delimiter=
b

But I guess that's not useful anyway.

@sylvestre sylvestre merged commit e2a00b6 into uutils:master Jun 19, 2021
@moxuze

moxuze commented Apr 10, 2022

I noticed that -d '' means '\0' (the NUL character) in GNU's cut, please see:
https://github.com/coreutils/coreutils/blob/cc01b8a8f43bd1e02339322595f7a20e72a8c155/src/cut.c#L497

I tried these commands in my zsh (ArchLinux):

$ echo 'ab\0cd' | cut -f 1,2 -d '' --output-delimiter=Z | hexdump -C
00000000  61 62 5a 63 64 0a                                 |abZcd.|
00000006
$ echo 'ab\0cd' | uu-cut -f 1,2 -d '' --output-delimiter=Z | hexdump -C
00000000  61 62 00 63 64 0a                                 |ab.cd.|
00000006

Using -d '' to set the delimiter to = may not be a good idea.

"\0".to_owned()

And this line can never be executed!
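
The GNU output in the hexdump above is consistent with splitting on the NUL byte and joining the selected fields with the output delimiter. A minimal sketch of that behaviour, assuming a single-byte delimiter and a line whose trailing newline has already been stripped, might look like this. `cut_fields` is a hypothetical function, not uutils or GNU code.

```rust
/// Hypothetical sketch: select 1-based `fields` from one line (without its
/// trailing newline), splitting on `delim` and joining with `out_delim`.
fn cut_fields(line: &[u8], delim: u8, fields: &[usize], out_delim: &[u8]) -> Vec<u8> {
    // A line that does not contain the delimiter is passed through unchanged.
    if !line.contains(&delim) {
        return line.to_vec();
    }
    let parts: Vec<&[u8]> = line.split(|&b| b == delim).collect();
    let mut out = Vec::new();
    let mut first = true;
    for &n in fields {
        // Field numbers start at 1; skip 0 and numbers past the last field.
        if n == 0 {
            continue;
        }
        if let Some(part) = parts.get(n - 1) {
            if !first {
                out.extend_from_slice(out_delim);
            }
            out.extend_from_slice(part);
            first = false;
        }
    }
    out
}
```

Called with the bytes of ab\0cd, delimiter 0, fields 1 and 2, and output delimiter Z, this returns abZcd, matching the GNU result shown above.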

Successfully merging this pull request may close these issues.

cut: -d= works differently than gnu-cut