-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
split: better handle numeric and hex suffixes, short and long, with and without values #5198
Conversation
…nd without values Fixes uutils#5171
GNU testsuite comparison:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I think this can be cleaned up a bit more by offloading the overriding of the arguments to clap.
src/uu/split/src/split.rs
Outdated
.default_missing_value("0") | ||
.num_args(0..=1) | ||
.overrides_with(OPT_NUMERIC_SUFFIXES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each argument should also override with the corresponding option right? Maybe we could also add some tests for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, tests are actually covering that, but will refactor with .overrides_with_all() instead
src/uu/split/src/split.rs
Outdated
Ok("0".to_string()) // force value to "0" | ||
}) | ||
.overrides_with(OPT_NUMERIC_SUFFIXES_SHORT) | ||
.help("use numeric suffixes instead of alphabetic"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's nice to say something like "like --numeric-suffixes but does not accept an argument" to explain why there are two arguments that do the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understood the suggestion, could you please exand on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suggestion is to put a bit of text in the help
string saying that it's the same as --numeric-suffixes
. If I wouldn't know about these arguments and I would see two arguments with the same description in --help
, I'd be a bit confused. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, that makes sense, thanks. Will update it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Followed exact same order and wording as in GNU split help:
-d use numeric suffixes starting at 0, not alphabetic
--numeric-suffixes[=FROM]
same as -d, but allow setting the start value
-x use hex suffixes starting at 0, not alphabetic
--hex-suffixes[=FROM]
same as -x, but allow setting the start value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect!
src/uu/split/src/split.rs
Outdated
.value_parser(|_: &str| -> Result<String,clap::Error> { | ||
Ok("0".to_string()) // force value to "0" | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels convoluted. We have have to combine the arguments anyway, so it would suffice for this to just be a flag without any value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well ... it is a flag as far as behaviour goes (and as far as clap works), but it does produce a value. If -d or -x is used , the value must be specifically set to 0 for numeric or hex suffix correspondingly. It can be set here with value_parser()
or harcoded later on when suffixes are processed. Would it be better to have all defaults in the same place as arg definitions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you're coming from. It's an interesting idea to try to make all the flags sort of the same in the code. However, I think I still prefer hardcoding them in the processing of the flags. The thing that I find confusing is that if it is a flag, we should use the functions that are appropriate for flags if that makes sense. value_parser
is usually used for parsing user generated values, which we don't have here, so hardcoding the 0
gives us the ability to skip the parsing.
So, for instance, I'd write the match expression to combine the flags something more like this:
match (
matches.get_one::<String>(OPT_NUMERIC_SUFFIXES),
matches.get_one::<String>(OPT_HEX_SUFFIXES),
matches.get_flag(OPT_NUMERIC_SUFFIXES_SHORT),
matches.get_flag(OPT_HEX_SUFFIXES_SHORT),
) {
(Some(v), _, _, _) => parse_num_suffix(matches, v),
(_, Some(v), _, _) => parse_hex_suffix(matches, v),
(_, _, true, _) => Ok((SuffixType::Decimal, 0)),
(_, _, _, true) => Ok((SuffixType::Hexadecimal, 0)),
_ => Ok((SuffixType::Alphabetic, 0)), // no numeric/hex suffix, using default alphabetic
}
(I forgot the exact semantics of those clap
functions. Maybe the value source check is necessary as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored to remove arg .value_parser()
and hardcoded default 0
in the fn suffix_type_from()
src/uu/split/src/split.rs
Outdated
// Compare suffixes by the order specified in command line | ||
// last one (highest index) is effective, all others are ignored |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's much easier to make all the arguments override each other :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, refactoring with overrides_with_all()
in args definitions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! It's looking pretty nice already. I just have a few more small suggestions/questions.
src/uu/split/src/split.rs
Outdated
.default_missing_value("0") | ||
.num_args(0..=1) | ||
.help("use numeric suffixes instead of alphabetic"), | ||
.overrides_with_all([OPT_NUMERIC_SUFFIXES,OPT_NUMERIC_SUFFIXES_SHORT,OPT_HEX_SUFFIXES,OPT_HEX_SUFFIXES_SHORT]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why rustfmt doesn't pick this up? There's probably some string in this builder that makes rustfmt give up on the whole expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep if you break up this line:
coreutils/src/uu/split/src/split.rs
Line 115 in 4698e8c
"write to shell COMMAND; file name is $FILE (Currently not implemented for Windows)", |
into
.help(
"write to shell COMMAND; file name is $FILE \
(Currently not implemented for Windows)",
),
rustfmt will kick in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting ... unfortunately for this specific line rustfmt does not do anything regardless how you break it up, so I will format it by hand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is indeed what I meant. If you break it up by hand rustfmt will start working again on the other lines.
src/uu/split/src/split.rs
Outdated
matches.value_source(OPT_NUMERIC_SUFFIXES) == Some(ValueSource::CommandLine), | ||
matches.value_source(OPT_NUMERIC_SUFFIXES_SHORT) == Some(ValueSource::CommandLine), | ||
matches.value_source(OPT_HEX_SUFFIXES) == Some(ValueSource::CommandLine), | ||
matches.value_source(OPT_HEX_SUFFIXES_SHORT) == Some(ValueSource::CommandLine), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid value_source
? If I recall correctly, we only have to use that if there is some default value, but I don't think these arguments have one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah ... considering they are all set to override each other and no default_value, can refactor
src/uu/split/src/split.rs
Outdated
// Note: right now, this exact behavior cannot be handled by | ||
// `ArgGroup` since `ArgGroup` considers a default value `Arg` | ||
// as "defined". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this. What do you mean with that last part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, my bad - copy/pasted from other section to start with and did not remove. will take it out
tests/by-util/test_split.rs
Outdated
#[test] | ||
fn test_hex_suffix_no_value() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "--hex-suffixes", "threebytes.txt"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should test with a larger file here so that we can verify that hex is used (same for the numeric suffix tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, will do
Thanks! One question : considering that changes for this pull request addressing issue #5171 modify the same two files as the changes for fixing issue #5033 (handling obsolete lines call as in |
Separate would be best I think. This PR is fairly simple. The other one will be a lot more complicated. So let's get this merged and then move on to the other one. |
anything else that you think should be included and/or changed to finalize this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! I love it! Thanks!
Short variants of numeric and hex suffixes now act as flags with value forced to 0
Long variants work without value (default 0) as well as with value (equal sign is enforced)
Any combination of numeric and/or hex suffixes in any order is allowed with last one winning
Short variants can be combined together like
split -dxen 4 file
Tests added to cover changes and verified against GNU
split
to match behaviorFixes #5171