-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
split: better handle numeric and hex suffixes, short and long, with and without values #5198
Changes from 1 commit
42df6ac
fc3bf94
16c8ad8
81062e0
d658f26
fb3b0be
4698e8c
6005908
54dc1d8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,7 +5,7 @@ | |
// * For the full copyright and license information, please view the LICENSE | ||
// * file that was distributed with this source code. | ||
|
||
// spell-checker:ignore nbbbb ncccc | ||
// spell-checker:ignore nbbbb ncccc hexdigit | ||
|
||
mod filenames; | ||
mod number; | ||
|
@@ -34,7 +34,9 @@ static OPT_ADDITIONAL_SUFFIX: &str = "additional-suffix"; | |
static OPT_FILTER: &str = "filter"; | ||
static OPT_NUMBER: &str = "number"; | ||
static OPT_NUMERIC_SUFFIXES: &str = "numeric-suffixes"; | ||
static OPT_NUMERIC_SUFFIXES_SHORT: &str = "-d"; | ||
static OPT_HEX_SUFFIXES: &str = "hex-suffixes"; | ||
static OPT_HEX_SUFFIXES_SHORT: &str = "-x"; | ||
static OPT_SUFFIX_LENGTH: &str = "suffix-length"; | ||
static OPT_DEFAULT_SUFFIX_LENGTH: &str = "0"; | ||
static OPT_VERBOSE: &str = "verbose"; | ||
|
@@ -125,28 +127,50 @@ pub fn uu_app() -> Command { | |
) | ||
.arg( | ||
Arg::new(OPT_NUMERIC_SUFFIXES) | ||
.short('d') | ||
.long(OPT_NUMERIC_SUFFIXES) | ||
.require_equals(true) | ||
.default_missing_value("0") | ||
.num_args(0..=1) | ||
.overrides_with(OPT_NUMERIC_SUFFIXES) | ||
.help("use numeric suffixes instead of alphabetic"), | ||
) | ||
.arg( | ||
Arg::new(OPT_SUFFIX_LENGTH) | ||
.short('a') | ||
.long(OPT_SUFFIX_LENGTH) | ||
.value_name("N") | ||
.default_value(OPT_DEFAULT_SUFFIX_LENGTH) | ||
.help("use suffixes of fixed length N. 0 implies dynamic length."), | ||
Arg::new(OPT_NUMERIC_SUFFIXES_SHORT) | ||
.short('d') | ||
.action(clap::ArgAction::SetTrue) | ||
.value_parser(|_: &str| -> Result<String,clap::Error> { | ||
Ok("0".to_string()) // force value to "0" | ||
}) | ||
.overrides_with(OPT_NUMERIC_SUFFIXES_SHORT) | ||
.help("use numeric suffixes instead of alphabetic"), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it's nice to say something like "like --numeric-suffixes but does not accept an argument" to explain why there are two arguments that do the same thing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not sure I understood the suggestion, could you please exand on that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The suggestion is to put a bit of text in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, that makes sense, thanks. Will update it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Followed exact same order and wording as in GNU split help: --numeric-suffixes[=FROM] -x use hex suffixes starting at 0, not alphabetic --hex-suffixes[=FROM] There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perfect! |
||
) | ||
.arg( | ||
Arg::new(OPT_HEX_SUFFIXES) | ||
.short('x') | ||
.long(OPT_HEX_SUFFIXES) | ||
.default_missing_value("0") | ||
.require_equals(true) | ||
.num_args(0..=1) | ||
.overrides_with(OPT_HEX_SUFFIXES) | ||
.help("use hex suffixes instead of alphabetic"), | ||
) | ||
.arg( | ||
Arg::new(OPT_HEX_SUFFIXES_SHORT) | ||
.short('x') | ||
.action(clap::ArgAction::SetTrue) | ||
.value_parser(|_: &str| -> Result<String,clap::Error> { | ||
Ok("0".to_string()) // force value to "0" | ||
}) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This feels convoluted. We have have to combine the arguments anyway, so it would suffice for this to just be a flag without any value. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. well ... it is a flag as far as behaviour goes (and as far as clap works), but it does produce a value. If -d or -x is used , the value must be specifically set to 0 for numeric or hex suffix correspondingly. It can be set here with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see where you're coming from. It's an interesting idea to try to make all the flags sort of the same in the code. However, I think I still prefer hardcoding them in the processing of the flags. The thing that I find confusing is that if it is a flag, we should use the functions that are appropriate for flags if that makes sense. So, for instance, I'd write the match expression to combine the flags something more like this: match (
matches.get_one::<String>(OPT_NUMERIC_SUFFIXES),
matches.get_one::<String>(OPT_HEX_SUFFIXES),
matches.get_flag(OPT_NUMERIC_SUFFIXES_SHORT),
matches.get_flag(OPT_HEX_SUFFIXES_SHORT),
) {
(Some(v), _, _, _) => parse_num_suffix(matches, v),
(_, Some(v), _, _) => parse_hex_suffix(matches, v),
(_, _, true, _) => Ok((SuffixType::Decimal, 0)),
(_, _, _, true) => Ok((SuffixType::Hexadecimal, 0)),
_ => Ok((SuffixType::Alphabetic, 0)), // no numeric/hex suffix, using default alphabetic
} (I forgot the exact semantics of those There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. refactored to remove arg |
||
.overrides_with(OPT_HEX_SUFFIXES_SHORT) | ||
.help("use hex suffixes instead of alphabetic"), | ||
) | ||
.arg( | ||
Arg::new(OPT_SUFFIX_LENGTH) | ||
.short('a') | ||
.long(OPT_SUFFIX_LENGTH) | ||
.value_name("N") | ||
.default_value(OPT_DEFAULT_SUFFIX_LENGTH) | ||
.help("use suffixes of fixed length N. 0 implies dynamic length."), | ||
) | ||
.arg( | ||
Arg::new(OPT_VERBOSE) | ||
.long(OPT_VERBOSE) | ||
|
@@ -411,15 +435,48 @@ impl Strategy { | |
|
||
/// Parse the suffix type from the command-line arguments. | ||
fn suffix_type_from(matches: &ArgMatches) -> Result<(SuffixType, usize), SettingsError> { | ||
if matches.value_source(OPT_NUMERIC_SUFFIXES) == Some(ValueSource::CommandLine) { | ||
let suffix_start = matches.get_one::<String>(OPT_NUMERIC_SUFFIXES); | ||
// Collect provided suffixes (if any) | ||
// Any combination is allowed | ||
let mut suffixes = vec![]; | ||
for suffix_id in [ | ||
OPT_NUMERIC_SUFFIXES, | ||
OPT_NUMERIC_SUFFIXES_SHORT, | ||
OPT_HEX_SUFFIXES, | ||
OPT_HEX_SUFFIXES_SHORT, | ||
] | ||
.into_iter() | ||
{ | ||
if matches.value_source(suffix_id) == Some(ValueSource::CommandLine) { | ||
suffixes.push((matches.index_of(suffix_id), suffix_id)); | ||
} | ||
} | ||
// Compare suffixes by the order specified in command line | ||
// last one (highest index) is effective, all others are ignored | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's much easier to make all the arguments override each other :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, refactoring with |
||
let mut effective_suffix_index = 0; | ||
let mut effective_suffix_id = ""; | ||
for suffix in suffixes.into_iter() { | ||
let (suffix_index, suffix_id) = suffix; | ||
if let Some(i) = suffix_index { | ||
if i >= effective_suffix_index { | ||
effective_suffix_index = i; | ||
effective_suffix_id = suffix_id; | ||
}; | ||
} | ||
} | ||
|
||
if effective_suffix_id == OPT_NUMERIC_SUFFIXES | ||
|| effective_suffix_id == OPT_NUMERIC_SUFFIXES_SHORT | ||
{ | ||
let suffix_start = matches.get_one::<String>(effective_suffix_id); | ||
let suffix_start = suffix_start.ok_or(SettingsError::SuffixNotParsable(String::new()))?; | ||
let suffix_start = suffix_start | ||
.parse() | ||
.parse::<usize>() | ||
.map_err(|_| SettingsError::SuffixNotParsable(suffix_start.to_string()))?; | ||
Ok((SuffixType::Decimal, suffix_start)) | ||
} else if matches.value_source(OPT_HEX_SUFFIXES) == Some(ValueSource::CommandLine) { | ||
let suffix_start = matches.get_one::<String>(OPT_HEX_SUFFIXES); | ||
} else if effective_suffix_id == OPT_HEX_SUFFIXES | ||
|| effective_suffix_id == OPT_HEX_SUFFIXES_SHORT | ||
{ | ||
let suffix_start = matches.get_one::<String>(effective_suffix_id); | ||
let suffix_start = suffix_start.ok_or(SettingsError::SuffixNotParsable(String::new()))?; | ||
let suffix_start = usize::from_str_radix(suffix_start, 16) | ||
.map_err(|_| SettingsError::SuffixNotParsable(suffix_start.to_string()))?; | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ | |
// * | ||
// * For the full copyright and license information, please view the LICENSE | ||
// * file that was distributed with this source code. | ||
// spell-checker:ignore xzaaa sixhundredfiftyonebytes ninetyonebytes threebytes asciilowercase fghij klmno pqrst uvwxyz fivelines twohundredfortyonebytes onehundredlines nbbbb | ||
// spell-checker:ignore xzaaa sixhundredfiftyonebytes ninetyonebytes threebytes asciilowercase fghij klmno pqrst uvwxyz fivelines twohundredfortyonebytes onehundredlines nbbbb dxen | ||
|
||
use crate::common::util::{AtPath, TestScenario}; | ||
use rand::{thread_rng, Rng, SeedableRng}; | ||
|
@@ -715,7 +715,7 @@ fn test_multiple_of_input_chunk() { | |
#[test] | ||
fn test_numeric_suffix() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "--numeric-suffixes", "9", "threebytes.txt"]) | ||
ucmd.args(&["-n", "4", "--numeric-suffixes=9", "threebytes.txt"]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
|
@@ -728,7 +728,7 @@ fn test_numeric_suffix() { | |
#[test] | ||
fn test_hex_suffix() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "--hex-suffixes", "9", "threebytes.txt"]) | ||
ucmd.args(&["-n", "4", "--hex-suffixes=9", "threebytes.txt"]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
|
@@ -738,6 +738,158 @@ fn test_hex_suffix() { | |
assert_eq!(at.read("x0c"), ""); | ||
} | ||
|
||
#[test] | ||
fn test_numeric_suffix_no_equal() { | ||
new_ucmd!() | ||
.args(&["-n", "4", "--numeric-suffixes", "9", "threebytes.txt"]) | ||
.fails() | ||
.stderr_contains("split: cannot open '9' for reading: No such file or directory"); | ||
} | ||
|
||
#[test] | ||
fn test_hex_suffix_no_equal() { | ||
new_ucmd!() | ||
.args(&["-n", "4", "--hex-suffixes", "9", "threebytes.txt"]) | ||
.fails() | ||
.stderr_contains("split: cannot open '9' for reading: No such file or directory"); | ||
} | ||
|
||
/// Test for short numeric suffix not having any value | ||
#[test] | ||
fn test_short_numeric_suffix_no_value() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "-d", "threebytes.txt"]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x00"), "a"); | ||
assert_eq!(at.read("x01"), "b"); | ||
assert_eq!(at.read("x02"), "c"); | ||
assert_eq!(at.read("x03"), ""); | ||
} | ||
|
||
/// Test for long numeric suffix not having any value | ||
#[test] | ||
fn test_numeric_suffix_no_value() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "--numeric-suffixes", "threebytes.txt"]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x00"), "a"); | ||
assert_eq!(at.read("x01"), "b"); | ||
assert_eq!(at.read("x02"), "c"); | ||
assert_eq!(at.read("x03"), ""); | ||
} | ||
|
||
/// Test for short hex suffix not having any value | ||
#[test] | ||
fn test_short_hex_suffix_no_value() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "-x", "threebytes.txt"]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x00"), "a"); | ||
assert_eq!(at.read("x01"), "b"); | ||
assert_eq!(at.read("x02"), "c"); | ||
assert_eq!(at.read("x03"), ""); | ||
} | ||
|
||
/// Test for long hex suffix not having any value | ||
#[test] | ||
fn test_hex_suffix_no_value() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-n", "4", "--hex-suffixes", "threebytes.txt"]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should test with a larger file here so that we can verify that hex is used (same for the numeric suffix tests). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, will do |
||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x00"), "a"); | ||
assert_eq!(at.read("x01"), "b"); | ||
assert_eq!(at.read("x02"), "c"); | ||
assert_eq!(at.read("x03"), ""); | ||
} | ||
|
||
/// Test for short numeric suffix having value provided after space - should fail | ||
#[test] | ||
fn test_short_numeric_suffix_with_value_spaced() { | ||
new_ucmd!() | ||
.args(&["-n", "4", "-d", "9", "threebytes.txt"]) | ||
.fails() | ||
.stderr_contains("split: cannot open '9' for reading: No such file or directory"); | ||
} | ||
|
||
/// Test for short numeric suffix having value provided after space - should fail | ||
#[test] | ||
fn test_short_hex_suffix_with_value_spaced() { | ||
new_ucmd!() | ||
.args(&["-n", "4", "-x", "9", "threebytes.txt"]) | ||
.fails() | ||
.stderr_contains("split: cannot open '9' for reading: No such file or directory"); | ||
} | ||
|
||
/// Test for some combined short options | ||
#[test] | ||
fn test_short_combination() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&["-dxen", "4", "threebytes.txt"]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x00"), "a"); | ||
assert_eq!(at.read("x01"), "b"); | ||
assert_eq!(at.read("x02"), "c"); | ||
assert_eq!(at.file_exists("x03"), false); | ||
} | ||
|
||
/// Test for the last effective suffix, ignoring all others - numeric long last | ||
/// Any combination of short and long (as well as duplicates) should be allowed | ||
#[test] | ||
fn test_effective_suffix_numeric_last() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&[ | ||
"-n", | ||
"4", | ||
"--numeric-suffixes=7", | ||
"--hex-suffixes=4", | ||
"-d", | ||
"-x", | ||
"--numeric-suffixes=9", | ||
"threebytes.txt", | ||
]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x09"), "a"); | ||
assert_eq!(at.read("x10"), "b"); | ||
assert_eq!(at.read("x11"), "c"); | ||
assert_eq!(at.read("x12"), ""); | ||
} | ||
|
||
/// Test for the last effective suffix, ignoring all others - hex long last | ||
/// Any combination of short and long (as well as duplicates) should be allowed | ||
#[test] | ||
fn test_effective_suffix_hex_last() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
ucmd.args(&[ | ||
"-n", | ||
"4", | ||
"--hex-suffixes=7", | ||
"--numeric-suffixes=4", | ||
"-x", | ||
"-d", | ||
"--hex-suffixes=9", | ||
"threebytes.txt", | ||
]) | ||
.succeeds() | ||
.no_stdout() | ||
.no_stderr(); | ||
assert_eq!(at.read("x09"), "a"); | ||
assert_eq!(at.read("x0a"), "b"); | ||
assert_eq!(at.read("x0b"), "c"); | ||
assert_eq!(at.read("x0c"), ""); | ||
} | ||
|
||
#[test] | ||
fn test_round_robin() { | ||
let (at, mut ucmd) = at_and_ucmd!(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each argument should also override with the corresponding option right? Maybe we could also add some tests for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, tests are actually covering that, but will refactor with .overrides_with_all() instead