wc: streaming --files0-from and other improvements
My original focus was on --files0-from, which should be processed as a
stream rather than consumed into a list of files before processing. I
accomplished this by separating most of the runtime configuration
derived from the arguments from the definition of which files should be
processed.
A `Settings` now tracks which of `-[clLmw]` are specified. An `Inputs`
describes whether stdin is implied by the absence of arguments,
--files0-from is specified, or a list of one or more files is provided.
`Inputs::try_iter` will create an `Iterator` that will handle any of
those cases. Each `Input` (singular!) yielded is either stdin or a file
name.  --files0-from will now support non-UTF-8 filenames on Unix.
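
A minimal sketch of the streaming idea described above, not the code from
this commit: the enum variants, the `bytes_to_os` helper, and the boxed
iterator return type are assumptions made for illustration. It shows how
one `try_iter` can cover all three cases while reading NUL-separated
names lazily instead of collecting them first.

```rust
use std::ffi::OsString;
use std::io::{self, BufRead};

/// One unit of work: standard input or a named file.
#[derive(Debug, PartialEq)]
enum Input {
    Stdin,
    Path(OsString),
}

/// Where the inputs come from (hypothetical simplification).
enum Inputs {
    /// No file arguments: stdin is implied.
    Stdin,
    /// File names given on the command line.
    Paths(Vec<OsString>),
    /// A reader yielding NUL-terminated names, as with --files0-from.
    Files0From(Box<dyn BufRead>),
}

/// On Unix, raw bytes become an OsString unchanged, so non-UTF-8 names survive.
#[cfg(unix)]
fn bytes_to_os(bytes: Vec<u8>) -> OsString {
    use std::os::unix::ffi::OsStringExt;
    OsString::from_vec(bytes)
}

/// Elsewhere this sketch falls back to a lossy conversion.
#[cfg(not(unix))]
fn bytes_to_os(bytes: Vec<u8>) -> OsString {
    OsString::from(String::from_utf8_lossy(&bytes).into_owned())
}

impl Inputs {
    /// Yields inputs one at a time; the Files0From case reads lazily.
    fn try_iter(self) -> Box<dyn Iterator<Item = io::Result<Input>>> {
        match self {
            Inputs::Stdin => {
                Box::new(std::iter::once(Ok::<Input, io::Error>(Input::Stdin)))
            }
            Inputs::Paths(paths) => Box::new(
                paths
                    .into_iter()
                    .map(|p| -> io::Result<Input> { Ok(Input::Path(p)) }),
            ),
            Inputs::Files0From(reader) => Box::new(
                reader
                    .split(b'\0')
                    .map(|name| name.map(|bytes| Input::Path(bytes_to_os(bytes)))),
            ),
        }
    }
}
```

Because each item is an `io::Result`, a read error in the middle of the
name list can be reported when it happens without discarding the names
already processed.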

Secondarily, I have attempted to reduce the number of String allocations
that occur while printing the results. Now, unless a file name needs
escaping, or an error occurs, no additional allocations should be
necessary to print results. `print_stats` was the biggest abuser,
allocating a `String` for each column before they were `join`ed into
a `String` for the whole line.  The `TitledWordCount` type is not
necessary at all, and it was another source of `String` allocation.
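
To illustrate the allocation point, here is a hedged sketch of printing
columns directly to a writer instead of building a `String` per column.
The `Counts` struct, the parameter list, and the column layout are
assumptions for this example and do not reproduce the real `print_stats`.

```rust
use std::io::{self, Write};

/// Hypothetical subset of the counts wc can report.
struct Counts {
    lines: usize,
    words: usize,
    bytes: usize,
}

/// Writes one result line straight into `out`, with no intermediate Strings.
fn print_stats<W: Write>(
    out: &mut W,
    counts: &Counts,
    width: usize,
    title: Option<&str>,
) -> io::Result<()> {
    // Each number is right-aligned and formatted directly into the writer.
    write!(
        out,
        "{:>width$} {:>width$} {:>width$}",
        counts.lines,
        counts.words,
        counts.bytes,
        width = width
    )?;
    // The file name, when present, follows the columns.
    if let Some(name) = title {
        write!(out, " {}", name)?;
    }
    writeln!(out)
}
```

Passing a locked, buffered stdout as `out` keeps the per-line cost to
formatting alone.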

I've made some effort to make more cases match GNU wc's output.  Errors
encountered while processing `--files0-from`, as well as any encountered
while processing files listed on the command line, will be escaped more
consistently with GNU wc.  File names printed to the right of their
stats will be quoted less aggressively, to match GNU wc: only if they
are not valid UTF-8 or if they contain a newline.
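
The quoting rule for file names can be sketched as a small predicate. The
function name `needs_quoting` is hypothetical, and the sketch only decides
whether to quote; how the name is then escaped is left out.

```rust
use std::ffi::OsStr;

/// Returns true when a file name should be quoted in the output:
/// it is not valid UTF-8, or it contains a newline.
fn needs_quoting(name: &OsStr) -> bool {
    match name.to_str() {
        // Valid UTF-8: quote only if a newline would break the line layout.
        Some(s) => s.contains('\n'),
        // Not valid UTF-8: always quote.
        None => true,
    }
}
```

Under this rule a name with spaces, such as `a b.txt`, is printed as-is.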
jeddenlea committed Mar 31, 2023
1 parent 5d0f014 commit 3ca7a1b
Showing 6 changed files with 361 additions and 287 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions src/uu/wc/Cargo.toml
@@ -18,6 +18,7 @@ path = "src/wc.rs"
clap = { workspace=true }
uucore = { workspace=true, features=["pipes"] }
bytecount = { workspace=true }
thiserror = { workspace=true }
utf-8 = { workspace=true }
unicode-width = { workspace=true }

1 change: 1 addition & 0 deletions src/uu/wc/src/countable.rs
@@ -28,6 +28,7 @@ impl WordCountable for StdinLock<'_> {
self
}
}

impl WordCountable for File {
type Buffered = BufReader<Self>;
