-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use String::with_capacity
in format!
#39356
Changes from 1 commit
e74b55b
7403ee9
ecda7f3
0267529
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -265,6 +265,34 @@ impl<'a> Arguments<'a> { | |
args: args | ||
} | ||
} | ||
|
||
/// Estimates the length of the formatted text. | ||
/// | ||
/// This is intended to be used for setting initial `String` capacity | ||
/// when using `format!`. Note: this is neither the lower nor upper bound. | ||
#[doc(hidden)] #[inline] | ||
#[unstable(feature = "fmt_internals", reason = "internal to format_args!", | ||
issue = "0")] | ||
pub fn estimated_capacity(&self) -> usize { | ||
// Using wrapping arithmetics in this function, because | ||
// wrong result is highly unlikely and doesn't cause unsafety. | ||
use ::num::Wrapping as W; | ||
|
||
let pieces_length: W<usize> = self.pieces.iter() | ||
.map(|x| W(x.len())).sum(); | ||
|
||
// If they are any arguments to format, the string will most likely | ||
// double in size. So we're pre-doubling it here. | ||
let multiplier = if self.args.is_empty() { W(1) } else { W(2) }; | ||
|
||
let capacity = multiplier * pieces_length; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why's this multiplying by the number of pieces? Those all have a statically known size, and we're guessing to the size of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's multiplying overall length of all pieces by either 1 or 2. The idea is that if there are any arguments, the string buffer will grow on the next push. To avoid this reallocation, we're pre-doubling the capacity. I've changed the code so it's more clear. Should I change |
||
if multiplier == W(2) && (W(1)..W(8)).contains(capacity) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is |
||
// Allocations smaller than 8 don't really make sense for String. | ||
8 | ||
} else { | ||
capacity.0 | ||
} | ||
} | ||
} | ||
|
||
/// This structure represents a safely precompiled version of a format string | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,3 +28,11 @@ fn test_pointer_formats_data_pointer() { | |
assert_eq!(format!("{:p}", s), format!("{:p}", s.as_ptr())); | ||
assert_eq!(format!("{:p}", b), format!("{:p}", b.as_ptr())); | ||
} | ||
|
||
#[test] | ||
fn test_estimated_capacity() { | ||
assert_eq!(format_args!("{}", "").estimated_capacity(), 0); | ||
assert_eq!(format_args!("Hello").estimated_capacity(), 5); | ||
assert_eq!(format_args!("Hello, {}!", "").estimated_capacity(), 16); | ||
assert_eq!(format_args!("{}, hello!", "World").estimated_capacity(), 16); | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add a test for format_args!("{} {} {}", "12characters", "13_characters", "14__characters") I am curious what estimated capacity will. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also measure time. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The previous algorithm would set the capacity to 8 (two spaces times two, rounded up to eight). I've updated the algorithm so it doesn't try to be smart in your case and also added some more benchmarks.
You mean to add the benches to repo? I'm not sure where exactly they belong. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is wrapping really necessary here? If we're wrapping around the max value of
usize
we surely want to just skip estimation entirely, right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we'd really wrap, then either the program would fail from OOM anyway, or we'd just underestimate the capacity (in case of multiplication by 2). I've used
Wrapping
to avoid sprinkling the code with early returns, so it's easier to read.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was just struck by the oddity of reaching for a wrapping type. If we wanted to be "correct" I'd expect saturating arithmetic here rather than wrapping. In both cases though something has gone seriously wrong if we wrap at all.
In that sense I'd probably argue that normal arithmetic is what I'd expect here: panic in debug and don't worry about it in optimized mode.