-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memoize text width #6552
Memoize text width #6552
Conversation
Current dependencies on/for this PR:
This comment was auto-generated by Graphite. |
a97a59a
to
3076d97
Compare
dae8c98
to
8a932fb
Compare
This comment was marked as outdated.
This comment was marked as outdated.
2db43a5
to
493a3b6
Compare
8a932fb
to
9b9db1b
Compare
9b9db1b
to
4ed9a28
Compare
4ed9a28
to
c878190
Compare
c878190
to
2aa6815
Compare
eb8d51f
to
fb43ed9
Compare
fc0d6aa
to
409f8f9
Compare
fb43ed9
to
81036a0
Compare
81036a0
to
6a31ef4
Compare
'\t' => tab_width.value(), | ||
'\n' => return TextWidth::Multiline, | ||
#[allow(clippy::cast_possible_truncation)] | ||
c => c.width().unwrap_or(0) as u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When does c
not have a width and why would we go with 0 instead, is this about control characters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the unicode width documentation
Returns the character's displayed width in columns, or None if the character is a control character other than '\x00'.
So yes, this is about control characters and using 0 seems reasonable to me (this is the same logic as applied by the printer today)
self.state.line_width = 0; | ||
continue; | ||
Text::Text { text, width } => { | ||
if let Some(width) = width.width() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i think i'd use a match here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me too but our pedantic clippy rule doesn't allow me to
Sorry i had missed this PR. I'll try to trigger codspeed for perf numbers. |
Code speed only comments on regressions but you can see the results in the run summary https://github.com/astral-sh/ruff/pull/6552/checks?check_run_id=16438240046 |
6a31ef4
to
e09159a
Compare
/// This imprecision shouldn't matter in practice because either text are longer than any configured line width | ||
/// and thus, the text should break. | ||
#[derive(Copy, Clone, Debug, Eq, PartialEq)] | ||
pub struct Width(NonZeroU32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this newtype wrapper to lock down the access to the inner NonZeroU32
. Not that someone uses it and then gets values that are off by one.
The win is bigger for formats that a) Use more groups or best fitting elements because the printer has to measure |
Summary
The formatter processes strings a couple of times:
propagate_expands
: Test if the string contains a newline and, if so, setmode: Propagated
on all enclosing groupsprint_text
: Measures the width of the text to compute theline_width
fits_text
(only if the text is inside of a group): Measures the width of the text to determine if the content fits. Runsn
times wheren
is the number of parent groups.This PR adds a
TextWidth
field to all textFormatElement
s that either stores the computed width or that it is a multiline string.This reduces the traversal for non-multiline strings to exactly once. Multiline strings still get traversed multiple times and you could argue this PR makes it worse because the implementation now computes the text width only to throw it away when realizing it is a multiline string. However, multiline strings are rare.
Memoizing the text width improves performance by 2-12%.
The downsides of this change are:
Text
FormatElement
s because we are now very close to the 24 bytesTest Plan
The tests are currently failing because I need to update them to pass
TextWidth