FormatWriter: simplify two regular expressions; fix pipe char bug #1936
The primary purpose of this PR is to simplify two regular expressions
(regexen): one used by leadingAsteriskSpace and the other by
compileStripMarginPattern.
Discussions about readability are always subjective. I believe
that removing the positive and negative lookaheads makes it
more likely that the next maintainer will be able to read and
comprehend the code.
Discussions about runtime performance improvements without
demonstrable data are worse than useless. Lookaheads are
known to have a high runtime cost. It is plausible, though not
asserted by this PR for lack of evidence, that the code of this
PR will have better performance characteristics.
Unicode whitespace handling was added, along with test cases.
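To illustrate the kind of behavior change this implies, here is a minimal sketch. The two patterns below are hypothetical stand-ins, not the expressions in this PR; the only fact relied on is that java.util.regex treats "\h" as horizontal whitespace, which covers tab and several Unicode space characters but not newline.

```scala
import java.util.regex.Pattern

object HorizontalWhitespaceSketch {
  // Hypothetical patterns for illustration only; not the ones in this PR.
  // Matches a newline, one or more ASCII spaces, then an asterisk.
  val spacesOnly: Pattern = Pattern.compile("\n +\\*")
  // Matches a newline, one or more horizontal whitespace chars, then an asterisk.
  val horizontal: Pattern = Pattern.compile("\n\\h+\\*")

  // True if the pattern matches anywhere in the input.
  def found(p: Pattern, s: String): Boolean = p.matcher(s).find()
}
```

Under these assumptions, a line indented with a tab or an em space (U+2003) is picked up by the "\h" form but missed by the spaces-only form, which is the sort of difference an end user could notice.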
This PR supersedes the Work In Progress PR WIP: internal/FormatWriter simplify two regexen; fix pipe char bug #1924. It has
benefited greatly from discussions and multiple rounds of review there.
My thanks and appreciation to the patient reviewers.
Whilst simplifying the two regular expressions, I noticed what I
believe to be a bug in compileStripMarginPattern.
Scala allows any character other than NULL to be used as a pipe character.
Java requires that dollar ($) and backslash (\) characters
in the replacement string be quoted (e.g. in a string: "\\$").
The code prior to this PR did not do this.
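A minimal sketch of the quoting issue, assuming a hypothetical helper prefixWith that is not part of scalafmt. It uses java.util.regex.Matcher.quoteReplacement, which is one standard way to make "$" and "\" literal in a replacement string; it is not necessarily the approach taken in this PR.

```scala
import java.util.regex.Matcher

object ReplacementQuotingSketch {
  // Hypothetical helper: put the pipe character right after every newline.
  // quoteReplacement escapes '$' and '\' so they are taken literally
  // rather than as group references or escape prefixes.
  def prefixWith(text: String, pipe: Char): String =
    text.replaceAll("\n", Matcher.quoteReplacement("\n" + pipe))

  // The unquoted variant, kept only to show the failure mode.
  def prefixWithUnquoted(text: String, pipe: Char): String =
    text.replaceAll("\n", "\n" + pipe)
}
```

With an ordinary pipe such as '|' both variants behave the same. With '$' or '\' as the pipe character, the unquoted variant throws at runtime because the replacement string ends in an incomplete group reference or escape.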
Previous regexen precluded Scala Native support. The ones of this
PR do not. Scala Native does not yet support "\h" but it is
believable that such support could be added.
The handling of compileStripMarginPattern is a bit tricky.
Every/any regex should be sealed with the Curse of the Mummy,
but that one is particularly deserving.
The active pipe character is not in scope where the replaceAll() is done.
Using a capture group allows the pipe character from where the pattern
is compiled to be used in replaceAll().
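The capture-group idea can be sketched roughly as follows. The names compileMargin and reindent, and the exact shape of the pattern, are assumptions made for illustration; the actual code in this PR may differ.

```scala
import java.util.regex.Pattern

object CaptureGroupSketch {
  // Compiled where the pipe character is known. Pattern.quote makes the
  // pipe literal in the pattern; the parentheses capture it as group 1.
  def compileMargin(pipe: Char): Pattern =
    Pattern.compile("\n\\h*(" + Pattern.quote(pipe.toString) + ")")

  // Applied later, where only the compiled pattern is in scope.
  // "$1" re-emits whatever pipe character the pattern captured.
  def reindent(pattern: Pattern, text: String, indent: Int): String =
    pattern.matcher(text).replaceAll("\n" + (" " * indent) + "$1")
}
```

A side effect of this design is that the replacement string never contains the pipe character itself, so a pipe of '$' or '\' needs no quoting on the replacement side.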
I did not test it, but the \n approach should work on most, if not all,
Windows operating systems. It is what has been in the scalafmt code for years.
Documentation:
The tab and Unicode whitespace changes deserve a quick release note as they change the
behavior seen by end users.
I will check the Contributing section of the web site to see if I can figure out what needs to
be done.
Testing:
Tested using sbt 1.3.10 and Java 8 on x86_64 only. All tests pass.