Make comparison of newlines in text files more precise #2600
bee5fc3
to
bae1cf1
Compare
If newlines do not matter for the delta, shouldn't additional ones not matter either? Can you explain in which case an additional newline is relevant? In the end the user can always bump the version anyway if it is considered a relevant change.
boolean hasNext() {
    skipNewLines();
    return index < bytes.length;
// Both sliders are at either "\n" or "\r\n"
Please note that a single \r is a valid newline as well!
That's right (although AFAICT not widely used), but the important part is that a single \r is treated equally on Windows and Linux, especially by git.
Please have a look at the test-cases I created. If you think there is a case that is not covered, please let me know and the tests can be easily extended.
They do matter. At least from my point of view, the newline skip was introduced for text files to deal with the different default line endings on Windows and Linux, and because it is very common to use git's auto-crlf feature, where on Windows one checks in Linux-style and checks out Windows-style line endings. Considering added/removed newlines is relevant, for example, if you adjust a doc or license file. Of course one can bump the version, but if that is not necessary I believe nobody thinks about that in advance and will just wonder why the changed doc/license file is not in the build jars, while the changes are present in the IDE.
Maybe because it's a strange license that is considered different because of more line endings ;-) Anyway, it seems the test cases do not pass (anymore) with that change on some platforms.
@@ -19,22 +19,22 @@

 public interface ArtifactComparator {

-    public static class ComparisonData {
+    public static record ComparisonData(List<String> ignoredPattern, boolean writeDelta, boolean showDiffDetails) {
Maybe this change can be a separate PR.
Sure. Here we go: #2610
// Possible new lines:
// \n -- unix style
// \r\n -- windows style
// \r -- whatever style (ignored?!)
According to Wikipedia's Mac OS 9 article, the latest release of Mac OS 9 is 9.2.2 / December 5, 2001, which was 21 years ago. Therefore I think this is not really relevant anymore.
Nevertheless my intention was that this works as if one would use a BufferedReader to read the lines one by one using BufferedReader.readLine() and compare the line content, just more efficient (i.e. without creating Strings all the time).
And the javadoc of BufferedReader.readLine() says:
Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'),
a carriage return followed immediately by a line feed, or by reaching the end-of-file (EOF).
So if equivalence is the goal, \r needs to be considered.
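To make the reference behaviour concrete, here is a minimal sketch of the "read both files with BufferedReader.readLine() and compare line by line" approach described above. The class and method names (LineEquivalence, sameLines) are made up for illustration and are not part of Tycho; the sketch works on Strings only to stay self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not Tycho code: the String-based reference behaviour.
class LineEquivalence {

    // Returns true if both texts consist of the same sequence of lines,
    // regardless of which terminator (\n, \r\n or \r) ends each line.
    static boolean sameLines(String baseline, String reactor) {
        try (BufferedReader a = new BufferedReader(new StringReader(baseline));
             BufferedReader b = new BufferedReader(new StringReader(reactor))) {
            while (true) {
                String lineA = a.readLine(); // null signals end of input
                String lineB = b.readLine();
                if (lineA == null || lineB == null) {
                    // equal only if both inputs ended at the same time
                    return lineA == null && lineB == null;
                }
                if (!lineA.equals(lineB)) {
                    return false;
                }
            }
        } catch (IOException e) {
            // not expected for in-memory readers
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameLines("a\nb\n", "a\r\nb\r\n")); // true
        System.out.println(sameLines("a\rb", "a\nb"));         // true
        System.out.println(sameLines("a\n\nb", "a\nb"));       // false
    }
}
```

Note that readLine() does not report whether the last line ended with a terminator, so this reference treats "a\n" and "a" as equal; whether the real comparator should do the same is a separate decision.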
...rtifactcomparator/src/main/java/org/eclipse/tycho/zipcomparator/internal/TextComparator.java
Outdated
@HannesWell do you plan to work on this and possibly backport to Tycho 4? I currently plan to prepare a bugfix release, so we might want to include it there.
c73bc38
to
d10aed2
Compare
I adjusted the logic so that \r should now be treated as a regular line ending, and in the latest run all tests passed unchanged. #2610 should be merged before this.
d10aed2
to
5aef81f
Compare
Don't ignore all kinds of newlines when comparing text files, only treat exactly matching \n and \r\n as equal.
5aef81f
to
ccd6ce3
Compare
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation and see the GitHub Action logs for details.
At the moment the TextComparator simply ignores all kinds of newline characters (\r and \n), which has the effect that, although a new line was added to or removed from a text file, the comparator considers that file unchanged.
This PR enhances the TextComparator to only ignore a difference in newline characters if there is an exactly matching pair of \r or \r\n in baseline and reactor respectively.
This PR also adds tests for the TextComparator and moves the showDiffDetails property to ComparisonData (in the first commit) to simplify testing.
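As an illustration of the "matching pair" idea, the following is a rough sketch of a byte-level comparison that skips a line terminator only when both inputs have one at the same position. It is not the actual TextComparator code; the names NewlineAwareCompare, terminatorLength and equalIgnoringTerminatorStyle are hypothetical, and it assumes \n, \r\n and a lone \r all count as terminators.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Tycho implementation.
class NewlineAwareCompare {

    // Length of the line terminator starting at index i, or 0 if there is none.
    // "\r\n" counts as one terminator of length 2.
    static int terminatorLength(byte[] bytes, int i) {
        if (i >= bytes.length) {
            return 0;
        }
        if (bytes[i] == '\n') {
            return 1;
        }
        if (bytes[i] == '\r') {
            return (i + 1 < bytes.length && bytes[i + 1] == '\n') ? 2 : 1;
        }
        return 0;
    }

    static boolean equalIgnoringTerminatorStyle(byte[] baseline, byte[] reactor) {
        int i = 0;
        int j = 0;
        while (i < baseline.length && j < reactor.length) {
            int ti = terminatorLength(baseline, i);
            int tj = terminatorLength(reactor, j);
            if (ti > 0 && tj > 0) {
                // one terminator on each side: a matching pair, skip both
                i += ti;
                j += tj;
            } else if (ti == 0 && tj == 0 && baseline[i] == reactor[j]) {
                // ordinary content bytes must be identical
                i++;
                j++;
            } else {
                // content mismatch, or a terminator on one side only
                return false;
            }
        }
        // any leftover bytes (including a trailing terminator) are a difference
        return i == baseline.length && j == reactor.length;
    }

    public static void main(String[] args) {
        byte[] unix = "a\nb".getBytes(StandardCharsets.UTF_8);
        byte[] windows = "a\r\nb".getBytes(StandardCharsets.UTF_8);
        byte[] extraLine = "a\n\nb".getBytes(StandardCharsets.UTF_8);
        System.out.println(equalIgnoringTerminatorStyle(unix, windows));   // true
        System.out.println(equalIgnoringTerminatorStyle(unix, extraLine)); // false
    }
}
```

Unlike an approach that skips every newline byte before comparing, an added or removed blank line leaves a terminator without a partner on the other side, so the files are reported as different.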