Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Harmonise line endings in baseline for multi-line code snippets #9239

Closed
jon48 opened this issue Feb 7, 2023 · 6 comments · Fixed by #9566
Closed

Harmonise line endings in baseline for multi-line code snippets #9239

jon48 opened this issue Feb 7, 2023 · 6 comments · Fixed by #9566

Comments

@jon48
Copy link

jon48 commented Feb 7, 2023

Hello,
This is bug very similar to #5979, but contrary to this initial issue, my problems comes when the code element of the baseline contains a multi-line snippet. The fix was to trim the code content (that was relocated as part of the 5.x upgrade, but the idea remains the same), but that would remove only the characters at the beginning or the end of the string, but do not cater for potential carriage returns and line feeds inside the string.
So, depending on whether CRLF or LF mode is used, when the baseline is generated and/or compared, there may be respectively additional or missing 
 characters in the baseline file, that means that errors are not recognised as part of the baseline if the environment do not use the same line ending. In my case, my dev environment is Windows, using CRLF, where the baseline is produced and I can get the "No errors found!" result, whilst the GitHub actions use a Linux environment, and LF mode, and fail (on a message not clear at all, as it seems identical to the baseline).

I feel that the in the line below, there should be an harmonisation of the line endings, by maybe dropping the 
 entity if present, so that CRLF will be harmonised to LF. Of course, this is not the only change that would be required, and that may
need to wait Psalm 6, or have an alternative solution, as this is maybe a breaking change.

$files[$fileName][$issueType]['s'][] = trim($codeSample->textContent);

I have tried to showcase it in the small example attached. (hopefully, it will make sense).

Using Psalm 5.6.0, I have created the same code twice, once in a file with CRLF endings, one with LF endings, to simulate the two environments. The anonymous function will of course raise an error, as an int is expected rather than bool.

<?php
$test = ['b', 'a'];
usort($test, function (mixed $a, mixed $b): bool {
    return $a <=> $b;
});
print_r($test);

Taking the CRLF as reference for the baseline, it generates the below (which I manually duplicate for test.LF.php). Please note the &#13; entities to encode the carriage return; the line feed is not encoded, and is a line break in the file.

<?xml version="1.0" encoding="UTF-8"?>
<files psalm-version="5.6.0@e784128902dfe01d489c4123d69918a9f3c1eac5">
  <file src="test.CRLF.php">
    <InvalidArgument>
      <code>function (mixed $a, mixed $b): bool {&#13;
    return $a &lt;=&gt; $b;&#13;
}</code>
    </InvalidArgument>
    ...
  </file>
</files>

Then, I run for the CRLF environment:

Running on CRLF environment
$ psalm --version
Psalm 5.6.0@e784128902dfe01d489c4123d69918a9f3c1eac5

$ psalm --config=psalm.CRLF.xml
Warning: "findUnusedCode" will be defaulted to "true" in Psalm 6. You should explicitly enable or disable this setting.
Target PHP version: 8.1 (inferred from current PHP version) Enabled extensions: .
Scanning files...
Analyzing files...

░

------------------------------

       No errors found!

------------------------------
3 other issues found.
You can display them with --show-info=true
------------------------------
Psalm can automatically fix 1 of these issues.
Run Psalm again with
--alter --issues=InvalidReturnType --dry-run
to see what it can fix.
------------------------------

Checks took 1.63 seconds and used 111.493MB of memory
Psalm was able to infer types for 100% of the codebase

And for the LF environment, using the same baseline file:

Running on LF environment

$ psalm --config=psalm.LF.xml
Warning: "findUnusedCode" will be defaulted to "true" in Psalm 6. You should explicitly enable or disable this setting.
Target PHP version: 8.1 (inferred from current PHP version) Enabled extensions: .
Scanning files...
Analyzing files...

░

ERROR: InvalidArgument - test.LF.php:5:14 - Argument 2 of usort expects callable(mixed, mixed):int, but pure-Closure(mixed, mixed):bool provided (see https://psalm.dev/004)
usort($test, function (mixed $a, mixed $b): bool {
    return $a <=> $b;
});


------------------------------
1 errors found
------------------------------
2 other issues found.
You can display them with --show-info=true
------------------------------
Psalm can automatically fix 1 of these issues.
Run Psalm again with
--alter --issues=InvalidReturnType --dry-run
to see what it can fix.
------------------------------

Checks took 1.64 seconds and used 111.494MB of memory
Psalm was able to infer types for 100% of the codebase

The LR environment throws an error (with a "displayed" code identical to the one in the baseline, but the line endings are actually different), whereas the CRLF completes fine. I would expect both environments to complete successfully with the same baseline, as the code is seemingly the same.

@psalm-github-bot
Copy link

Hey @jon48, can you reproduce the issue on https://psalm.dev ?

@jon48
Copy link
Author

jon48 commented Feb 7, 2023

This cannot be reproduced there, as there is no baseline capability, nor CRLF/LF setting.
An example has been provided as a ZIP.

@weirdan
Copy link
Collaborator

weirdan commented Feb 7, 2023

The underlying issue is with git and its conversion of line endings. This can be turned off on Windows, but it's rarely done, even though it's the most sensible solution.

@jon48
Copy link
Author

jon48 commented Feb 8, 2023

Yes, I am not denying that, and as you are saying the Windows installer for git set autocrlf to true by default, so this is a setting very easily overlooked. Added to that, some of the IDE/editors would also default to CRLF line endings on Windows (which again, can be changed).

My suggestion was only to try and normalise the CRLF to LF when creating and comparing the baseline, so that the result do not depend on the user having chosen all the correct settings to use LF in a Windows environment.
Or, otherwise, maybe give a CLI warning when code snippets in the baseline contains CR characters, as it took me quite a long time to realise why the apparently identical snippet in the terminal was failing on GitHub, and passing on my (lazily set up) local repository.

@kamil-tekiela
Copy link
Contributor

I am facing the same issue, but it's not because of GIT. I disabled autocrlf and the multiline entries are still encoded with &#13;. Is there any fix or monkeypatch I can apply locally?

@kamil-tekiela
Copy link
Contributor

I fixed it with How do I force Git to use LF instead of CR+LF under Windows? After you disable autocrlf you have to refresh the working tree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants