Mixed line endings not ignored when comparing hashes #247

rfbain · 2016-04-19T17:55:08Z

This is a follow up to issue: Ignore EOL changes when checking whether script has changed. There is still a problem when the original text has a mix of windows and unix line endings. An attempt is made to convert the new text to all windows line endings and all unix line endings, but neither one matches the original with mixed line endings. I'd like to see a command line switch that lets the user choose to set line endings to all windows or to all unix prior to computing the hash which gets saved to table ScriptsRun. Note, I'm only asking for the hash to be changed to enable future checks to work correctly. I would like the original version of the text to be saved in column text_of_script. If others would like the text to be updated as well, perhaps that could use different values for the switch.

ferventcoder · 2016-04-19T18:51:13Z

So I'm not fully following. If we are ignoring the \r\n / \n line endings for comparison, what does it matter what it is stored as?

ferventcoder · 2016-04-19T18:52:04Z

As in, I'm pretty sure once the ignore is in, line endings would no longer be a factor.

rfbain · 2016-04-19T19:10:58Z

Perhaps an example will help. Think of the following being included in a sql script.

-- this line was copied from a unix source.\n
-- this line was copied from a windows source.\r\n

Then someone looks at the file in a windows editor and saves it. Now it becomes.

-- this line was copied from a unix source.\r\n
-- this line was copied from a windows source.\r\n

Function have_same_hash_ignoring_platform in class DefaultDatabaseMigrator will show the file as having been changed and the script will either be run again, or trigger an error or warning when the script should be skipped.

ferventcoder · 2016-04-19T21:24:42Z

Again, I'm not sure it would be a factor, that's what ignoring the line endings is about. It doesn't matter if the line ending is windows or unix format, or even older Mac format with \r, the line endings are completely ignored with the ignore line endings code. If they are not, it's a bug in the code itself.

rfbain · 2016-04-20T17:31:15Z

The enclosed zip file has 4 batch files, and the logs produced by executing those files. Some sql scripts have lines that consistently end in CrLf. Some have lines that consistently end in Lf. And some have lines with a mix of line endings. You should be able to tell which is which from the filenames. Batch files CrLf_Lf and Lf_CrLf run as expected. This shows line endings are ignored when the line endings in individual files are consistent, even if they differ from one file to another. Batch files CrLf_Mixed and Mixed_Crlf show the bug when line endings differ in individual files.
rh_issue.zip

ferventcoder · 2016-04-20T17:32:59Z

Curious though, has the other feature actually been released so you can test this?

ferventcoder · 2016-04-20T17:39:07Z

Looking at #104 and the ReadMe, it looks like it was released in 0.8.6 - however looking at https://github.com/chucknorris/roundhouse/pull/104/files, it may have a possible bug. I would have expected it to just convert all line endings to one so they would be ignored.

rfbain · 2016-04-20T17:41:49Z

It comes down to this function, in class DefaultDatabaseMigrator in version 0.8.6.0 which I got from the site a few days ago. I have not looked to see if this has been updated in the unreleased versions.
private bool have_same_hash_ignoring_platform(string sql_to_run, string old_text_hash)
{
// check with unix and windows line endings
const string line_ending_windows = "\r\n";
const string line_ending_unix = "\n";
string new_text_hash = create_hash(sql_to_run.Replace(line_ending_windows, line_ending_unix));
bool hash_is_same = hashes_are_equal(new_text_hash, old_text_hash);

        if (!hash_is_same)
        {
            // try other way around
            new_text_hash = create_hash(sql_to_run.Replace(line_ending_unix, line_ending_windows));
            hash_is_same = hashes_are_equal(new_text_hash, old_text_hash);
        }

        return hash_is_same;
    }

create_hash(sql_to_run.Replace(line_ending_unix, line_ending_windows));

will produce a mix of \r\r\n and \r\n when sql_to_run has a mix of line endings, explaining why there are problems when the new file has inconsistent endings. If the original file had inconsistent endings, the hash will be based on those endings, so neither substitution will match.

ferventcoder · 2016-04-20T17:42:56Z

Right - so it's a bug in the code and the feature.

jftaylorMn · 2016-04-20T17:52:04Z

I think there are two related bugs.

The first is documented above. The code correctly tests if the EOL is unix or windows. However, it returns a false positive when there are both line types intermixed in the same file. The first time through, everything works fine because everything is new. After the first pass, mixed EOL files will consistently be reported as changed by have_same_hash_ignoring_platform even if they haven't been touched.

The second bug relates to the hash that is saved and subsequently passed to the above method. We have seen cases where mixed EOL files are repaired "automatically" - by just re-saving a file opened in an editor for example. Some evidence suggests that simply pulling the code from the repo using VS team explorer might actually trigger the "fix". The developer had no intention to make any change, and the SQL database wouldn't detect a change. In order to consistently ignore any EOL changes, the hash that is saved to the database needs to be the one for the EOL-agnostic version of the script - not what is in the current file. I am hoping that after this bug is fixed the -baseline switch could be used to "true up" the hash values in the database without actually executing the SQL code. I can see where this might be a controversial/breaking change so would entertain the introduction of a new switch to enforce this behavior.

rfbain · 2016-04-20T17:52:06Z

I'm new to this forum and not sure about the next steps. Now that the roundhouse team is aware of the issue, should I leave it in their hands? Or is there more I should do? My original post requested a switch to enable the line endings to be normalized prior to computing the hash to be saved into the database. My testing in the past day shows it would be helpful to edit the function listed above to first change crlf to lf and then change lf to crlf prior to computing the has for the sql_to_run variable.

rfbain · 2016-04-20T17:55:41Z

I tried not to bring other applications into the discussion, but yes, this issue was noticed in relation to using git with Visual Studio and performing merges. That's a bit much to introduce into a discussion like this one in a reproducible way.

ferventcoder · 2016-04-20T17:58:09Z

It goes to the backlog as a bug.

ferventcoder · 2016-04-20T17:59:10Z

Then it gets prioritized and fixed - someone from the community could also provide a PR to fix this behavior.

rfbain · 2016-04-20T20:16:53Z

I'm not ready to enter this as a pull request without real world testing, but I'll mention this solved all of the test cases I entered in my attachment above. It did not require any new switches. I changed function hash in class MD5CryptographicService as listed below. I did not change function have_same_hash_ignoring_platform. However, if this change gets accepted, the line ending changes and multiple hash comparisons in function have_same_hash_ignoring_platform become unnecessary.

    public string hash(string clear_text_of_what_to_hash)
    {
        // roundhouse version returned hash based on source line endings
        // custom returns hash based on all line endings being unix line endings

        // start of custom addition
        const string line_ending_windows = "\r\n";
        const string line_ending_unix = "\n";
        const string line_ending_mac = "\r";
        string normalized_text_of_what_to_hash = clear_text_of_what_to_hash
            .Replace(line_ending_windows, line_ending_unix).Replace(line_ending_mac, line_ending_unix);
        // end of custom addition

        // byte[] clear_text_bytes = Encoding.UTF8.GetBytes(clear_text_of_what_to_hash); // removed in custom code
        byte[] clear_text_bytes = Encoding.UTF8.GetBytes(normalized_text_of_what_to_hash); // replacement in custom code

        byte[] cypher_bytes = crypto_provider.ComputeHash(clear_text_bytes);
        return Convert.ToBase64String(cypher_bytes);
    }

ferventcoder · 2016-04-21T13:41:25Z

This is more along the lines of what I would have wanted for the earlier fix.

BiggerNoise · 2017-10-11T17:50:42Z

This should be addressed by my recent changes to hash computation (on master or the next release). Please reopen with details if what I have doesn't get it done for you.

ferventcoder added Enhancement 0 - _Triaging labels Apr 19, 2016

ferventcoder added Bug 0 - Backlog v0.8.7 Up For Grabs and removed 0 - _Triaging Enhancement labels Apr 20, 2016

rfbain mentioned this issue Apr 22, 2016

Normalizing line endings prior to computing hash. #248

Closed

BiggerNoise closed this as completed Oct 11, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mixed line endings not ignored when comparing hashes #247

Mixed line endings not ignored when comparing hashes #247

rfbain commented Apr 19, 2016

ferventcoder commented Apr 19, 2016 •

edited

Loading

ferventcoder commented Apr 19, 2016

rfbain commented Apr 19, 2016

ferventcoder commented Apr 19, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

jftaylorMn commented Apr 20, 2016

rfbain commented Apr 20, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 21, 2016

BiggerNoise commented Oct 11, 2017

Mixed line endings not ignored when comparing hashes #247

Mixed line endings not ignored when comparing hashes #247

Comments

rfbain commented Apr 19, 2016

ferventcoder commented Apr 19, 2016 • edited Loading

ferventcoder commented Apr 19, 2016

rfbain commented Apr 19, 2016

ferventcoder commented Apr 19, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

jftaylorMn commented Apr 20, 2016

rfbain commented Apr 20, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

ferventcoder commented Apr 20, 2016

rfbain commented Apr 20, 2016

ferventcoder commented Apr 21, 2016

BiggerNoise commented Oct 11, 2017

ferventcoder commented Apr 19, 2016 •

edited

Loading