Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix relative path issues on Windows #96

Closed
wants to merge 5 commits into from
Closed

Fix relative path issues on Windows #96

wants to merge 5 commits into from

Conversation

cstavro
Copy link

@cstavro cstavro commented Jan 10, 2017

Fixes #89

Chris Stavropoulos added 2 commits January 10, 2017 12:43
This will standardize Windows paths and help prevent escaping issues in the HCL parser
Copy link
Member

@brikis98 brikis98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I'm not sure this is the best way to fix it and I left a comment explaining a better approach in the PR.

Also, could you copy/paste the output of running all the tests? We can't run tests in a CI job for PRs due to security concerns.

@@ -26,7 +26,7 @@ func TestPathRelativeToInclude(t *testing.T) {
"child",
},
{
&IncludeConfig{Path: "/root/.terragrunt"},
&IncludeConfig{Path: rootFolder + "/.terragrunt"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several other uses of /root in this file. Would you mind updating them the same way? That will help prevent these sorts of bugs in the future.

@@ -83,6 +84,7 @@ func GetPathRelativeTo(path string, basePath string) (string, error) {
return "", errors.WithStackTrace(err)
}

relPath = strings.Replace(relPath, `\`,"/", -1)
Copy link
Member

@brikis98 brikis98 Jan 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not sure this is the right place for the fix, nor quite the right fix. The bug, as explained here, is caused by the fact that Windows uses backslashes in its paths, but backslashes mean "escape" in HCL. So the fix is not to replace backslashes, but to escape them properly so they work in HCL.

Therefore, I think the proper fix is:

  1. Create a method called like cleanPathForHcl(path string) that escapes backslashes (i.e. turns them into a double backslash). Add a comment with the explanation above.
  2. Call that method at the end of the findInParentFolders and pathRelativeToInclude methods, since those methods are explicitly built to return valid HCL values.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the problem this causes is that all the tests are using hard coded strings with / as the path separator. So this is going to require either some sort of a per platform mapping for test values (seems excessive), or a conversion from \ to / for the purposes of the test anyway (seems less than ideal).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, pathRelativeToInclude can actually call findInParentFolders indirectly through ResolveTerragruntConfigString. So simple escaping is actually double escaping in many scenarios.

Copy link
Member

@brikis98 brikis98 Jan 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be possible to add a helper to test_constants_unix.go like this:

func cleanPath(path string) string {
  return strings.Replace(path, "\\", "/", -1)
}

And in test_constants_windows:

func cleanPath(path string) string {
  return strings.Replace(path, "/", "\\", -1)
}

(Perhaps a better name would be test_helpers_unix.go and test_helpers_windows.go.)

You can then wrap all the paths in the tests with a call to cleanPath and it should take care of the cross-platform issues...

@cstavro
Copy link
Author

cstavro commented Jan 10, 2017

I'll go through this in more detail.
Running all the tests is a bit of a gotcha though. Since you guys don't run your tests on Windows at all, you may not realize that there are in fact a large number of failing tests!
I don't think it's a good idea to try and fix them all in this PR.

@brikis98
Copy link
Member

Since you guys don't run your tests on Windows at all, you may not realize that there are in fact a large number of failing tests!

Sadly, that's true. Do you know of any free CI service similar to CircleCI that works on Windows?

I don't think it's a good idea to try and fix them all in this PR.

Agreed. We can file separate bugs for unrelated test failures.

@cstavro
Copy link
Author

cstavro commented Jan 10, 2017

I use CircleCI for my own small projects which tend to be Node based, so no. But a quick search suggests AppVeyor may fit the bill for you. https://www.appveyor.com/

@brikis98
Copy link
Member

Ah, appveyor looks promising. I filed #97 to track this.

…nd path_relative_to_include interpolation functions
return cleanPathForHcl( path ), err
case "path_relative_to_include":
path, err := pathRelativeToInclude(include, terragruntOptions)
return cleanPathForHcl( path ), err
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this like this to make sure we don't double escape as can happen when path_relative_to_include calls find_in_parent_folders

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. That seems like a good approach.

@cstavro
Copy link
Author

cstavro commented Jan 10, 2017

I'm working through trying to clean up some of the tests around this as best I can find.

There is still some double escaping going on somewhere.

=== RUN TestResolveTerragruntInterpolation
--- FAIL: TestResolveTerragruntInterpolation (0.00s)
Error Trace: config_helpers_test.go:202
Error: Not equal:
expected: "child/sub-child"
received: "child\\sub-child"
Messages: For string '${path_relative_to_include()}' include &{${find_in_parent_folders()}} and options {../test/fixture-parent-folders/terragrunt-in-root/child/sub-child/.terragrunt true [] }
Error Trace: config_helpers_test.go:202
Error: Not equal:
expected: "../../.terragrunt"
received: "..\\..\\.terragrunt"
Messages: For string '${find_in_parent_folders()}' include and options {../test/fixture-parent-folders/terragrunt-in-root/child/sub-child/.terragrunt true [] }
FAIL

There is also some weirdness going on with the locking state file id generation. It looks like it's being automatically generated based on the path to the .terragrunt file but I haven't been able to track that down exactly yet.

=== RUN TestParseTerragruntConfigInclude
--- FAIL: TestParseTerragruntConfigInclude (0.00s)
Error Trace: config_test.go:278
Error: Not equal:
expected: "child/sub-child/sub-sub-child"
received: "child\sub-child\sub-sub-child"
Error Trace: config_test.go:285
Error: Not equal:
expected: "child/sub-child/sub-sub-child/terraform.tfstate"
received: "child\sub-child\sub-sub-child/terraform.tfstate"
FAIL

I will pick this up again tomorrow, unless someone else beats me to it!

@brikis98
Copy link
Member

There is still some double escaping going on somewhere.

Hm, the .terragrunt processing does multiple passes, so my best guess is that the cleanPathForHcl function is being run multiple times, and thereby, escaping things multiple times. Perhaps that function should only escape single slashes, while leaving double slashes unchanged?

It looks like it's being automatically generated based on the path to the .terragrunt file but I haven't been able to track that down exactly yet.

Yes, that's the purpose of the path_relative_to_include helper.

@cstavro
Copy link
Author

cstavro commented Jan 10, 2017

Yes, that's the purpose of the path_relative_to_include helper.

This is probably going to need to be made platform agnostic so that we ensure different platforms are using the same generated state file id. Worst case, someone runs an apply on Linux and an apply on Windows in parallel and they clobber one another because the dynamically generated state file id doesn't match cross-platform.

Does that fit within the spirit of this PR? or is that a separate issue?

@cstavro
Copy link
Author

cstavro commented Jan 10, 2017

Hm, the .terragrunt processing does multiple passes, so my best guess is that the cleanPathForHcl function is being run multiple times, and thereby, escaping things multiple times. Perhaps that function should only escape single slashes, while leaving double slashes unchanged?

So I was thinking something similar, I'm just concerned that this will add an edge case where someone does intentionally add double backslashes. Perhaps in a UNC path or something?
I feel like if we're okay with that, we should probably have just stuck with my original solution to convert backslashes to slashes and be done with this. 😉

@brikis98
Copy link
Member

This is probably going to need to be made platform agnostic so that we ensure different platforms are using the same generated state file id. Worst case, someone runs an apply on Linux and an apply on Windows in parallel and they clobber one another because the dynamically generated state file id doesn't match cross-platform.

Good point!

Does that fit within the spirit of this PR? or is that a separate issue?

Totally up to you. I'm happy to have it as part of this PR if you have time for it. If not, we can file a separate issue for it.

So I was thinking something similar, I'm just concerned that this will add an edge case where someone does intentionally add double backslashes. Perhaps in a UNC path or something?

The only place we are doing this escaping is in the Terragrunt helpers. Neither one, as far as I can tell, should be subject to anything like that.

This should fixe all relevant config path tests on Windows
@cstavro
Copy link
Author

cstavro commented Jan 11, 2017

Test results
test_output.txt

@cstavro
Copy link
Author

cstavro commented Jan 11, 2017

This should fix all the config path issues on Windows.
Might want to double check the test helper functions on the Linux side as I wasn't able to test those myself.
There are still some problems that need to be addressed separately.

  1. The state file id generation based on the relative path helpers is platform specific and that's a problem.
  2. I highly recommend moving the tests to use local dynamodb rather than requiring an actual AWS account. Tests would run faster and free....er.
  3. The spin-up/teardown commands are going to need a similar bit of love to make them Windows compatible as well. (which I will probably look at in the near future as I'd like to be able to use that functionality)

@brikis98
Copy link
Member

test_output.txt

I'm still seeing quite a few test failures there. Is that the most recent run?

The state file id generation based on the relative path helpers is platform specific and that's a problem

I filed #99 to track this.

I highly recommend moving the tests to use local dynamodb rather than requiring an actual AWS account. Tests would run faster and free....er.

That's not a bad idea, but the point of the DynamoDB usage is to check that the code is properly enforcing locking in the face of concurrent users. I worry that the local version won't catch the same bugs since it's inherently not a distributed system (distributed lock).

The spin-up/teardown commands are going to need a similar bit of love to make them Windows compatible as well.

What issues are you having with those?

@cstavro
Copy link
Author

cstavro commented Jan 12, 2017

Yes, lots of errors on Windows. There were even more before I got started. ;)
Let me try to summarize them for you:

TestParseTerragruntConfigInclude
State file id mismatch you're tracking in #99

TestParseTerragruntConfigIncludeOverrideRemote
Remote state path issue. S3 remote state should probably use forward slashes

TestParseTerragruntConfigIncludeWithFindInParentFolders
State file id mismatch

TestFindStackInSubfolders
This is a Windows relative path issue with the finding child stacks with the spin-up/teardown commands. Probably need to create an issue to track this as well.

TestTerragruntSpinUpAndTearDown
Looks like a pathing issue with the integration test, not entirely sure what's up

The concurrency/locking errors I don't know what's going on but I'm relatively confident they're not related to these changes. I'm pretty sure that if you run this on a Linux box, the errors would disappear.

I'm also suggesting you reconsider my original fix which was to convert backslashes to forward slashes. It was much simpler. :) HCL happily parses both, so converting to the slash seems like a trivial solution that would integrate much more nicely. I agree I may not have put it in quite the right place originally, but I'm a bit more familiar with things now!

That's not a bad idea, but the point of the DynamoDB usage is to check that the code is properly enforcing locking in the face of concurrent users. I worry that the local version won't catch the same bugs since it's inherently not a distributed system (distributed lock).

So I haven't dug into these tests at all, but I'm going to suggest that if you're relying on random concurrency errors via the remote dynamodb endpoint to validate your concurrency code, that's probably a bad idea. You should be able to simulate the intended behaviours against the local dynamodb codebase. Perhaps another issue you can track to validate that the concurrency tests are actually behaving as expected?

I'm happy to continue looking into some of these issues as Terragrunt is definitely moving in the direction I want to see it go!

@brikis98
Copy link
Member

TestParseTerragruntConfigIncludeOverrideRemote
Remote state path issue. S3 remote state should probably use forward slashes

I think this one is simpler. The check that's failing is this one:

assert.Equal(t, "child/sub-child/sub-sub-child", lock.StateFileId)

Those forward slashes should be adjusted accordingly depending on the OS.

TestParseTerragruntConfigIncludeWithFindInParentFolders
State file id mismatch

Exact same issue.

TestFindStackInSubfolders
This is a Windows relative path issue with the finding child stacks with the spin-up/teardown commands. Probably need to create an issue to track this as well.

Hmm, the error I'm seeing looks like another test case hard-coding forward slashes. That test case has the following definition:

filePaths := []string{
  "/stage/data-stores/redis/.terragrunt",
  "/stage/data-stores/postgres/.terragrunt",
  "/stage/ecs-cluster/.terragrunt",
  "/stage/kms-master-key/.terragrunt",
  "/stage/vpc/.terragrunt",
}

Further down, it then expects to find each of these:

for _, filePath := range filePaths {
  filePathFound := util.ListContainsElement(modulePaths, filePath)
  assert.True(t, filePathFound, "The filePath %s was not found by Terragrunt.\n", filePath)
}

If the modulePaths variable contains forward slashes, the above test will fail. Tweaking the filePaths variable with forward slashes or back slashes according to the OS may fix this test case.

TestTerragruntSpinUpAndTearDown
Looks like a pathing issue with the integration test, not entirely sure what's up

My best guess is that this is caused by the same issue as #99. The state for one of the modules is stored in S3 with back slashes (since its Windows), but the Terraform code is using a terraform_remote_state resource that's looking up the path with forward slashes. I think once #99 is fixed by standardizing on backslashes, this should work fine.

Yes, lots of errors on Windows. There were even more before I got started. ;)

Thank you again for taking a crack at this!

I'm also suggesting you reconsider my original fix which was to convert backslashes to forward slashes. It was much simpler. :) HCL happily parses both, so converting to the slash seems like a trivial solution that would integrate much more nicely.

The reason I don't like that solution is that we use those paths to load files from the file system. If we get the slashes wrong, then things may not load correctly. Unless, of course, Windows doesn't care about which way the slash goes and will work with either type of path (e.g. C:/foo/bar and C:\foo\bar are equivalent)?

So I haven't dug into these tests at all, but I'm going to suggest that if you're relying on random concurrency errors via the remote dynamodb endpoint to validate your concurrency code, that's probably a bad idea.

Terragrunt uses DynamoDB as a distributed lock. We have to be sure that this lock actually locks, so that even if two (or twenty) team members run terragrunt apply at the exact same time, only one of them gets the lock at a time. The test cases that hit DynamoDB try to acquire the lock with tons of concurrency as a reasonable sanity check that this works. This isn't exactly a formal proof that the locking works, but it gives us reasonable confidence that we are using DynamoDB's strongly consistent reads and conditional update functionality correctly.

I worry that if we switch to a local version, then those tests won't really be verifying that functionality, as the local version may have different locking/concurrency semantics, even if they guarantee it exposes the same API.

@brikis98
Copy link
Member

I'm going to close this one, as #100 fixes all these issues.

@brikis98 brikis98 closed this Jan 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants