Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements on the ACLs and bug fixing #320

Merged
merged 30 commits into from
Feb 21, 2022

Conversation

restanrm
Copy link
Contributor

  • read the CONTRIBUTING guidelines
  • raised a GitHub issue or discussed it on the projects chat beforehand
  • added unit tests
  • [] added integration tests
  • updated documentation if needed
  • updated CHANGELOG.md

This PR is a first implementation of what has been discussed on #311 It should be reviewed commit by commit since last commits are linting modifications and changes quite much some part of the code that I didn't touch.

All subject discussed in PR #311 are not implemented here. I think all those modifications should be splitted in multiple PRs.

If this PR is too big, some commits could be moved outside of this PR it's related but just fixes some bugs on the ACLs parsing behavior.

@kradalby
Copy link
Collaborator

I will try to review this (probably in batches/rounds) throughout the week, but I would like to have the 0.13.0 features (#306 and #316) in first and aim for 0.14.0 for this.

@kradalby
Copy link
Collaborator

Can you have a look at the conflicts :)? all the other stuff is in now, and @juanfont will have a look at the proposal.

@restanrm
Copy link
Contributor Author

Sorry I didn't saw your comment I'll look at it tomorrow. I'll need to work on integration tests also.

madjam002 and others added 12 commits February 17, 2022 09:27
This commit change the default behaviour and remove the notion of namespaces between the hosts. It allows all namespaces to be only filtered by the ACLs. This behavior is closer to tailsnet.
This call should be done quite at each modification of a server resources like RequestTags.
When a server changes it's tag we should rebuild the ACL rules.

When a server is added to headscale we also should update the ACLRules.
Rewrite some function to get rid of the dependency on Headscale object. This allows us
to write succinct test that are more easy to review and implement.

The improvements of the tests allowed to write the removal of the tagged hosts
from the namespace as specified here: https://tailscale.com/kb/1068/acl-tags/
This is a false positive on the way the function is built.
Small tests cases are all inside this functions, making it big.
@restanrm
Copy link
Contributor Author

I've rebased the modifications on latest main branch

Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some starters, have not started on the code yet

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Show resolved Hide resolved
utils.go Show resolved Hide resolved
// Aclfilter peers here. We are itering through machines in all namespaces and search through the computed aclRules
// for match between rule SrcIPs and DstPorts. If the rule is a match we allow the machine to be viewable.

// FIXME: On official control plane if a rule allow user A to talk to user B but NO rule allows user B to talk to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a github issue

Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking very promising!

I have left comments on the "input data" for tests for the ACLs, its quite hard to follow when using foo, bar, etc. Would be great if you could change that to descriptive, unique values and I will review that section again.

Otherwise, the expand* tests looks quite sensible and well covering, easy to extend!

There is a couple of other naming nits, but mostly good, thanks.

acls.go Outdated Show resolved Hide resolved
acls_test.go Outdated Show resolved Hide resolved
acls_test.go Outdated Show resolved Hide resolved
acls_test.go Outdated Show resolved Hide resolved
acls_test.go Outdated
// match properly the IP's of the related hosts. The owner is valid and the tag is also valid.
// the tag is matched in the Users section.
func (s *Suite) TestValidExpandTagOwnersInUsers(c *check.C) {
namespace, err := app.CreateNamespace("foo")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
namespace, err := app.CreateNamespace("foo")
namespace, err := app.CreateNamespace("validate-tagowner-expands-in-user")

Test values should be descriptive and independent, this makes it a lot easier to follow the tests as you read them without having to go up and down to check if foo was a hostname or bar was a tag, group, or user.

If they have description based on the test their a part of, e.g. tagValidateTagOwnerExpandsInUser for tag, it will also guarantee that the tag is independent and will not interfere with other parts of the tests (even if the test framework is suppose to sort this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed for more explicit names, username for namespaces, teams for groups and groups of servers for tags. It's better than before but not up to what you describe here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better thanks, when using the same values in adjacent tests, we don't make them independent of each other, but we can let this go for now and I can do a follow up PR and you have help me review it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With pleasure !

machine.go Outdated Show resolved Hide resolved
machine.go Outdated Show resolved Hide resolved
machine.go Outdated Show resolved Hide resolved
machine.go Outdated Show resolved Hide resolved
machine.go Outdated Show resolved Hide resolved
Using h.ListAllMachines also listed the current machine in the result. It's unnecessary (I don't know if it's harmful).

Breaking the check with the `matchSourceAndDestinationWithRule` broke the tests. We have a specificity with the '*' destination that isn't symetrical.
I need to think of a better way to do this. It too hard to read.
Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more nits and I think we can take this for a spin.

machine.go Outdated Show resolved Hide resolved
machine.go Outdated Show resolved Hide resolved
acls_test.go Show resolved Hide resolved
kradalby and others added 6 commits February 21, 2022 08:51
This should fix the performance issue with computation of `dst` variable. It's also easier to read now.
added a simple filter to remove the current node
The dependency to the `headscale` struct makes tests harder to do.

This change allow to easily add some tests for this quite sensible function.
@kradalby
Copy link
Collaborator

Could you run prettier lint?

And I see the integration test fail, but I can't really see why, and not sure if any of the changes should really affect it.

Could you have a dive?

Other than that I think I'm happy to get it in :)

@restanrm
Copy link
Contributor Author

Could you run prettier lint?

And I see the integration test fail, but I can't really see why, and not sure if any of the changes should really affect it.

Could you have a dive?

Other than that I think I'm happy to get it in :)

Linting is done. For the integrations tests, I see some mentions of ACL parsing in the logs and calls seems to go through getFilteredByACLPeers. I couldn't find where the ACL are read from. It can be a good explanation for the errors of the integrations tests. It's also possible that my code that check presence of ACLs is wrong. I'll try to dive into it tonight.

@kradalby
Copy link
Collaborator

It looks like there is nil pointer as a result to a call to the /map endpoint:

2022/02/21 17:17:03 [Recovery] 2022/02/21 - 17:17:03 panic recovered:
POST /machine/3efd0dd32dc939483bad4d7bf4a5ce2af31a1a0a294c7a4ece72a16918c06c17/map HTTP/1.1
Host: headscale:8080
Accept-Encoding: gzip
Content-Length: 641
User-Agent: Go-http-client/1.1


runtime error: invalid memory address or nil pointer dereference
/usr/local/go/src/runtime/panic.go:221 (0x44d686)
/usr/local/go/src/runtime/signal_unix.go:735 (0x44d656)
/go/src/headscale/acls.go:94 (0xcee90a)
/go/src/headscale/acls.go:76 (0xcee7c4)
/go/src/headscale/poll.go:98 (0xd0ce39)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:168 (0xb6ceb4)
/go/pkg/mod/github.com/zsais/[email protected]/middleware.go:364 (0xb6ce95)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:168 (0x946dc1)
/go/pkg/mod/github.com/gin-gonic/[email protected]/recovery.go:99 (0x946dac)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:168 (0x946026)
/go/pkg/mod/github.com/gin-gonic/[email protected]/logger.go:241 (0x946009)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:168 (0x945570)
/go/pkg/mod/github.com/gin-gonic/[email protected]/gin.go:555 (0x9451d8)
/go/pkg/mod/github.com/gin-gonic/[email protected]/gin.go:511 (0x944d11)
/usr/local/go/src/net/http/server.go:2879 (0x75457a)
/usr/local/go/src/net/http/server.go:1930 (0x74fc27)
/usr/local/go/src/runtime/asm_amd64.s:1581 (0x469240)

Ill see if I can come up with a suggestion before I have to go out again.

poll.go Outdated Show resolved Hide resolved
acls.go Show resolved Hide resolved
poll.go Outdated
Comment on lines 98 to 105
err = h.UpdateACLRules()
if err != nil {
log.Error().
Caller().
Str("func", "handleAuthKey").
Str("machine", machine.Name).
Err(err)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err = h.UpdateACLRules()
if err != nil {
log.Error().
Caller().
Str("func", "handleAuthKey").
Str("machine", machine.Name).
Err(err)
}
if h.aclPolicy != nil {
err = h.UpdateACLRules()
if err != nil {
log.Error().
Caller().
Str("func", "handleAuthKey").
Str("machine", machine.Name).
Err(err)
}
}

This should do it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But open to better solutions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opted for both adding error to the callee and avoiding error case on caller side. The callee still need to report and error since it can generate errors if no ACLPolicies are defined.

As for the placement of this piece of code, I'm also open to suggestion. I've put it here but don't know well enough the code to find better place. We need to update the ACL when the HostInfo of the machines are modified, and when hosts are registered.

@kradalby
Copy link
Collaborator

As a side note, we should probably have integration tests for ACL, but that can be done in the future, want to clean them up and remove all the shared stuff first

If aclPolicy is not defined, in updateAclPolicy, return an error.
@restanrm
Copy link
Contributor Author

As a side note, we should probably have integration tests for ACL, but that can be done in the future, want to clean them up and remove all the shared stuff first

Yes I wanted to do it but here, but the PR is big enough. We should create a issue for this subject as well.

@kradalby
Copy link
Collaborator

Good job, lets do it, I think I will release this as a beta straight away.

@kradalby
Copy link
Collaborator

Would you mind creating the issues mentioned in this PR @restanrm ?

@kradalby kradalby merged commit 69cdfbb into juanfont:main Feb 21, 2022
@restanrm
Copy link
Contributor Author

Would you mind creating the issues mentioned in this PR @restanrm ?

Yes, I'll create it now

@restanrm restanrm deleted the feat-improve-acls-usage branch February 22, 2022 08:24
@kradalby kradalby mentioned this pull request Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants