
Implement Azure bastion host #165

Closed
justaugustus opened this issue Mar 30, 2019 · 62 comments

@justaugustus
Member

justaugustus commented Mar 30, 2019

/kind feature

Describe the solution you'd like
Adding a bastion node will allow secure access to nodes, without having to rely on NAT rules on the public load balancer, laying the groundwork for non-public capz scenarios.

Use the capa bastion as a reference point: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/master/pkg/cloud/aws/services/ec2/bastion.go

https://docs.microsoft.com/en-us/azure/bastion/bastion-overview

Related: #104

/priority important-soon
/milestone v1alpha

@k8s-ci-robot
Contributor

@justaugustus: The provided milestone is not valid for this repository. Milestones in this repository: [baseline, mvp, next, v1alpha1]

Use /milestone clear to clear the milestone.

In response to this:

/kind feature

Describe the solution you'd like
Adding a bastion node will allow secure access to nodes, without having to rely on NAT rules on the public load balancer, laying the groundwork for non-public capz scenarios.

Related: #104

/priority important-soon
/milestone v1alpha

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 30, 2019
@justaugustus
Member Author

/priority important-soon
/milestone v1alpha1

@k8s-ci-robot k8s-ci-robot added this to the v1alpha1 milestone Mar 30, 2019
@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Mar 30, 2019
@justaugustus
Member Author

/help

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 30, 2019
@tahsinrahman
Contributor

I can work on this. What should it look like?

What I can think of now is (a rough sketch follows the list):

  • A bastion host with a public IP
  • A new subnet for the bastion host, with a security group rule that allows all traffic on port 22
  • Remove the SSH rule from the control plane security group
  • Remove the SSH NAT rule from the load balancer
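
To make that first pass concrete, here is a rough sketch of how the dedicated bastion subnet and its security rule might be expressed in the AzureCluster network spec. The field names and values are illustrative assumptions, not the exact CAPZ API of the time:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AzureCluster
metadata:
  name: my-cluster                     # hypothetical cluster name
  namespace: default
spec:
  networkSpec:
    subnets:
      - name: bastion-subnet           # dedicated subnet for the bastion host
        role: bastion                  # illustrative role value
        cidrBlocks:
          - 10.0.255.0/27
        securityGroup:
          name: bastion-nsg            # bastion gets its own NSG
          securityRules:
            - name: allow-ssh
              description: Allow SSH to the bastion host
              protocol: Tcp
              direction: Inbound
              priority: 100
              source: "*"              # per the proposal, all traffic on port 22
              sourcePorts: "*"
              destination: "*"
              destinationPorts: "22"
...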

@justaugustus
Member Author

@tahsinrahman -- Appreciate the help! What you've described looks like a great first pass. Additionally, let's make sure that the bastion uses the same OS as the Cluster API machines (Ubuntu 18.04).

Tag me on the PR when you're ready for someone to review. :)

@justaugustus
Member Author

/remove-help

@k8s-ci-robot k8s-ci-robot removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Apr 3, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 2, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 1, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@justaugustus
Member Author

/reopen
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot reopened this Oct 2, 2019
@k8s-ci-robot
Contributor

@justaugustus: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 2, 2019
@justaugustus justaugustus modified the milestones: v0.2, v0.3 Oct 2, 2019
@whites11
Contributor

/close

Fixed via #1300

Wait. This issue discussed three strategies for implementing bastion access to workload clusters, and I implemented only one in #1300. I'm working on the second one; should we keep tracking it in this issue?

@devigned
Contributor

/reopen

@whites11, you are right. I was a little too quick to close.

@k8s-ci-robot
Contributor

@devigned: Reopened this issue.

In response to this:

/reopen

@whites11, you are right. I was a little too quick to close.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Jun 17, 2021
@whites11
Contributor

I am working on the second strategy, the one that uses virtual machines as bastion hosts to access nodes.
The approved CRD structure is as follows:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AzureCluster
metadata:
  name: d3m3v
  namespace: default
spec:
  bastionSpec:
    azureMachine:
      machineRef: *ObjectReference
      publicIP:
        name: "azurebastion1-publicIP"
...

but now that I am actually working on it, I believe my proposal was wrong and incomplete.
I will try to explain the conclusions I reached while implementing this, along with an updated CR example and some details here and there, in order to collect feedback and ideas.

GOAL: have one (or more) virtual machine(s) acting as a bastion to SSH/RDP into Kubernetes nodes, rather than using one of the masters or the AzureBastion feature.

To adhere to the KISS principle, my idea is to reuse the existing reconciliation controllers.
In practice that means creating three CRs:

  1. A Secret holding the cloud-init config file to bootstrap the bastion VM with the needed config (must be provided by the user and can't be defaulted in any way; a sketch follows this list)
  2. An AzureMachineTemplate for each bastion that describes what the VM should look like (size, OS, disks, etc.); can be defaulted
  3. A MachineDeployment to bind the previous two CRs together and make the CAPI/CAPZ controllers reconcile and create the VMs; can be defaulted
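
For illustration, a minimal sketch of the bootstrap Secret from point 1, assuming the usual CAPI bootstrap-data layout with a cloud-config payload under a value key (the name and exact keys are assumptions, not the final design):

apiVersion: v1
kind: Secret
metadata:
  name: d3m3v-bastion-bootstrap        # hypothetical name
  namespace: default
type: Opaque
stringData:
  format: cloud-config                 # assumed format marker, as used by CAPI bootstrap secrets
  value: |
    #cloud-config
    packages:
      - fail2ban                       # example hardening package
    runcmd:
      - systemctl enable --now fail2ban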

From a UX point of view, we only need to provide a way to define the Secret (point 1 above) and optionally the AzureMachineTemplate.
The CR could be something like:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AzureCluster
metadata:
  name: d3m3v
  namespace: default
spec:
  bastionSpec:
    azureMachine:
      bootstrapDataSecretName: "<secret name>"
      azureMachineTemplates: []
      replicas: 2
...

With the above information, the AzureCluster controller can create two AzureMachineTemplates and two MachineDeployments, and then the creation of the VMs should happen automagically.

Why a slice of AzureMachineTemplates, you might ask? Because having multiple bastion hosts only makes sense if those VMs are placed in different availability zones. Since the failureDomain field is a scalar in the AzureMachineTemplate CR, I need different templates for different availability zones.
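
As a rough illustration, the defaulted objects for two availability zones might look like the following: two AzureMachineTemplates that differ only in failureDomain, each referenced by its own MachineDeployment (the names, the vmSize default, and the exact MachineDeployment wiring are assumptions for the sketch):

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AzureMachineTemplate
metadata:
  name: d3m3v-bastion-az1
  namespace: default
spec:
  template:
    spec:
      vmSize: Standard_B2s             # assumed default size
      failureDomain: "1"               # scalar field, hence one template per zone
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AzureMachineTemplate
metadata:
  name: d3m3v-bastion-az2
  namespace: default
spec:
  template:
    spec:
      vmSize: Standard_B2s
      failureDomain: "2"
---
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: d3m3v-bastion-az1
  namespace: default
spec:
  clusterName: d3m3v
  replicas: 1
  selector:
    matchLabels:
      bastion: d3m3v-az1
  template:
    metadata:
      labels:
        bastion: d3m3v-az1
    spec:
      clusterName: d3m3v
      bootstrap:
        dataSecretName: "<secret name>"     # the user-provided Secret from point 1
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AzureMachineTemplate
        name: d3m3v-bastion-az1

A second MachineDeployment would reference d3m3v-bastion-az2 in the same way.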

WDYT in general about this idea?

@devigned
Contributor

+1 on this thus far. However, I think there are a couple of things missing that I would like to see.

  1. What are the security rules associated with ingress for the bastion public IP? How will someone set these ingress rules? Is that public IP going to be in its own network security group or the cluster's NSG? I would like to better understand the Azure network infrastructure resources and how access will be constrained.
  2. I think it would be super useful to be able to provide SSH public keys for the deployment, so users can easily provide SSH access to the bastion.

Might be a good idea to create a proposal for this work.

@whites11
Contributor

whites11 commented Jul 1, 2021

Thanks a lot for your input @devigned

+1 on this thus far. However, I think there are a couple things missing that I would like to see.

  1. What are the security rules associated with ingress for the bastion public IP? How will someone set these ingress rules? Is that public IP going to be in its own network security group or the cluster's NSG? I would like to better understand the Azure network infrastructure resources and how access will be constrained.

Absolutely agree. Network configurability is key and I will think about it in the CAEP.

  2. I think it would be super useful to be able to provide SSH public keys for the deployment, so users can easily provide SSH access to the bastion.

I am not sure I agree with this. On one hand, it would make things easier for what we could consider the most common use case, but on the other hand it would make things harder for more complex ones. But let's discuss this in the CAEP, I guess.
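
For example, the most common use case could already be covered purely through the bootstrap Secret, without adding a dedicated field to the API, with a cloud-init fragment along these lines (user name and key are placeholders):

#cloud-config
users:
  - name: bastion                      # placeholder user name
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 AAAAC3... operator@example.com   # placeholder public key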

Might be a good idea to create a proposal for this work.

Will begin working on this ASAP.
BTW, I noticed my idea so far is provider-independent, and that's awesome IMHO.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 29, 2021
@whites11
Contributor

I am not working on this any more, so I will unassign myself.

@whites11 whites11 removed their assignment Sep 29, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 29, 2021
@devigned
Contributor

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 29, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2022
@shysank
Contributor

shysank commented Jan 27, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2022
@jackfrancis
Contributor

@CecileRobertMichon what is the remaining work to be done for this issue?

@CecileRobertMichon
Contributor

@jackfrancis good question. I believe the original intent was to support the option of having a VM as the bastion, in addition to the AzureBastion service. Given that the AzureBastion service is more mature now and has recently added CLI support as well, I don't know of any use case that would need a standalone VM as a bastion over the AzureBastion service, in which case we can close this issue since we already support AzureBastion.
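
For reference, enabling the already-supported AzureBastion service is just a matter of setting the bastion spec on the AzureCluster, roughly like this (the API version and field shape may differ slightly between releases):

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureCluster
metadata:
  name: my-cluster                     # placeholder name
  namespace: default
spec:
  bastionSpec:
    azureBastion: {}                   # provisions the managed Azure Bastion service with defaults
...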

@jackfrancis
Contributor

Will close for now; anyone tracking this who disagrees, please re-open with a statement of the desired net-new work!

@CecileRobertMichon CecileRobertMichon removed this from the next milestone May 4, 2023