This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Deployment to custom vnet - Windows, Kubernetes #3705

Closed
GobinathPandurangan opened this issue Aug 20, 2018 · 6 comments

@GobinathPandurangan

Is this a request for help?:

Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue

What version of acs-engine?:

Version: v0.20.9
GitCommit: e0b6d2a
GitTreeState: clean

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes

What happened:

Using the following json.

{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "orchestratorRelease": "1.10",
      "kubernetesConfig": {
        "kubeletConfig": {
          "--max-pods": "10"
        }
      }
    },
    "masterProfile": {
      "count": 1,
      "dnsPrefix": "acs-dns",
      "vmSize": "Standard_D2_v2",
      "vnetSubnetId": "/subscriptions/subid/resourceGroups/vnetname/providers/Microsoft.Network/virtualNetworks/vnetname/subnets/default",
      "firstConsecutiveStaticIP": "10.8.170.170"
    },
    "agentPoolProfiles": [
      {
        "name": "windowspool",
        "count": 2,
        "vmSize": "Standard_D2_v2",
        "availabilityProfile": "AvailabilitySet",
        "osType": "Windows",
        "vnetSubnetId": "/subscriptions/subid/resourceGroups/vnetname/providers/Microsoft.Network/virtualNetworks/vnetname/subnets/default"
      }
    ],
    "windowsProfile": {
      "adminUsername": "server_admin",
      "adminPassword": "pwd"
    },
    "linuxProfile": {
      "adminUsername": "server_admin",
      "ssh": {
        "publicKeys": [
          {
            "keyData": "key"
          }
        ]
      }
    },
    "servicePrincipalProfile": {
      "clientId": "id",
      "secret": "secret"
    }
  }
}
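A quick sanity check of the vnetSubnetId fields can rule out typos before deploying. The following is a minimal sketch, not part of acs-engine: `check_subnets` is a hypothetical helper, and it assumes that every profile is meant to reference a subnet in the same virtual network. It relies only on the standard ARM subnet resource ID layout that the JSON above already uses.

```python
import re

# Standard ARM resource ID layout for a subnet.
SUBNET_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.Network/virtualNetworks/(?P<vnet>[^/]+)"
    r"/subnets/(?P<subnet>[^/]+)$",
    re.IGNORECASE,
)


def check_subnets(api_model):
    """Return a list of problems found in the vnetSubnetId fields.

    Hypothetical helper: assumes all profiles should share one vnet.
    """
    props = api_model["properties"]
    profiles = [("masterProfile", props["masterProfile"])]
    profiles += [(p["name"], p) for p in props.get("agentPoolProfiles", [])]

    problems = []
    vnets = set()
    for name, profile in profiles:
        subnet_id = profile.get("vnetSubnetId")
        if not subnet_id:
            problems.append(f"{name}: vnetSubnetId is missing")
            continue
        match = SUBNET_ID.match(subnet_id)
        if match is None:
            problems.append(f"{name}: vnetSubnetId is not a subnet resource ID")
            continue
        # Resource IDs are case-insensitive, so compare lower-cased parts.
        vnets.add((
            match["subscription"].lower(),
            match["resource_group"].lower(),
            match["vnet"].lower(),
        ))
    if len(vnets) > 1:
        problems.append("profiles reference more than one virtual network")
    return problems
```

Loading the file with `json.load` and passing the result to `check_subnets` returns an empty list when every profile points at a well-formed subnet ID in one vnet. It does not check that the subnet exists or that firstConsecutiveStaticIP falls inside it.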

In the generated output, I added the subnet variable because of #1767.

While deploying, I got the following error:

msrest.http_logger : b'{"status":"Failed","error":{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details.","details":[{"code":"Conflict"
,"message":"{\r\n \"status\": \"Failed\",\r\n \"error\": {\r\n \"code\": \"ResourceDeploymentFailure\",\r\n \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n \"details\": [\r\n
{\r\n \"code\": \"VMExtensionProvisioningError\",\r\n \"message\": \"VM has reported a failure when processing extension 'cse-master-0'. Error message: \\\"Enable failed: failed to execute command: command terminated with exit status=50\\n
[stdout]\\n\\n[stderr]\\n\\\".\"\r\n }\r\n ]\r\n }\r\n}"}]}}'

Deployment failed. Correlation ID: d81f9272-a933-4118-9f3b-b5ff38bf0587. {
"status": "Failed",
"error": {
"code": "ResourceDeploymentFailure",
"message": "The resource operation completed with terminal provisioning state 'Failed'.",
"details": [
{
"code": "VMExtensionProvisioningError",
"message": "VM has reported a failure when processing extension 'cse-master-0'. Error message: "Enable failed: failed to execute command: command terminated with exit status=50\n[stdout]\n\n[stderr]\n"."
}
]
}
}
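The first error above wraps a second JSON document inside its "message" string, which makes the real failure hard to spot. As a rough sketch (the function name `innermost_errors` is made up for illustration, and the nesting it handles is assumed from the output shown here rather than from any ARM specification), the innermost code and message can be pulled out like this:

```python
import json


def innermost_errors(error):
    """Yield (code, message) for the leaf entries of a nested error dict."""
    message = error.get("message", "")
    # Some messages are themselves JSON documents with their own "error" key.
    try:
        nested = json.loads(message)
    except (TypeError, ValueError):
        nested = None
    if isinstance(nested, dict) and isinstance(nested.get("error"), dict):
        yield from innermost_errors(nested["error"])
        return

    # Otherwise descend into "details" if there are any.
    details = error.get("details") or []
    if details:
        for detail in details:
            yield from innermost_errors(detail)
        return

    # No nested JSON and no details: this is a leaf.
    yield error.get("code"), message
```

Applied to the "error" object of a response shaped like the one above, this yields the VMExtensionProvisioningError entry for 'cse-master-0' and skips the generic DeploymentFailed and Conflict wrappers.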

The same error appears on the master in /var/log/azure/custom-script/handler.log:

  • /var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/bin/custom-script-extension install
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 event=start
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 status="not reported for operation (by design)"
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 event="migrate to mrseq" error="Can't find out seqnum from /var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status, not enough files."
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 event="created data dir" path=/var/lib/waagent/custom-script
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 event=installed
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 status="not reported for operation (by design)"
    time=2018-08-20T08:36:13Z version=v2.0.6/git@1008306-clean operation=install seq=0 event=end
    Writing a placeholder status file indicating progress before forking: /var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status/0.status
  • nohup /var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/bin/custom-script-extension enable
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event=start
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event=pre-check
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="comparing seqnum" path=mrseq
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="seqnum saved" path=mrseq
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="reading configuration"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="read configuration"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="validating json schema"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="json schema valid"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="parsing configuration json"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="parsed configuration json"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="validating configuration logically"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="validated configuration"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="creating output directory" path=/var/lib/waagent/custom-script/download/0
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="created output directory"
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 files=0
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="executing command" output=/var/lib/waagent/custom-script/download/0
    time=2018-08-20T08:36:14Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="executing protected commandToExecute" output=/var/lib/waagent/custom-script/download/0
    time=2018-08-20T08:38:52Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="failed to execute command" error="command terminated with exit status=50" output=/var/lib/waagent/custom-script/download/0
    time=2018-08-20T08:38:52Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="enable failed"
    time=2018-08-20T08:38:52Z version=v2.0.6/git@1008306-clean operation=enable seq=0 event="failed to handle" error="failed to execute command: command terminated with exit status=50"
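The handler.log lines are key=value pairs, so the time the command ran before failing can be computed rather than read off by eye. This is a small sketch under the assumption that the format is exactly as shown above. `parse_line` and `command_duration` are names invented for this example.

```python
import shlex
from datetime import datetime


def parse_line(line):
    """Split one key=value log line into a dict; other lines give {}."""
    fields = {}
    try:
        tokens = shlex.split(line)
    except ValueError:  # unbalanced quotes, not a key=value line
        return fields
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def command_duration(lines):
    """Seconds between the 'executing command' and
    'failed to execute command' events, or None if either is missing."""
    start = end = None
    for line in lines:
        fields = parse_line(line)
        if "time" not in fields:
            continue
        stamp = datetime.strptime(fields["time"], "%Y-%m-%dT%H:%M:%SZ")
        if fields.get("event") == "executing command":
            start = stamp
        elif fields.get("event") == "failed to execute command":
            end = stamp
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

On the log above, the two events are stamped 08:36:14 and 08:38:52, so the command ran for 158 seconds before exiting with status 50.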

Looking at https://github.com/Azure/acs-engine/blob/master/parts/k8s/kubernetescustomscript.sh, testOutboundConnection() appears to be the function that exits with status 50.

  • From inside the master, "nc -v www.google.com 443" succeeded.

  1. Docker is not even installed on the master.
  2. The images on the Windows nodes are deployed properly, and I can see the kubletwin/pause image.

What you expected to happen:

The acs-engine Kubernetes cluster to be up and running.

How to reproduce it (as minimally and precisely as possible):

Deploy with acs-engine using the JSON file provided above.

Anything else we need to know:

@CecileRobertMichon
Contributor

Hi @GobinathPandurangan, please note that custom vnet with Windows support is not complete at this time and there are a few known issues (#1767, https://github.com/Azure/acs-engine/issues/3280).

That being said, the error that you are seeing is happening on the Linux master, so it is not related. This error is likely one of two things: 1) a flake, meaning that nc -v www.google.com 443 and nc -v www.1688.com 443 both failed 20 times in a row during deployment, in which case you should try deploying again (the fact that you are able to run "nc -v www.google.com 443" successfully from the master makes me think this is the most likely); or 2) VMs inside your vnet are unable to reach those two addresses, in which case the deployment is expected to fail, as outbound internet access is one of the requirements for deploying an acs-engine k8s cluster.
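The check described here (two hosts, up to 20 rounds, failure only if both are unreachable every time) can be reproduced by hand from the master to tell a flake from a real block. What follows is an illustrative sketch, not the code in kubernetescustomscript.sh: the function names, the 3-second timeout, and the 1-second delay between rounds are all assumptions.

```python
import socket
import time


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outbound_ok(targets, attempts=20, delay=1.0,
                probe=can_connect, sleep=time.sleep):
    """Retry until any target is reachable.

    Returns False only if every target failed in all `attempts` rounds.
    `probe` and `sleep` are injectable so the logic can be tested offline.
    """
    for attempt in range(attempts):
        if any(probe(host, port) for host, port in targets):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `outbound_ok([("www.google.com", 443), ("www.1688.com", 443)])` on the master and exiting with status 50 when it returns False would mimic the behaviour described in this comment. A True result after a failed deployment points toward connectivity that was not yet available while the extension ran.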

@GobinathPandurangan
Author

Thanks @CecileRobertMichon

I tried it 5 times over the past 2 days, changing the configs a bit each time, but every time I ran into the same issue. After the failure, I can get into the master and run both nc -v www.google.com 443 and nc -v www.1688.com 443 successfully.

I am not sure whether this line from handler.log
event="migrate to mrseq" error="Can't find out seqnum from /var/lib/waagent/Microsoft.Azure.Extensions.CustomScript-2.0.6/status, not enough files."

is a problem. I could not find many details about it.

@CecileRobertMichon
Contributor

@GobinathPandurangan
Author

Thanks @CecileRobertMichon, you are right: I see "nc -v 8.8.8.8 53" and "nc -v 8.8.4.4 53", which are failing inside. But I don't see this problem when I don't use the custom VNet.

@CecileRobertMichon
Contributor

Can you check that your vnet doesn't have any NSG or firewall preventing outbound internet access?

@GobinathPandurangan
Author

I don't see anything preventing it. Anyway, I'm abandoning the custom VNet for now, as I have an alternative.
Thanks for your time. I will revisit it in 2-3 weeks.
