Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs/dev/libvirt: Add troubleshooting docs for libvirt console issue. #1371

Merged

Conversation

praveenkumar
Copy link
Contributor

@openshift-ci-robot
Copy link
Contributor

Hi @praveenkumar. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 6, 2019
@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 6, 2019
@@ -286,6 +286,32 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### console doesn't come up
In case of libvirt there is no wildcard dns resolution and console is depend on the route which is created by auth operator.
To make it work we need to use the multi-stage installation and edit the `domain` for ingress.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: No newline for starting new sentences please. Either start a new paragraph or keep the sentence on the same line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeenix That what did for minishift also, making every sentence to a new line make it easier to read and navigate. I don't see any issue it here also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's true, let's be consistent and start every sentence on a new line? :) This is plain English text so English grammar rules should be followed here as much as they make sense. You break the line based on max characters per line (whatever is decided per project/file) on a word boundary.

As for minishift doing the same, yes I know it has become common practice recently but mostly cause people don't care, not because it helps anything. :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting every sentence on a newline (in a system where newlines on their own don’t matter, as is the case here) is also nice for change control: it means that diffs don’t pull in the end of the previous sentence or the start of the next one, and changing one sentence’s length doesn’t cause the rest of the paragraph to reflow and show up in the diff...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would apply equally to starting every word on new line, as you often need to fix, add or remove words. :) I hope one thing we can all agree on, is consistency. If you want to start every sentence on a new line, maybe we can talk. Otherwise please let's stick to English grammar rules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree we need consistency! IME starting sentences on a new line (and word-wrapping them too) is a nice balance between source readability and long-term maintainability. It’s weird if the source text is also what the user sees, directly, but since most docs nowadays are post-processed and end up as HTML or something like that where newlines are ignored, it’s OK in practice.

// Create the manifests
$ openshift-install create manifests --dir myTestDir

// Edit the cluster-ingress-02-config.yml file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the following command is not editing the file so comment is a bit confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeenix Edited.

@@ -286,6 +286,32 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### console doesn't come up
In case of libvirt there is no wildcard dns resolution and console is depend on the route which is created by auth operator.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest: s/is depend on/depends on/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -286,6 +286,32 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### console doesn't come up
In case of libvirt there is no wildcard dns resolution and console is depend on the route which is created by auth operator.
To make it work we need to use the multi-stage installation and edit the `domain` for ingress.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest that you avoid 'the multi-stage installation' since that could be understood as indicating that the install command itself is to be run in stages where you run part of the install, do something, and then run the rest of the install.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I had the same comment. multi-stage would likely refer here to the way the installer works... but this is actually abstracted away for most people, as they likely will experiment first with just create cluster. I would suggest to be more explicit and mention the order of the commands... or else mention how this could work AFTER the installation has been performed (if possible).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some rewording, hope that make it bit clear.

domain: apps.tt.testing

// Start the installer to create the cluster
$ openshift-install create cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--dir myTestDir missing here?

Copy link

@gbraad gbraad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inconsistent use of --dir and needs clearer instructions around 'multi-stage'

@abhinavdahiya
Copy link
Contributor

@@ -286,6 +286,32 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### console doesn't come up
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Console

$ sudo systemctl restart NetworkManager

// Create the manifests
$ openshift-install create manifests --dir myTestDir
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openshift-installer --dir $INSTALL_DIR create manifests

// Here `tt.testing` is the domain which I choose when running the installer.
$ cat /etc/NetworkManager/dnsmasq.d/openshift.conf
server=/tt.testing/192.168.126.1
address=/.apps.tt.testing/192.168.126.51
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only work for a single worker, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gbraad yes and this is better than nothing atm.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will it .apps.tt.testing or .apps.test.tt.testing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anildhingra no it is apps.tt.testing not the domain which your cluster use.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still fails

2019/03/11 13:33:10 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.test.tt.testing/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.test.tt.testing: dial tcp: lookup openshift-authentication-openshift-authentication.apps.test.tt.testing on 172.30.0.10:53: no such host
2019/03/11 13:33:20 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://openshift-authentication-openshift-authentication.apps.test.tt.testing/oauth/token failed: Head https://openshift-authentication-openshift-authentication.apps.test.tt.testing: dial tcp: lookup openshift-authentication-openshift-authentication.apps.test.tt.testing on 172.30.0.10:53: no such host

if i do dummy nslookup it works

[root@test-wm76w-master-0 ~]# nslookup one.apps.test.tt.testing
Server: 192.168.126.1
Address: 192.168.126.1#53

but in above case its looking - openshift-authentication-openshift-authentication.apps.test.tt.testing

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@praveenkumar,

Could not this be done automatically when libvirt is selected?

Extend the functionality of create.go

Congratulations also for the solution !!

Best Regards,
Fábio Sbano

Copy link

@gbraad gbraad Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a temporary solution. In future we do not want to have to do so many modifications. Since a solution that is cross-platform is preferred, we might not continue with using dnsmasq, but rather coredns for example.

Copy link

@ghost ghost Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gbraad,

I solved with bind on physical machine without having to change the (route). #1397

Installer exited by timeout (30min) and I found problem too #1394

(I do not know if it was due to this workaround)

Best Regards,
Fábio Sbano

Copy link

@anjannath anjannath Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@praveenkumar This worked for me, but please explicitly mention the the IP address is of the worker node

Copy link
Contributor

@steveej steveej Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: nevermind the following. I forgot to run the create manifests command.


I can't find the manifest to edit in my cluster's installation directory:

steveej@steveej-t480s-work ✓ ~/src/job-redhat/GOPATH/src/github.com/openshift/installer/openshift-steveej  954533fcc ⚡
$ tree                                      
.
├── auth
│   ├── kubeadmin-password
│   └── kubeconfig
├── disable-bootstrap.tfvars
├── metadata.json
├── terraform.tfstate
└── tls
    ├── journal-gatewayd.crt
    └── journal-gatewayd.key

How do I get the installer to expose them?

@praveenkumar praveenkumar force-pushed the libvirt_console_issue branch from cc4a3fd to 20d54ad Compare March 7, 2019 05:49
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 7, 2019
@praveenkumar
Copy link
Contributor Author

@abhinavdahiya @wking do have another look, made requested changes.

@praveenkumar praveenkumar force-pushed the libvirt_console_issue branch from 20d54ad to cf0fb11 Compare March 7, 2019 05:55
@praveenkumar praveenkumar changed the title Add troubleshooting docs for libvirt console issue. docs/dev/libvirt-howto.md: Add troubleshooting docs for libvirt console issue. Mar 8, 2019
@praveenkumar
Copy link
Contributor Author

/ok-to-test

@openshift-ci-robot openshift-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 20, 2019
@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

2 similar comments
@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

@zeenix
Copy link
Contributor

zeenix commented Apr 11, 2019

@praveenkumar Are e2e tests useful for this change? Let's not block on them unless that's the case.

@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

2 similar comments
@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

@steven-ellis
Copy link

Next step is to also add a reference to the default storage pool from #308

@zeenix
Copy link
Contributor

zeenix commented Apr 15, 2019

@abhinavdahiya I think all review comments have been addressed so let's give the missing label to get it merged?

@wking
Copy link
Member

wking commented Apr 15, 2019

I think we want #1587, which will let us offload libvirt maintainership to folks with more time for it. But these two PRs could land in either order.

@praveenkumar praveenkumar force-pushed the libvirt_console_issue branch 2 times, most recently from a0137cf to d6e1f42 Compare April 23, 2019 11:39
@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

@zeenix
Copy link
Contributor

zeenix commented Apr 24, 2019

@wking #1587 has been merged already and this PR has been rebased and it passes all tests now (although e2e tests didn't apply to this PR). Could we merge this?

@wking wking changed the title docs/dev/libvirt-howto.md: Add troubleshooting docs for libvirt console issue. docs/dev/libvirt: Add troubleshooting docs for libvirt console issue. Apr 24, 2019
Copy link
Member

@wking wking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been too long since I've played with a libvirt console, so I can't really review the steps. In the spirit of the pending #1662 I'm fine assuming that's all right ;). But style review is easier, so I've left a few nit suggestions inline.

@@ -286,6 +286,38 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### Console doesn't come up
In case of libvirt there is no wildcard dns resolution and console depends on the route which is created by auth operator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "dns" -> "DNS"

@@ -286,6 +286,38 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### Console doesn't come up
In case of libvirt there is no wildcard dns resolution and console depends on the route which is created by auth operator.
To make it works we need to first create the manifests and edit the `domain` for ingress config, before directly creating the cluster.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "works" -> "work". Alternatively, drop the leader and compact the sentence to "We need to edit the ingress domain before creeating the cluster." or some such.

In case of libvirt there is no wildcard dns resolution and console depends on the route which is created by auth operator.
To make it works we need to first create the manifests and edit the `domain` for ingress config, before directly creating the cluster.

Following steps are required to have console operator up and working.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the previous two sentences cover this. If you want to make it easier to blindly find this documentation, maybe include an example of the error message that eventually bubbles out if you don't take these steps?


Following steps are required to have console operator up and working.
```console
// Step-1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: // are not shell comments. Either use $ # Step-1, etc., or (my preference), shift this supporting text outside of the fenced code block:

Add another domain entry in `openshift.conf`, which is...

```console
$ cat /etc/NetworkManager/dnsmasq.d/openshift.conf
server=/tt.testing/192.168.126.1
address=/.apps.tt.testing/192.168.126.51
```

Make sure you restart ...


// Step-4
// Domain entry for cluster-ingress-02-config.yml file should be following (here domain is what your created in Step-1)
$ cat $INSTALL_DIR/manifests/cluster-ingress-02-config.yml | grep domain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to cat, just grep domain $INSTALL_DIR/manifests/cluster-ingress-02-config.yml

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I suggest turning it into:

sed -i 's/test1.//g' mycluster/manifests/cluster-ingress-02-config.yml

Which is what I've been using and it makes things easy for people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeenix this will only work if you provided test1 as your domain name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@praveenkumar Right, it should be:

sed -i 's/YOUR_DOMAIN_NAME.//g' mycluster/manifests/cluster-ingress-02-config.yml

then. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@praveenkumar could you please update this or let me know if you disagree?

@praveenkumar praveenkumar force-pushed the libvirt_console_issue branch from d6e1f42 to 1abc94f Compare April 25, 2019 05:44
@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

@praveenkumar
Copy link
Contributor Author

/retest


- Domain entry for cluster-ingress-02-config.yml file should be following (here domain is what your created initially)
```console
$ grep domain $INSTALL_DIR/manifests/cluster-ingress-02-config.yml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@praveenkumar Wouldn't you agree that my sed suggestion would better since folks can just directly copy&paste it?

@@ -286,6 +286,40 @@ kubectl get --all-namespaces pods
## Troubleshooting
If following the above steps hasn't quite worked, please review this section for well known issues.

### Console doesn't come up
In case of libvirt there is no wildcard DNS resolution and console depends on the route which is created by auth operator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I asked for this before but maybe my comment didn't make it through: Could we please refer to the issue we're working around in here and in the commit log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeenix done.

@praveenkumar
Copy link
Contributor Author

/test e2e-libvirt

…ole issue.

Currently cluster created by libvirt not able to resolve the auth route
and because of that console doesn't comeup. This troubleshooting doc entry
direct users to make some modification before running the cluster so that
auth route can be resolved by the cluster. Fix openshift#1007
@praveenkumar praveenkumar force-pushed the libvirt_console_issue branch from 1abc94f to 90b0d45 Compare May 6, 2019 10:08
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 6, 2019
@zeenix
Copy link
Contributor

zeenix commented May 6, 2019

/test e2e-libvirt

@zeenix
Copy link
Contributor

zeenix commented May 16, 2019

/test e2e-aws

@zeenix
Copy link
Contributor

zeenix commented May 22, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 22, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: praveenkumar, zeenix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 03b753f into openshift:master May 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. platform/libvirt size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.