-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
machines/libvirt/worker.machineset.yaml: Drop /var/lib/libvirt/images #70
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: wking If they are not already assigned, you can assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The actuator looks up the baseVolumeID in the configured pool [1]. That lookup works with both the volume name and full volume path: $ virsh -c qemu:///system vol-info --vol coreos_base --pool default Name: coreos_base Type: file Capacity: 16.00 GiB Allocation: 1.55 GiB $ virsh -c qemu:///system vol-info --vol /home/trking/VirtualMachines/coreos_base --pool default Name: coreos_base Type: file Capacity: 16.00 GiB Allocation: 1.55 GiB But it fails if you use the wrong full path: $ virsh -c qemu:///system vol-info --vol /var/lib/libvirt/images/coreos_base --pool default error: failed to get vol '/var/lib/libvirt/images/coreos_base' error: Storage volume not found: no storage vol with matching path '/var/lib/libvirt/images/coreos_base' My default pool happens to be in my home directory: $ virsh -c qemu:///system pool-dumpxml default <pool type='dir'> <name>default</name> <uuid>c20a2154-aa60-44cf-bf37-cd8b7818a4e4</uuid> <capacity unit='bytes'>105554829312</capacity> <allocation unit='bytes'>44134699008</allocation> <available unit='bytes'>61420130304</available> <source> </source> <target> <path>/home/trking/VirtualMachines</path> <permissions> <mode>0777</mode> <owner>114032</owner> <group>114032</group> <label>system_u:object_r:virt_image_t:s0</label> </permissions> </target> </pool> This commit allows configutions like mine by dropping our opinions about the default pool location and just using the volume names: $ virsh -c qemu:///system vol-list --pool default Name Path ------------------------------------------------------------------------------ bootstrap /home/trking/VirtualMachines/bootstrap bootstrap.ign /home/trking/VirtualMachines/bootstrap.ign coreos_base /home/trking/VirtualMachines/coreos_base master-0.ign /home/trking/VirtualMachines/master-0.ign master0 /home/trking/VirtualMachines/master0 worker.ign /home/trking/VirtualMachines/worker.ign We've been using the full-path approach since the templates landed in 2522d0f (add libvirt support, 2018-08-30, openshift#35), but there's no discussion there about why the path approach was chosen instead of the name approach I'm switching to here. Longer-term, it would be nice to pull both the pool and volume names from information pushed by the installer [2]. But I'm punting on *that* for this commit. Reported-by: Matt Rogers <[email protected]> [1]: https://github.com/openshift/cluster-api-provider-libvirt/blob/2e5a516afc704c6c94d7b7cde74e78c43bbfeaa5/cloud/libvirt/actuators/machine/utils/volume.go#L174 [2]: https://github.com/openshift/installer/blob/dc4764dc603cea5da0e54f575b7ae1a2c26d3102/pkg/types/machinepools.go#L53-L58
1934183
to
ea4a450
Compare
@wking thanks a lot! lgtm @vikaschoudhary16 / @bison can you please have a look and merge at your convenience? |
@wking thanks for fixing this! |
Thoughts on how I'd test this? I built an image from this PR using the $ git diff
diff --git a/modules/bootkube/resources/manifests/machine-api-operator.yaml b/modules/bootkube/resources/manifests/machine-api-operator.yaml
index 125b870..5c72b5a 100644
--- a/modules/bootkube/resources/manifests/machine-api-operator.yaml
+++ b/modules/bootkube/resources/manifests/machine-api-operator.yaml
@@ -19,7 +19,7 @@ spec:
spec:
containers:
- name: machine-api-operator
- image: quay.io/coreos/machine-api-operator:b6a04c2
+ image: quay.io/wking/machine-api-operator:pr-70
command:
- "/machine-api-operator"
resources:
diff --git a/pkg/asset/manifests/content/bootkube/machine-api-operator.go b/pkg/asset/manifests/content/bootkube/machine-api-operator.go
index 48e4765..fd78535 100644
--- a/pkg/asset/manifests/content/bootkube/machine-api-operator.go
+++ b/pkg/asset/manifests/content/bootkube/machine-api-operator.go
@@ -24,7 +24,7 @@ spec:
spec:
containers:
- name: machine-api-operator
- image: quay.io/coreos/machine-api-operator:b6a04c2
+ image: quay.io/wking/machine-api-operator:pr-70
command:
- "/machine-api-operator"
resources: But when I try to launch a cluster, the machine-api-operator pod dies complaining about the invocation: [core@trking-6d200-master-0 ~]$ sudo crictl logs 42396a6f9bd9b
Run Cluster API Controller
Usage:
machine-api-operator [command]
Available Commands:
help Help about any command
start Starts Machine API Operator
version Print the version number of Machine API Operator
Flags:
--alsologtostderr log to standard error as well as files
--config string path to the mao config (default "/etc/mao-config/config")
-h, --help help for machine-api-operator
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--logtostderr log to standard error instead of files
--stderrthreshold severity logs at or above this threshold go to stderr (default 2)
-v, --v Level log level for V logs
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
Use "machine-api-operator [command] --help" for more information about a command. Is there a different |
Looks like I need to add |
Ok, I'm farther along with: diff --git a/modules/bootkube/resources/manifests/machine-api-operator.yaml b/modules/bootkube/resources/manifests/machine-api-operator.yaml
index 125b870..a11046e 100644
--- a/modules/bootkube/resources/manifests/machine-api-operator.yaml
+++ b/modules/bootkube/resources/manifests/machine-api-operator.yaml
@@ -19,9 +19,11 @@ spec:
spec:
containers:
- name: machine-api-operator
- image: quay.io/coreos/machine-api-operator:b6a04c2
+ image: quay.io/wking/machine-api-operator:pr-70
command:
- "/machine-api-operator"
+ args:
+ - "start"
resources:
limits:
cpu: 20m
@@ -51,4 +53,3 @@ spec:
items:
- key: mao-config
path: config
-
diff --git a/pkg/asset/manifests/content/bootkube/machine-api-operator.go b/pkg/asset/manifests/content/bootkube/machine-api-operator.go
index 48e4765..e4e4156 100644
--- a/pkg/asset/manifests/content/bootkube/machine-api-operator.go
+++ b/pkg/asset/manifests/content/bootkube/machine-api-operator.go
@@ -24,9 +24,11 @@ spec:
spec:
containers:
- name: machine-api-operator
- image: quay.io/coreos/machine-api-operator:b6a04c2
+ image: quay.io/wking/machine-api-operator:pr-70
command:
- "/machine-api-operator"
+ args:
+ - "start"
resources:
limits:
cpu: 20m Now I'm hitting: [core@trking-359a0-master-0 ~]$ sudo crictl logs --tail 2 8a523d18e0014
E0924 23:01:50.276908 1 leaderelection.go:228] error initially creating leader election record: namespaces "openshift-machine-api-operator" not found
E0924 23:02:25.979902 1 leaderelection.go:228] error initially creating leader election record: namespaces "openshift-machine-api-operator" not found |
hey @wking thanks for looking into it, your steps are right. We just rewrote the implementation logic to drop appVersion and x-operator. This is all it needs to be satisfied by the installer https://github.com/openshift/machine-api-operator#manual-deployment-for-kubernetes-cluster so openshift-machine-api-operator would need to be precreated as well. See also https://github.com/openshift/machine-api-operator/tree/master/tests/e2e/manifests |
It sounds like we're comfortable with the changes I'm proposing here, and the only remaining issues are getting the installer bumped to adjust to other changes that have landed since the last time the installer bumped it's pinned operator version. Is there anything blocking this PR, or anything I can do to help get it landed? |
Hey @wking sorry for the dealy for this to work we'll also need to update how we create the volume for each domain https://github.com/openshift/cluster-api-provider-libvirt/blob/2e5a516afc704c6c94d7b7cde74e78c43bbfeaa5/cloud/libvirt/actuators/machine/utils/domain.go#L23 |
/retest |
@wking: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Closing this as it's only relevant to https://github.com/openshift/cluster-api-provider-libvirt repo now |
Makefile: add Go debug support for local binaries
The actuator looks up the
baseVolumeID
in the configured pool. That lookup works with both the volume name and full volume path:But it fails if you use the wrong full path:
My default pool happens to be in my home directory:
This commit allows configutions like mine by dropping our opinions about the default pool location and just using the volume names:
Longer-term, it would be nice to pull both the pool and volume names from information pushed by the installer. But I'm punting on that for this commit.
Reported by @mrogers950.