Containerized installs on RHEL were downgrading docker unnecessarily #1718
Conversation
@@ -10,7 +10,12 @@
 - name: Downgrade docker if necessary
   command: "{{ ansible_pkg_mgr }} downgrade -y docker-{{ docker_version }}"
   register: docker_downgrade_result
-  when: not docker_version_result | skipped and docker_version_result | default('0.0', True) | version_compare(docker_version, 'gt')
+  when: not docker_version_result | skipped and docker_version_result.stdout | default('0.0', True) | version_compare(docker_version, 'gt')
@sdodson, I think not using .stdout was just a bug previously.
👍
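The fix above can be illustrated outside Ansible. A minimal sketch using the jinja2 library (which Ansible's templating is built on), with a hand-built dict standing in for the registered command result: the register of a command task is a dict, so the version comparison needs the task's stdout string, not the whole dict.

```python
from jinja2 import Environment

env = Environment()

# Simulated `register: docker_version_result` from a command task.
docker_version_result = {"stdout": "1.8.2", "changed": False}

# The fixed expression: feed the stdout string through the filters.
fixed = env.from_string("{{ r.stdout | default('0.0', true) }}")
print(fixed.render(r=docker_version_result))  # 1.8.2

# With boolean mode on, an empty stdout falls back to '0.0'.
print(fixed.render(r={"stdout": ""}))  # 0.0

# The old expression passed the entire dict through the filters, so
# version_compare never saw the actual version string.
buggy = env.from_string("{{ r | default('0.0', true) }}")
print(buggy.render(r=docker_version_result))
```

The second argument `true` to `default` enables boolean mode, so falsy values (like an empty stdout) also trigger the fallback.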
aos-ci-test
- set_fact:
    l_common_version: "{{ common_version.stdout | default('0.0', True) }}"
  when: not openshift.common.is_containerized | bool
I probably have an irrational hatred of set_fact... I can't think of a way to put this into a vars file, though, given the use of common_version.stdout.
It could be limited to a single set_fact if you do the following:
l_common_version: "{{ ( openshift.common.image_tag | oo_image_tag_to_rpm_version if openshift.common.is_containerized | bool else common_version.stdout) | default('0.0', True) }}"
I'd do whatever you prefer. I sort of hate inlining if/else like this. It almost always gets more complicated, and I can no longer count how many bugs I've seen caused by follow-up edits that subtly break things.
That's fine, if it can't be eliminated by pushing it to vars, it's not really worth it anyway.
Commits e62ae2a to cc49a45 (Compare)
-    "Kind": "DockerImage",
-    "Name": "registry.access.redhat.com/jboss-eap-7-beta/eap70-openshift:1.3"
+    "kind": "DockerImage",
+    "name": "registry.access.redhat.com/jboss-eap-7-beta/eap70-openshift:1.3"
 }
@sdodson, I think the reason these passed testing is because they ran against 3.0 and 3.1. It looks like for 3.2 it breaks validation.
Crap, yeah. Filed an issue with upstream of us so we don't inadvertently re-introduce this: jboss-openshift/application-templates#144
no worries, I figured I'd bundle it in with some PR like this one.
aos-ci-test
aos-ci-test
- action: "{{ ansible_pkg_mgr }} name=docker{{ '-' + docker_version if docker_version != '' else '' }} state=present"
- when: not openshift.common.is_atomic | bool and not docker_downgrade_result | changed
+ action: "{{ ansible_pkg_mgr }} name=docker{{ '-' + docker_version if docker_version_result.stdout == '' }} state=latest"
+ when: not openshift.common.is_atomic | bool and not docker_version_result | skipped and docker_version_result.stdout | default('0.0', True) | version_compare(docker_version, 'lt')
@detiber, @sdodson, this is what I'm testing. It's weird, but it seems to handle the 1) no docker installed, 2) downgrade, and 3) upgrade cases. The idea is that if docker_version_result.stdout is empty we need to install the minimum required version. If there is a version installed and we didn't need to downgrade, then we can ensure the latest version.
I think there's still an edge case if a version like 1.8.1 was installed and we needed 1.8.2. I think it would actually result in 1.9.1 being installed. This is a mess.
Did it not work without changing from if docker_version != '' to docker_version_result.stdout == ''?
As you have it, we'd always need to set docker_version, which would be the case if we changed from > 3.2/1.2 to >= 3.2/1.2 in the facts.
Nevermind, I see you're accounting for that in the when condition.
I'm actually surprised there isn't an error for line 16 not having an else clause... I suppose that's just a Jinja2-ism.
Again, I really think we should avoid latest, to avoid upgrading docker inadvertently during regular CM operations. Since we should only have to worry about initial installation, and 1.2/3.2 packages would have a requirement on docker-1.9, that should only leave us with issues related to containerized installs and docker versions.
I'm fine with reworking this to avoid using 'latest'. One of the problems I was hitting was coming up with simple logic without reworking what we already had too much. I think if we don't use 'latest', the way the logic is now it could result in docker-1.9.1 being explicitly installed even though a 1.9.2 package was available. That isn't the end of the world, but it's a symptom of overly complicated logic.
I've posted this question[0] regarding how we should express requirements of the host OS. If we're not worried about containerized installs this is a non-issue, because installing the 3.2 atomic-openshift-node package will ensure that we're at a minimum version, so it's really the upper version limits for 3.1 and 3.0 that we require.
[0] http://ask.projectatomic.io/en/question/3641/expressing-requirements-of-the-host-os/
- name: Remove deferred deletion for downgrades from 1.9
  command: >
    sed -i 's/--storage-opt dm.use_deferred_deletion=true//' /etc/sysconfig/docker-storage
  when: docker_version_result.stdout | default('0.0', True) | version_compare('1.9', '>=') and docker_version | version_compare('1.9', '<')
@sdodson, what do you think about this?
aos-ci-test
- name: Remove deferred deletion for downgrades from 1.9
  command: >
    sed -i 's/--storage-opt dm.use_deferred_deletion=true//' /etc/sysconfig/docker-storage
  when: not docker_version_result | skipped and docker_version_result.stdout | default('0.0', True) | version_compare('1.9', '>=') and docker_version | version_compare('1.9', '<')
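To see when this guard fires, here is a hypothetical stand-in for Ansible's version_compare filter (not the real implementation; it simply compares dotted versions as integer tuples). The cleanup runs only when the installed docker is at or above 1.9 and the target docker_version is below it:

```python
import operator

def version_key(v):
    # Split "1.9.1" into (1, 9, 1) so comparison is numeric.
    return tuple(int(p) for p in v.split('.'))

def version_compare(value, version, op):
    ops = {'>=': operator.ge, '<': operator.lt,
           'gt': operator.gt, 'lt': operator.lt}
    return ops[op](version_key(value), version_key(version))

installed = "1.9.1"  # stands in for docker_version_result.stdout
target = "1.8.2"     # stands in for docker_version

# Mirrors the when: clause above — strip deferred deletion only
# when downgrading across the 1.9 boundary.
needs_cleanup = (version_compare(installed, '1.9', '>=')
                 and version_compare(target, '1.9', '<'))
print(needs_cleanup)  # True

# Tuple comparison also avoids the plain string-compare trap,
# where "1.10" would sort before "1.9".
print(version_compare("1.10", "1.9", 'gt'))  # True
```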
@sdodson, I just made another update to add not docker_version_result | skipped.
Maybe it's just GitHub, but it looks like you didn't remove the old version of this from below? We also need the check for the file's existence moved above this; I'd drop its when clause altogether.
Because the file may not exist if they haven't started docker before.
- stat: path=/etc/sysconfig/docker
register: docker_check
aos-ci-test
aos-ci-test
lgtm
- action: "{{ ansible_pkg_mgr }} name=docker{{ '-' + docker_version if docker_version != '' else '' }} state=present"
- when: not openshift.common.is_atomic | bool and not docker_downgrade_result | changed
+ action: "{{ ansible_pkg_mgr }} name=docker{{ '-' + docker_version if docker_version_result.stdout == '' }} state=present"
+ when: not openshift.common.is_atomic | bool and not docker_version_result | skipped and docker_version_result.stdout | default('0.0', True) | version_compare(docker_version, 'lt')
This showed that all the variables were actually defined as expected:
- debug: var=openshift
- debug: var=docker_version_result
- debug: var=docker_version
@brenton maybe stdout is not in docker_version_result?
The line above it is missing an else:
{{ '-' + docker_version if docker_version_result.stdout == '' else '' }}
From the logs I have it is:
2016-04-08 16:51:21,001 p=924 u=root | ok: [ip-172-18-0-28.ec2.internal] => {
"var": {
"docker_version_result": {
"changed": false,
"cmd": [
"repoquery",
"--installed",
"--qf",
"%{version}",
"docker"
],
** SNIP**
"start": "2016-04-08 16:51:19.957724",
"stderr": "",
"stdout": "1.8.2",
"stdout_lines": [
"1.8.2"
],
"warnings": []
}
}
}
2016-04-08 16:51:21,003 p=924 u=root | ok: [ip-172-18-15-36.ec2.internal] => {
"var": {
"docker_version_result": {
"changed": false,
**SNIP**
"rc": 0,
"start": "2016-04-08 16:51:20.019622",
"stderr": "",
"stdout": "1.8.2",
"stdout_lines": [
"1.8.2"
],
"warnings": []
}
}
}
@abutcher, yeah, that was it. :(
@detiber, this is pretty hard to read, but here's the thinking: if docker_version_result.stdout == '', that means docker was never installed, so we have to rely on docker_version, resulting in something like yum install docker-1.8.2. If there is stdout, that means it's installed; we've already downgraded to the correct version in the previous step, so we just have to make sure the package is present. I'll be honest, I would have expected the 'when' clause to prevent that logic from ever running.
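The two branches of the package-name expression can be sketched with the jinja2 library (the dicts below are hand-built stand-ins for the registered result, not real Ansible output): pin the version only when no docker is installed yet, otherwise just ensure the plain `docker` package.

```python
from jinja2 import Environment

env = Environment()

# The package-name expression from the action line under discussion.
tpl = env.from_string(
    "docker{{ '-' + docker_version "
    "if docker_version_result.stdout == '' else '' }}")

not_installed = {"stdout": ""}       # docker was never installed
installed = {"stdout": "1.8.2"}      # docker already present

# No docker installed: pin the minimum required version.
print(tpl.render(docker_version="1.8.2",
                 docker_version_result=not_installed))  # docker-1.8.2

# Docker present (already downgraded if needed): plain package name.
print(tpl.render(docker_version="1.8.2",
                 docker_version_result=installed))      # docker
```

Jinja2 attribute access falls back to item lookup, which is why `docker_version_result.stdout` works on a plain dict here just as it does on a registered task result.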
There are gaps in this that have been identified since I did the backport, but should we try to be less clever for the sake of clarity? I liked how easy this was to follow in my backport:
openshift-ansible/roles/docker/tasks/main.yml
Lines 10 to 65 in 7e39163
# Avoid docker 1.9 when installing origin < 1.2 or OSE < 3.2 on RHEL/Centos and
# See: https://bugzilla.redhat.com/show_bug.cgi?id=1304038
- name: Default to latest docker
  set_fact:
    docker_version: ''
    docker_downgrade: False

- name: Gather installed version of common package
  shell: >
    rpm -q {{ openshift.common.service_type }} --queryformat '%{version}' | cut -d '.' -f1,2 | grep -v 'not installed'
  register: installed_common_version
  failed_when: false
  changed_when: false

- name: Gather available version of common package
  shell: >
    yum list available -e 0 -q "{{ openshift.common.service_type}}" 2>&1 | tail -n +2 | awk '{ print $2 }' | sort -r | tr '\n' ' ' | tail -n 1 | cut -d '.' -f1,2
  register: available_common_version
  failed_when: false
  changed_when: false

- name: Gather installed version of docker
  shell: >
    rpm -q docker --queryformat '%{version}' | cut -d '.' -f1,2 | grep -v 'not installed'
  register: installed_docker_version
  failed_when: false
  changed_when: false

- name: Gather available version of docker
  shell: >
    yum list available -e 0 -q "docker" 2>&1 | tail -n +2 | awk '{ print $2 }' | sort -r | tr '\n' ' ' | tail -n 1
  register: available_docker_version
  failed_when: false
  changed_when: false

- set_fact:
    docker_version: "{{ '1.8.2' }}"
  when: " installed_common_version.stdout in ['3.0','3.1','1.0','1.1'] or available_common_version.stdout in ['3.0','3.1','1.0','1.1']"

- set_fact:
    docker_downgrade: True
  when: " docker_version < installed_docker_version.stdout and docker_version != ''"

- name: Remove deferred removal and deletion for 1.8.2 downgrades
  command: >
    sed -i 's/--storage-opt dm.use_deferred_deletion=true//' /etc/sysconfig/docker-storage
  when: docker_downgrade | bool and docker_version == '1.8.2'

- name: Downgrade docker
  command: >
    {{ ansible_pkg_mgr }} downgrade -y docker{{ '-' + docker_version if docker_version != '' else '' }}
  when: not openshift.common.is_atomic and docker_downgrade | bool

- name: Install docker
  action: "{{ ansible_pkg_mgr }} name=docker{{ '-' + docker_version if docker_version != '' else '' }} state=present"
  when: not openshift.common.is_atomic | bool and not docker_downgrade | bool
It's getting late on a Friday. I'm honestly struggling to see how either of these are easy to follow given all the logic branches for containerized installs.
So, I didn't test with Docker not installed, but with the current logic it failed for the upgrade scenario. I think the conditional in the action needs to use docker_version and not docker_version_result.
@detiber, I don't think that handles the case of docker not being installed. I'll have to check though.
dc2a6e9 - State: error - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized - Test Description: Internal Jenkins error during provisioning a 3.1 cluster - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-provision-ose-cluster-728/dc2a6e9a25edbe1fa3ba8324e22f6fcf709a6de7.txt
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_unit_tests - Test Description: all unit tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-pre-test-386/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
dff69c8 - State: error - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized - Test Description: Internal Jenkins error during provisioning a 3.1 cluster - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-provision-ose-cluster-731/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
dff69c8 - State: error - Test Context: aos-ci-jenkins/OS_3.0_NOT_containerized - Test Description: Internal Jenkins error during provisioning a 3.0 cluster - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-provision-ose-cluster-729/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
dff69c8 - State: error - Test Context: aos-ci-jenkins/OS_3.1_containerized - Test Description: Internal Jenkins error during provisioning a 3.1 cluster - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-provision-ose-cluster-730/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
aos-ci-test
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_unit_tests - Test Description: all unit tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-pre-test-387/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
… containerized env
aos-ci-test
0879620 - State: success - Test Context: aos-ci-jenkins/OS_unit_tests - Test Description: all unit tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-pre-test-388/0879620498b1251f3a375b9a5657c16dd571906a.txt
45a4e6e - State: error - Test Context: aos-ci-jenkins/OS_3.1_containerized - Test Description: openshift-ansible install failed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-616/45a4e6e468fd0f60c595503d55d155c82a2289ca.txt
45a4e6e - State: success - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-617/45a4e6e468fd0f60c595503d55d155c82a2289ca.txt
94416d5 - State: success - Test Context: aos-ci-jenkins/OS_3.0_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-618/94416d577a355eae25c8e71a283fc0f8ed2b34af.txt
45a4e6e - State: error - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized_e2e_tests - Test Description: all e2e tests failed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-e2e-tests-149/45a4e6e468fd0f60c595503d55d155c82a2289ca.txt
94416d5 - State: error - Test Context: aos-ci-jenkins/OS_3.1_containerized - Test Description: openshift-ansible install failed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-619/94416d577a355eae25c8e71a283fc0f8ed2b34af.txt
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_3.1_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-621/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-622/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
94416d5 - State: success - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-620/94416d577a355eae25c8e71a283fc0f8ed2b34af.txt
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_3.0_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-623/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
0879620 - State: success - Test Context: aos-ci-jenkins/OS_3.0_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-624/0879620498b1251f3a375b9a5657c16dd571906a.txt
0879620 - State: success - Test Context: aos-ci-jenkins/OS_3.1_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-625/0879620498b1251f3a375b9a5657c16dd571906a.txt
94416d5 - State: error - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized_e2e_tests - Test Description: all e2e tests failed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-e2e-tests-150/94416d577a355eae25c8e71a283fc0f8ed2b34af.txt
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_3.1_containerized_e2e_tests - Test Description: all e2e tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-e2e-tests-151/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
dff69c8 - State: success - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized_e2e_tests - Test Description: all e2e tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-e2e-tests-152/dff69c863c71d9a4c1e41a2908f87bfce1c1fa15.txt
0879620 - State: success - Test Context: aos-ci-jenkins/OS_3.1_containerized_e2e_tests - Test Description: all e2e tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-e2e-tests-153/0879620498b1251f3a375b9a5657c16dd571906a.txt
0879620 - State: success - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized - Test Description: openshift-ansible install passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-openshift-ansible-626/0879620498b1251f3a375b9a5657c16dd571906a.txt
0879620 - State: success - Test Context: aos-ci-jenkins/OS_3.1_NOT_containerized_e2e_tests - Test Description: all e2e tests passed - Logs: https://aos-ci.s3.amazonaws.com/openshift/openshift-ansible/jenkins-openshift-ansible-run-e2e-tests-154/0879620498b1251f3a375b9a5657c16dd571906a.txt
All 7 CI jobs are passing and I did a ton of manual testing today. I'm merging this to unblock a lot of other testing.