Zuul CI configuration #9107
Conversation
@ianw if you set it to …
Yes, I think so
@ianw I think this change may need to update …
Oh, nice, the reporting in the Zuul UI seems to be set up way better than in other places.
Ok, so the first run is in https://zuul.opendev.org/t/pypa/build/06bdbc5eafed4c3380fb19ff6801c1b3/log/job-output.txt; the failures appear to be related to our use of mirrors. Our test nodes set up pip to use a reverse proxy to a cloud-local mirror; this stops them having to go over the internet as much as possible, e.g.
UPDATE: I removed the /etc/pip.conf we write and this seemed to go away. Probably a bug that the help output is different.
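The actual mirror configuration was elided above; purely as an illustration (the hostname and path are made up, not OpenDev's real mirror layout), the kind of /etc/pip.conf a base job writes might be wired up roughly like this:

```yaml
# Hypothetical sketch only: roughly how a cloud-local reverse-proxy mirror
# could be wired into /etc/pip.conf on a test node. The hostname is made up.
- name: Point pip at the region-local mirror
  become: yes
  copy:
    dest: /etc/pip.conf
    content: |
      [global]
      index-url = https://mirror.region.example.opendev.org/pypi/simple
```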
It looks like these failures are very similar to #7785. UPDATE: I've added …
I think this might be because we're running 8-way parallel. These tests are probably hitting PyPI/Fastly in very quick succession, so we get rate limited. Not sure what is up with …
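If rate limiting from the 8-way parallel run were confirmed, one hypothetical mitigation would be to cap the pytest-xdist worker count rather than using -n auto. A sketch against the task used in this change (the worker count is an arbitrary example, not a tested recommendation):

```yaml
# Sketch only: cap pytest-xdist parallelism to reduce bursts against PyPI/Fastly.
# The value 4 is an arbitrary example.
- name: Run tox integration tests with reduced parallelism
  include_role:
    name: tox
  vars:
    tox_extra_args: '-- -m integration -n 4 --use-venv'
```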
Force-pushed from 3302777 to 7225f26
recheck
Force-pushed from 64f53e7 to b70952f
So this now works, and I think it is a good proof of concept. Things to note:
At this point, the world is our oyster, as they say. This can be updated to run more Python versions, or different platforms, or whatever we like. I don't think I'm the right person to drive this, but I am certainly happy to help as required @ssbarnea @SeanMooney
recheck
1 similar comment
recheck
This was a very busy week for me, but I will look into it over the weekend. I was expecting to find pip bugs while setting up the new pipelines. Some may be considered infra related, but some are test design bugs (or design limitations). Hopefully we will sort them out. For example, the fact that testing fails when the user has their own pip.conf is a testing bug: a lack of isolation from the machine setup. In fact it is a very common bug; I have fixed several and have also introduced it myself. I am always connected to …

Regarding approach: I would slowly increase testing coverage while addressing the bugs we discover, especially as we also have a day job. I am really pleased with how fast we managed to join efforts and get results. Thanks to everyone! This is, at least for myself, a morale booster.
- name: Run tox integration tests with no network
  include_role:
    name: tox
  vars:
    tox_extra_args: '-- -m integration -n auto --use-venv'
Can unit and integration tests run in parallel jobs? Would it make sense?
They can. The other existing jobs seem to run them separately so they fail earlier. Personally, I'd just run it all as you suggest, because that way you see all the failures your change introduces without having to run it through multiple times.
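For illustration only (not part of this change), dropping the -m integration marker filter is what "run it all" could look like in the task above; whether pip's test markers make this exactly equivalent is an assumption to verify:

```yaml
# Sketch only: run the full suite in one job by dropping the marker filter.
- name: Run tox with the full test suite
  include_role:
    name: tox
  vars:
    tox_extra_args: '-- -n auto --use-venv'
```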
Yep. The current CI setup resorts to using hacks with early cancellations because of the limited resources. So if Zuul can handle the load, it's better to build a more responsive setup.
removed the unit/integration split
Weren't we going to do the opposite?
Fair enough. Is it possible to use pre-baked images to improve the cold start, like other CIs do?
There's always room for improvement, but there are probably two parts to that. Part of the setup is cloud- and host-specific things: stuff that we don't actually know until the node starts, like which mirror to point it at, and SSH keys.
The other bits are setting up tox and pip. We have very generic jobs because these are widely portable: people use these building blocks not just on the VM images OpenDev builds, but on their own images in OpenShift, k8s, Google Compute, AWS, etc. For that reason, a lot of the framework in https://zuul-ci.org/docs/zuul-jobs/ is very intentionally lowest-common-denominator.
It's a trade-off: other CI systems probably don't have a goal of making completely re-usable jobs you can roll out to a wide range of very minimal testing environments. So we're always considering that when we build bits of the jobs.
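To make the "generic building blocks" point concrete, here is a hedged sketch of a pre-run playbook built from zuul-jobs-style roles; the role names are my best recollection and should be checked against https://zuul-ci.org/docs/zuul-jobs/:

```yaml
# Sketch only: a minimal pre-run playbook using generic zuul-jobs roles.
# Role names (ensure-pip, ensure-tox) are assumptions; verify against the docs.
- hosts: all
  roles:
    - role: ensure-pip   # make a working pip available on whatever image we landed on
    - role: ensure-tox   # install tox itself, independent of distro packaging
```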
I understand that. But let me rephrase: how hard would it be to have some image from, say, quay.io run this build?
In the way that I think you mean, this isn't possible in OpenDev, as the OpenDev node environment is not container-based. Note that Zuul can certainly be set up to provide resources in many different ways, but in OpenDev's Zuul instance you are getting a virtual machine from the images we pre-build daily (qcow images are available at https://nb01.opendev.org/images/ and I can go into how and why they are built, but I don't think that's the point here). So you can't tell this job to run on an arbitrary container.
You can install docker/podman on the host and do things inside that, like run tasks in whatever container you want. For example, the pypa manylinux builds use this to build wheels in a container; e.g. https://github.com/pyca/cryptography/blob/master/.zuul.playbooks/playbooks/wheel/roles/build-wheel-manylinux/tasks/main.yaml
This is a flexibility win, but because you have to install docker and pull the image, it is not a speed win for running something like tox.
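Purely as an illustration of that "container on the host" pattern (the role usage, image, and command are assumptions, not taken from this change):

```yaml
# Sketch only: install Docker on the node, then run a command in a pulled image.
# The image and command are placeholders.
- hosts: all
  roles:
    - role: ensure-docker   # assumed zuul-jobs role for installing Docker
  tasks:
    - name: Run a command inside a manylinux-style container
      become: yes
      command: >-
        docker run --rm
        -v {{ ansible_user_dir }}/src:/src
        quay.io/pypa/manylinux2014_x86_64
        /bin/true
```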
Got it, thanks! I was just thinking about what we could do in the context of what @pradyunsg wished for when saying that it had better be fast.
Massively superficial (and hence I wouldn't class it as a "review"), but a couple of comments:
I'd like to see a better example of how this would look "for real" before committing to it. As it stands, it doesn't feel as usable as our existing CI.
Zuul was considering everything under .zuul.d as job config, but I added https://review.opendev.org/#/c/744811/ to allow projects to do this. I'll have to double-check whether we have restarted things since that merged to see if it's live, but the answer is yes. The idea is that the Ansible playbooks, however, are fairly abstract and can be useful to anyone using Ansible. I will admit this is less important for pip than for things like server projects that you tend to deploy, where it makes great sense to use the same logic for deploying in CI as in production as much as possible.
The build will report in the checks interface a link to a build result, e.g. https://zuul.opendev.org/t/pypa/build/94c241f5c2b749bcb0ea4c8d6250142d

For the quickest results, click "artifacts -> pytest results".

For more info, start by going to "console". There you will see each Ansible task run. You can scroll down to "Run tox", which is the task that ... runs tox :) You can click to expand, or click on the magnifying glass (actually, we've had some debate about how to make this more discoverable ... so suggestions welcome!)

You can also click on "logs" and see various text-based logs. This has everything, but in the most verbose/least UI-friendly mode. You can always go to https://zuul.opendev.org/t/pypa/builds to see the latest builds that have run.
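For readers wondering how the "artifacts" links on that build page get populated: the usual Zuul mechanism is a post-run playbook returning artifact metadata via zuul_return. A hedged sketch (the names and paths are illustrative, not copied from this job):

```yaml
# Sketch only: register the pytest HTML report as a build artifact so it shows
# up under "Artifacts" on the Zuul build page. Path and name are illustrative.
- hosts: localhost
  tasks:
    - name: Advertise the pytest HTML report
      zuul_return:
        data:
          zuul:
            artifacts:
              - name: pytest results
                url: tox/report.html
```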
This moves the playbooks under the Zuul config directory; note the .zuul.ignore to avoid them being read as configuration.
Thanks, that interface is a lot better. Unfamiliar, but it's unfair to criticise a new thing for being unfamiliar, so that's just something I'll have to get used to. I'm still concerned over who's going to maintain this long-term (I have no personal interest in learning zuul), but if we get a good answer to that, then I'm essentially neutral on adding zuul as an extra CI. Ask me again if the proposal is to replace our existing CI with zuul, though 🙂
Keep in mind that Zuul is not allowed to do what it's best at, which is avoiding regressions, since it's not gating the project. If I didn't already have so many other things bogging me down I would be happy to help you out, but I don't think I would have (much) time to spend maintaining anything. But you can always find me on #zuul IRC if you need any help.
This makes the console output less cluttered when you go to look at the actual tox run.
@ianw any chance of configuring the output for better contrast and accessibility? I know that it should be possible to feed custom styles into this plugin... The current color scheme is rather hard to consume.
- name: Remove OpenDev mirror config
  file:
    path: /etc/pip.conf
    state: absent
  become: yes
Maybe set the PIP_CONFIG_FILE=/dev/null env var instead?
cc @pradyunsg
If yes, this could probably be done per-test to make use of the mirrors in tests that don't need to hit the production PyPI.
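A rough sketch of the env-var approach at the job level (tox_environment is, to my recollection, how the zuul-jobs tox role passes extra environment variables, but treat that as an assumption to verify):

```yaml
# Sketch only: leave /etc/pip.conf in place but hide it from the test run by
# pointing pip at an empty config file.
- name: Run tox integration tests without the host pip config
  include_role:
    name: tox
  vars:
    tox_environment:
      PIP_CONFIG_FILE: /dev/null
    tox_extra_args: '-- -m integration -n auto --use-venv'
```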
Or even PIP_ISOLATED=true (is this a thing?)
I guess you're referring to https://github.com/pytest-dev/pytest-html#appearance, but I think that's just an example, as I can't see those CSS files actually defined anywhere? However, if someone wants to write a custom CSS file, it is most certainly possible to deploy it onto the host and have it collected in the logs. You just want to use the Ansible copy module to put it on the testing host, and collect it in the same way so it's alongside the HTML log. It seems to me you could download the existing HTML and work on a CSS file locally, and then I'm happy to work with anyone to help deploy it in the job.
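A hedged sketch of that workflow: pytest-html documents a --css option for extra stylesheets, so assuming it passes through tox cleanly (an assumption, as are the file names and paths), the job could do something like:

```yaml
# Sketch only: copy a custom stylesheet to the node and ask pytest-html to use it.
# File names and paths are illustrative.
- name: Copy a custom pytest-html stylesheet to the test node
  copy:
    src: pytest-html-contrast.css
    dest: "{{ ansible_user_dir }}/pytest-html-contrast.css"

- name: Run tox with the custom stylesheet applied to the HTML report
  include_role:
    name: tox
  vars:
    tox_extra_args: "-- -m integration -n auto --use-venv --css {{ ansible_user_dir }}/pytest-html-contrast.css"
```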
- job:
    name: pypa-pip-py38
    parent: pypa-pip-base
    nodeset: ubuntu-focal
    vars:
      tox_envlist: py38
Should we look into adding more envs? Other distros?
I feel like that was the initial reasoning for exploring Zuul :) @ssbarnea was, I think, most involved in that discussion.
Yes, I just wasn't sure if it's expected that they'd appear in this PR or follow-ups
My preference is probably not to overload this initial support with a bunch more things, but I'm open to suggestions. Note that any new jobs proposed in a pull request are speculatively tested by Zuul, so things will never get into a failure state.
I'm not sure what you mean by saying that things won't get into a failure state. I think I may be unfamiliar with what speculative testing actually means in this context. Mind explaining?
What I mean is that, just like this change, any pull request that modifies/adds/removes jobs in .zuul.d will be tested by Zuul with that config applied. I.e., if we add a pull request to add centos/debian/XYZ jobs as a follow-up pull request, those jobs will run on that pull request before it is committed, and we can iterate, debug, etc. That's what I mean by speculative testing: the proposed config changes are tested by Zuul from the pull request, without being committed.
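Purely as an illustration of such a follow-up (job names, nodeset labels, and env names here are hypothetical and would need checking against what the OpenDev tenant actually provides):

```yaml
# Sketch only: hypothetical follow-up jobs adding another interpreter and distro.
- job:
    name: pypa-pip-py39
    parent: pypa-pip-base
    nodeset: ubuntu-focal
    vars:
      tox_envlist: py39

- job:
    name: pypa-pip-py38-centos-8
    parent: pypa-pip-base
    nodeset: centos-8
    vars:
      tox_envlist: py38
```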
Oh, so it's basically the same as what other CIs do... I didn't know there's a special term for this behavior.
To set expectations about the timeline here: I think I can come back to this in a few days after 20.3 is out.
... and to set expectations about the next steps: I think the pip developers still need a discussion about whether we even want to use zuul before we move forward with this. And getting enough of the pip devs to comment will take more time than just having any one of us be available to review this PR.
In the light of #9087 (comment), I think it's time to revisit this effort.
@pradyunsg do you think this effort could be resurrected anytime soon?
@ianw the Zuul logs have been cleaned up. Maybe rebase this?
I think the next steps here are outlined in #7279 (comment), and I won't have time in the near future to look into this. |
Closing this for now. As an update, we've since consolidated pip's entire CI into GitHub Actions. I still think it's valuable to have additional CI resources, but this is basically blocked on a discussion/decision, as noted in my previous comment. I hope this isn't perceived as a "no, never" but more along the lines of acknowledging that we're still a few steps away from saying "yes", if we're going to. Thanks all for the discussions and effort toward this so far! ^.^