Zuul CI configuration #9107

ianw · 2020-11-06T00:17:44Z

No description provided.

webknjaz · 2020-11-06T00:35:52Z

@ianw if you set it to Draft, will Zuul still test it?

ianw · 2020-11-06T00:37:16Z

@ianw if you set it to Draft, will Zuul still test it?

Yes, I think so

.zuul.playbooks/playbooks/tox/main.yaml

webknjaz · 2020-11-06T00:40:40Z

@ianw I think this change may need to update MANIFEST.in too, in order to satisfy the linter. @pradyunsg probably knows better tho.

webknjaz · 2020-11-06T00:42:35Z

Oh, nice, the reporting in Zuul UI seems to be set up way better than in other places.

ianw · 2020-11-06T00:47:49Z

Ok, so the first run is in

https://zuul.opendev.org/t/pypa/build/06bdbc5eafed4c3380fb19ff6801c1b3/log/job-output.txt

the failures appear to be related to our use of mirrors. Our test nodes setup pip to use a reverse proxy to a cloud-local mirror; this stops them having to go over the internet as much as possible

e.g.

2020-11-06 00:38:47.905346 | ubuntu-focal | E           AssertionError: assert '\nUsage:   \...the future.\n' == '\nUsage:   \...the future.\n'
2020-11-06 00:38:47.905359 | ubuntu-focal | E             Skipping 9514 identical leading characters in diff, use -v to show
2020-11-06 00:38:47.905375 | ubuntu-focal | E             +   https://pypi.org/simple). This should point to a
2020-11-06 00:38:47.905388 | ubuntu-focal | E             -   https://mirror.kna1.airship-
2020-11-06 00:38:47.905400 | ubuntu-focal | E             -                               citycloud.opendev.org/pypi/simple). This should
2020-11-06 00:38:47.905413 | ubuntu-focal | E             -                               point to a repository compliant with PEP 503
2020-11-06 00:38:47.905426 | ubuntu-focal | E             ?                              -----------
2020-11-06 00:38:47.905438 | ubuntu-focal | E             +                               repository compliant with PEP 503 (the simple...
2020-11-06 00:38:47.905451 | ubuntu-focal | E
2020-11-06 00:38:47.905463 | ubuntu-focal | E             ...Full output truncated (63 lines hidden), use '-vv' to show

UPDATE : I removed the /etc/pip.conf we write and this seemed to go away. Probably a bug that the help is different

ianw · 2020-11-06T00:53:54Z

tests/functional/test_help.py::test_help_commands_equally_functional
tests/functional/test_install.py::test_basic_install_from_pypi - Asser...
tests/functional/test_install_user.py::Tests_UserSite::test_upgrade_user_conflict_in_globalsite
tests/functional/test_install_user.py::Tests_UserSite::test_install_user_conflict_in_globalsite_and_usersite
tests/functional/test_install_user.py::Tests_UserSite::test_install_user_conflict_in_globalsite
tests/functional/test_uninstall_user.py::Tests_UninstallUserSite::test_uninstall_from_usersite_with_dist_in_global_site
tests/functional/test_new_resolver_user.py::test_new_resolver_install_user_reinstall_global_site
tests/functional/test_new_resolver_user.py::test_new_resolver_install_user_conflict_in_global_site
tests/functional/test_new_resolver_user.py::test_new_resolver_install_user_conflict_in_global_and_user_sites
tests/functional/test_search.py::test_run_method_should_return_no_matches_found_when_does_not_find_pkgs

ianw · 2020-11-06T01:28:17Z

It looks like these failures are very similar to #7785

UPDATE I've added --use-venv which has made these go away for now

ianw · 2020-11-06T01:54:03Z

test_run_method_should_return_no_matches_found_when_does_not_find_pkgs() fails with

xmlrpc.client.Fault: <Fault -32500: 'HTTPTooManyRequests: The action could not be performed because there were too many requests by the client. Limit may reset in 1 seconds.'>

I think this might be because we're running 8-way parallel. These tests are probably hitting pypi/fastly in very quick succession so we get rate limited

Not sure what is up with test_search_exit_status_code_when_finds_no_package

2020-11-06 01:40:26.475550 | ubuntu-focal | [gw5] linux -- Python 3.8.5 /home/zuul/src/github.com/pypa/pip/.tox/py38/bin/python
2020-11-06 01:40:26.475561 | ubuntu-focal |
2020-11-06 01:40:26.475572 | ubuntu-focal | script = <tests.lib.PipTestEnvironment object at 0x7f4eeae5fc10>
2020-11-06 01:40:26.475583 | ubuntu-focal |
2020-11-06 01:40:26.475593 | ubuntu-focal |     @pytest.mark.network
2020-11-06 01:40:26.475604 | ubuntu-focal |     def test_search_exit_status_code_when_finds_no_package(script):
2020-11-06 01:40:26.475615 | ubuntu-focal |         """
2020-11-06 01:40:26.475626 | ubuntu-focal |         Test search exit status code for no matches
2020-11-06 01:40:26.475636 | ubuntu-focal |         """
2020-11-06 01:40:26.475657 | ubuntu-focal |         result = script.pip('search', 'nonexistentpackage', expect_error=True)
2020-11-06 01:40:26.475672 | ubuntu-focal | >       assert result.returncode == NO_MATCHES_FOUND, result.returncode
2020-11-06 01:40:26.475683 | ubuntu-focal | E       AssertionError: 2
2020-11-06 01:40:26.475694 | ubuntu-focal | E       assert 2 == 23
2020-11-06 01:40:26.475705 | ubuntu-focal | E        +  where 2 = <tests.lib.TestPipResult object at 0x7f4eeae5f580>.returncode

ianw · 2020-11-06T04:57:28Z

recheck

ianw · 2020-11-06T06:25:13Z

So this now works, and I think is a good proof of concept. Things to note

I've had to remove our pip configuration to use cloud-local pypi reverse proxies that we setup to avoid CI jobs hitting the internet. This is setup by our base jobs at mirror.yaml and the template is pip.conf.j2 I feel like this is a bug in the pip testing that it doesn't work in tests/functional/test_help.py::test_help_commands_equally_functional and tests/functional/test_install.py::test_basic_install_from_pypi
It needed the --use-venv flag ... this went a bit over my head but looks like Tests fail when executing test_install_user.py locally #7785
A few pauses were required in tests/functional/test_search.py to avoid hitting pypi rate limits. Not sure if existing CI is just slower, or perhaps runs in cloud providers that have some agreement with fastly that prevents this. This happened even if I ran the network marked tests sequentially.
test_sort_best_candidate__* appeared to fail in unit tests @ here but has not appeared on re-runs (update: another and now another) I think there is probably a small race lurking somewhere there
You can see all the job runs @ https://zuul.opendev.org/t/pypa/builds. More data needed but looks like a run is ~5m:30s
It is a manual process, but we can put nodes on hold and add people's ssh keys if we need to debug things on the host. Zuul also watches for a comment "recheck" and will re-run testing if seen (this is configured @ https://opendev.org/pypa/project-config/src/branch/master/zuul.d/pipelines.yaml#L38). Easiest way to do this is to reach out in freenode #opendev where admins live.

At this point, the world is our oyster, as they say. This can be updated to run more python versions, or different platforms, or whatever we like. I don't think I'm the right person to drive this, but am certainly happy to help as required @ssbarnea @SeanMooney

ianw · 2020-11-06T07:04:34Z

recheck

ianw · 2020-11-06T07:11:31Z

recheck

ssbarnea · 2020-11-06T08:57:24Z

This was a very busy week for me but I will look into during the weekend. I was expecting to find pip bugs while setting up the new pipelines.

Some may be considered infra related but some are test design bugs (or design limitation). Hopefully we will sort them.

For example the fact that testing fails when user has his own pip.conf config is a testing bug, the lack of isolation from machine setup. In fact is a very common bug, I fixes several and also made it.

I am always connected to #opendev and #pypa channels using @zbr nick, feel free to ping me there for faster responses. It would be very useful to see one or two pip core team members watching this ticket.

Regarding approach: I would slowly increase testing coverage while addressing the bugs we discover, especially as we also have a day job.

I am really pleased on how fast we managed to join efforts and get results. Thanks to everyone! This is, at least for myself a morale booster.

webknjaz · 2020-11-06T12:57:14Z

.zuul.playbooks/playbooks/tox/main.yaml

+    - name: Run tox integration tests with no network
+      include_role:
+        name: tox
+      vars:
+        tox_extra_args: '-- -m integration -n auto --use-venv'


Can units and integrations run in parallel jobs? Would it make sense?

They can. The other existing jobs seem to run separately so it fails earlier. Personally I'd just run it all as you suggest because that way you see all failures your change introduces, without having to run it through multiple times

Yep. The current CI setup resorts to using hacks with early cancellations because of the limited resources. So if Zuul can handle the load, it's better to build a more responsive setup.

removed the unit/integration split

Weren't we going to do the opposite?

Fair enough. Is it possible to use pre-baked images to improve the cold start? Like other CIs do.

There's always room for improvement, but there's probably two parts to that. Part of the setup is cloud and host specific things; stuff that we don't actually know until the node starts like what mirror to point it at and ssh keys.

The other bits are setting up tox and pip. We have very generic jobs because these are widely portable -- people use these building blocks not just on the VM images Opendev builds, but their own images in openshift, k8s, google compute, aws etc etc. For that reason, a lot of the framework in https://zuul-ci.org/docs/zuul-jobs/ is very intentionally lowest-common-denominator.

It's a trade off -- other CI systems probably don't have a goal of making completely re-usable jobs you can roll out to a wide range of very minimal testing environments. So we're always considering that when we build bits of the jobs.

I understand that. But let me rephrase: how hard would it be to make some image from say quay.io run this build?

In the way that I think you mean, this isn't possible in Opendev as the Opendev node environment is not container-based. Note that Zuul can certainly be setup to provide resources in many different ways, but in Opendev's Zuul instance you are getting a virtual-machine from the images we pre-build daily (qcow images are available @ https://nb01.opendev.org/images/ and I can go into how and why they are built, but I don't think that's the point here). So you can't tell this job to run on an arbitrary container.

You can install docker/podman on the host, and do things inside that like run tasks in whatever container you want. For example, the pypa manylinux builds use this to build wheels in a container; e.g. https://github.com/pyca/cryptography/blob/master/.zuul.playbooks/playbooks/wheel/roles/build-wheel-manylinux/tasks/main.yaml

This is a flexibility win, but because you have to install docker and pull the image, not a speed win for running something like tox.

Got it, thanks! I was just thinking about what we could do in the context of what @pradyunsg wished for when saying that it better be fast.

news/9103.feature.rst

tests/functional/test_search.py

pfmoore · 2020-11-10T08:09:46Z

Massively superficial (and hence I wouldn't class it as a "review") but a couple of comments:

Could we put everything under a single .zuul directory (maybe .zuul.d -> .zuul and .zuul.playbooks -> .zuul/playbooks) to reduce clutter in the top-level directory? On Windows, directories named with a dot aren't hidden...
There's only one link to an example of output that I could find. It's basically just a textfile, so it's much harder to read (or find the important bits) than our other CI systems (which highlight "important" bits). And when I tried clicking around in the interface on that page, I couldn't find any sort of "just pip" view, so I rapidly got lost in the various jobs and projects (the "projects" page for pypa/pip is empty).

I'd like to see a better example of how this would look "for real" before committing to it. As it stands, it doesn't feel as usable as our existing CI.

ianw · 2020-11-10T08:23:46Z

1. Could we put everything under a single `.zuul` directory (maybe `.zuul.d` -> `.zuul` and `.zuul.playbooks` -> `.zuul/playbooks`) to reduce clutter in the top-level directory? On Windows, directories named with a dot aren't hidden...

Zuul was considering everything under .zuul.d as job config, but I added https://review.opendev.org/#/c/744811/ to allow projects to do this. I'll have to double check if we have restarted things since that merged to see if it's live, but the answer is yes. The idea is that the Ansible playbooks, however, are fairly abstract, and can be useful to anyone using Ansible. I will admit, this is less important for pip, rather than things like server projects that you tend to deploy, where it makes great sense to be using the same logic for deploying in CI as in production as much as possible.

2. There's only one link to an example of output that I could find. It's basically just a textfile, so it's much harder to read (or find the important bits) than our other CI systems (which highlight "important" bits). And when I tried clicking around in the interface on that page, I couldn't find any sort of "just pip" view, so I rapidly got lost in the various jobs and projects (the "projects" page for pypa/pip is empty).

The build will report in the checks interfaces a link to a build result, e.g.

https://zuul.opendev.org/t/pypa/build/94c241f5c2b749bcb0ea4c8d6250142d

For the quickest results; click "artifacts -> pytest results"

For more info, start by going to "console". There you will see each ansible task run. You can scroll down to "Run tox" which is the task that ... runs tox :) You can click to expand, or click on the magnifying glass (actually, we've had some debate about how to make this more discoverable ... so suggestions welcome!)

You can also click on "logs" and see various text-based logs. This has everything, but in the most verbose/least UI-friendly mode.

You can always go to https://zuul.opendev.org/t/pypa/builds to see the latest builds that have run

This moves the playbooks under the Zuul config directory; note the .zuul.ignore to avoid them being read as configuration.

pfmoore · 2020-11-10T08:42:13Z

The build will report in the checks interfaces a link to a build result, e.g.

Thanks, that interface is a lot better. Unfamiliar, but it's unfair to criticise a new thing for being unfamiliar, so that's just something I'll have to get used to.

I'm still concerned over who's going to maintain this long-term (I have no personal interest in learning zuul) but if we get a good answer to that, then I'm essentially neutral on adding zuul as an extra CI. Ask me again if the proposal is to replace our existing CI with zuul, though 🙂

albinvass · 2020-11-10T16:09:12Z

I'm still concerned over who's going to maintain this long-term (I have no personal interest in learning zuul) but if we get a good answer to that, then I'm essentially neutral on adding zuul as an extra CI. Ask me again if the proposal is to replace our existing CI with zuul, though slightly_smiling_face

Keep in mind zuul is not allowed to do what it's best at which is avoiding regression since it's not gating the project.

If I didn't already have so many other things bogging me down I would be happy to help you out but I don't think I would have (much) time to spend maintaining anything. But you can always find me on #zuul irc if you need any help.

.zuul.d/playbooks/tox/main.yaml

This makes the console output less cluttered when you go to look at the actual tox run.

webknjaz · 2020-11-10T19:22:28Z

Add pytest-html and collect with Zuul as artifact

@ianw any chance to configure the output with contrast+accessibility? I know that it should be possible to feed custom styles into this plugin... The current color scheme is rather hard to consume.

.zuul.d/jobs.yaml

webknjaz · 2020-11-10T19:31:46Z

.zuul.d/playbooks/tox/main.yaml

+    - name: Remove OpenDev mirror config
+      file:
+        path: /etc/pip.conf
+        state: absent
+      become: yes


Maybe set PIP_CONFIG_FILE=/dev/null env var instead?
cc @pradyunsg

If yes, this could probably be done per-test to make use of the mirrors in tests that don't need to hit the production PyPI.

Or even PIP_ISOLATED=true (is this a thing?)

.zuul.d/playbooks/tox/main.yaml

ianw · 2020-11-10T19:32:32Z

@ianw any chance to configure the output with contrast+accessibility? I know that it should be possible to feed custom styles into this plugin... The current color scheme is rather hard to consume.

I guess you're referring to https://github.com/pytest-dev/pytest-html#appearance but I think that's just an example, as I can't see those CSS files actually defined anywhere?

However, if someone wants to write a custom css file, that is most certainly possible to deploy onto the host and have collected in the logs. You just want to use ansible copy: command to put it on the testing host, and collect it in the same way so it's alongside the html log. It seems to me you could download the existing HTML and work on a CSS file locally, and then I'm happy to work with anyone to help deploy it in the job.

webknjaz · 2020-11-10T21:18:55Z

.zuul.d/jobs.yaml

+- job:
+    name: pypa-pip-py38
+    parent: pypa-pip-base
+    nodeset: ubuntu-focal
+    vars:
+      tox_envlist: py38


Should we look into adding more envs? Other distros?

I feel like that was the initial reasoning for exploring Zuul :) @ssbarnea I think was most involved in that disucssion

Yes, I just wasn't sure if it's expected that they'd appear in this PR or follow-ups

My preference is probably to not overload this initial support with a bunch more things, but I'm open to suggestion. Note that any new jobs proposed in a pull request are speculatively tested by Zuul, so things will never get into a failure state

I'm not sure what you mean by saying that things won't get into a failure state. I think I may be unfamiliar with what speculative testing actually means in this context. Mind explaining?

What I mean is just like this change, any pull requests that modify/add/remove jobs in .zuul.d will be tested by Zuul with that config applied. i.e. if we add a pull request to add centos/debian/XYZ jobs as a follow-up pull-request, those jobs will run on that pull request before it is committed, and we can iterate, debug, etc. That's what I mean by speculative testing; the proposed config changes are tested by Zuul from the pull request, without being committed.

Oh, so it's basically the same as what other CIs do... I didn't know there's a special term for this behavior.

.zuul.d/jobs.yaml

pradyunsg · 2020-11-20T13:59:47Z

To set expectations about the timeline here: I think I can come back to this in a few days after 20.3 is out.

pfmoore · 2020-11-20T14:06:24Z

... and to set expectations about the next steps: I think the pip developers still need a discussion about whether we even want to use zuul before we move forward with this. And getting enough of the pip devs to comment will take more time than just having any one of us be available to review this PR.

webknjaz · 2021-02-03T14:01:51Z

In the light of #9087 (comment), I think it's time to revisit this effort.

webknjaz · 2021-04-02T12:04:24Z

@pradyunsg do you think this effort could be resurrected anytime soon?

webknjaz · 2021-04-02T12:06:56Z

@ianw the Zuul logs have been cleaned up. Maybe rebase this?

pradyunsg · 2021-04-02T12:08:27Z

I think the next steps here are outlined in #7279 (comment), and I won't have time in the near future to look into this.

pradyunsg · 2022-02-25T12:23:23Z

Closing this for now.

As an update, we've since consolidated pip's entire CI into GitHub Actions. I still think it's valuable to have additional CI resources but this is basically blocked on a discussion/decision, as noted in my previous comment. I hope this isn't percieved as a "no, never" but more along the lines of acknowledging that we're still a few steps away from saying "Yes" if we're going to.

Thanks all for the discussions and effort toward this so far! ^.^

ianw force-pushed the zuul branch from a5e9ae9 to dadab77 Compare November 6, 2020 00:29

ianw marked this pull request as draft November 6, 2020 00:37

webknjaz reviewed Nov 6, 2020

View reviewed changes

.zuul.playbooks/playbooks/tox/main.yaml Outdated Show resolved Hide resolved

ianw force-pushed the zuul branch from dadab77 to 226607c Compare November 6, 2020 01:03

ianw force-pushed the zuul branch from 226607c to 362b80e Compare November 6, 2020 01:34

ianw force-pushed the zuul branch 3 times, most recently from 3302777 to 7225f26 Compare November 6, 2020 02:42

ianw force-pushed the zuul branch 4 times, most recently from 64f53e7 to b70952f Compare November 6, 2020 06:23

ianw changed the title ~~[Not for merge yet] Zuul CI configuration~~ Zuul CI configuration Nov 6, 2020

webknjaz reviewed Nov 6, 2020

View reviewed changes

news/9103.feature.rst Outdated Show resolved Hide resolved

webknjaz reviewed Nov 6, 2020

View reviewed changes

tests/functional/test_search.py Outdated Show resolved Hide resolved

webknjaz mentioned this pull request Nov 8, 2020

Prepare initial Zuul CI setup #9103

Closed

4 tasks

Move Zuul playbooks under .zuul.d

ec74302

This moves the playbooks under the Zuul config directory; note the .zuul.ignore to avoid them being read as configuration.

ianw force-pushed the zuul branch from dcedf12 to ec74302 Compare November 10, 2020 08:40

albinvass reviewed Nov 10, 2020

View reviewed changes

.zuul.d/playbooks/tox/main.yaml Outdated Show resolved Hide resolved

Zuul: run tox install in pre step

1ebdd81

This makes the console output less cluttered when you go to look at the actual tox run.

webknjaz reviewed Nov 10, 2020

View reviewed changes

.zuul.d/jobs.yaml Outdated Show resolved Hide resolved

webknjaz reviewed Nov 10, 2020

View reviewed changes

.zuul.d/playbooks/tox/main.yaml Outdated Show resolved Hide resolved

webknjaz reviewed Nov 10, 2020

View reviewed changes

webknjaz reviewed Nov 11, 2020

View reviewed changes

.zuul.d/jobs.yaml Outdated Show resolved Hide resolved

Zuul CI : rename task to reflect it does all tox testing

994f1f5

ianw force-pushed the zuul branch from 1e83d82 to 994f1f5 Compare November 11, 2020 23:58

webknjaz approved these changes Nov 18, 2020

View reviewed changes

webknjaz requested a review from pradyunsg February 3, 2021 14:01

pradyunsg removed their request for review April 2, 2021 12:07

github-actions bot added the needs rebase or merge PR has conflicts with current master label Aug 15, 2021

pradyunsg closed this Feb 25, 2022

github-actions bot locked as resolved and limited conversation to collaborators Mar 13, 2022

Zuul CI configuration #9107

Zuul CI configuration #9107

Conversation

ianw commented Nov 6, 2020

webknjaz commented Nov 6, 2020

ianw commented Nov 6, 2020

webknjaz commented Nov 6, 2020

webknjaz commented Nov 6, 2020

ianw commented Nov 6, 2020 • edited Loading

ianw commented Nov 6, 2020

ianw commented Nov 6, 2020 • edited Loading

ianw commented Nov 6, 2020 • edited Loading

ianw commented Nov 6, 2020

ianw commented Nov 6, 2020 • edited Loading

ianw commented Nov 6, 2020

ianw commented Nov 6, 2020

ssbarnea commented Nov 6, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianw Nov 10, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianw Nov 10, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pfmoore commented Nov 10, 2020

ianw commented Nov 10, 2020 • edited Loading

pfmoore commented Nov 10, 2020

albinvass commented Nov 10, 2020

webknjaz commented Nov 10, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianw commented Nov 10, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pradyunsg commented Nov 20, 2020

pfmoore commented Nov 20, 2020

webknjaz commented Feb 3, 2021

webknjaz commented Apr 2, 2021

webknjaz commented Apr 2, 2021

pradyunsg commented Apr 2, 2021

pradyunsg commented Feb 25, 2022

ianw commented Nov 6, 2020 •

edited

Loading

ianw commented Nov 6, 2020 •

edited

Loading

ianw commented Nov 6, 2020 •

edited

Loading

ianw commented Nov 6, 2020 •

edited

Loading

ianw Nov 10, 2020 •

edited

Loading

ianw Nov 10, 2020 •

edited

Loading

ianw commented Nov 10, 2020 •

edited

Loading