cmd-sign should handle lost connections gracefully #1072

Closed
jlebon opened this issue Jan 24, 2020 · 1 comment
jlebon commented Jan 24, 2020

FCOS pipeline hit this while signing images:

+ cosa sign robosignatory --s3 fcos-builds/prod/streams/testing-devel/builds --extra-fedmsg-keys stream=testing-devel --images --gpgkeypath /etc/pki/rpm-gpg --fedmsg-conf /etc/fedora-messaging-cfg/fedmsg.toml
Successfully started consumer thread
Sending artifacts-sign request for build 31.20200121.20.1
Waiting for response from RoboSignatory
The connection to the broker was lost (ConnectionLost('Connection lost')), consumer halted; the connection should restart and consuming will resume.
Traceback (most recent call last):
  File "/usr/lib/coreos-assembler/cmd-sign", line 380, in <module>
    sys.exit(main())
  File "/usr/lib/coreos-assembler/cmd-sign", line 64, in main
    args.func(args)
  File "/usr/lib/coreos-assembler/cmd-sign", line 124, in cmd_robosignatory
    robosign_images(args, s3, cond)
  File "/usr/lib/coreos-assembler/cmd-sign", line 232, in robosign_images
    validate_response(cond)
  File "/usr/lib/coreos-assembler/cmd-sign", line 312, in validate_response
    raise Exception("Timed out waiting for RoboSignatory")
Exception: Timed out waiting for RoboSignatory

I think what happened is that the consumer didn't actually resume watching for the finished request after the ConnectionLost, so we timed out. We need to investigate whether we're supposed to handle this in our code or whether fedora-messaging itself is supposed to do it, as the error message implies.
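
For illustration, here is a minimal sketch of how the waiting side could tell "the consumer died" apart from "the response is just slow" and restart the consumer within the same overall deadline. It uses only the Python standard library and does not call fedora-messaging. The names `wait_for_response`, `start_consumer`, `deliver`, and `ConsumerDied` are hypothetical and are not taken from cmd-sign. Whether a restart like this is needed at all depends on the open question above, since the library may already reconnect on its own.

```python
import threading
import time


class ConsumerDied(Exception):
    """Hypothetical: raised by a consumer callable when its connection drops."""


def wait_for_response(start_consumer, timeout_sec, poll_sec=0.05):
    """Wait up to timeout_sec for a response, restarting a dead consumer.

    start_consumer(deliver) is a hypothetical callable that blocks while
    consuming, calls deliver(msg) once the awaited message arrives, and
    returns or raises ConsumerDied if the connection is lost.
    Returns (response, number_of_restarts).
    """
    cond = threading.Condition()
    state = {"response": None, "restarts": 0}

    def deliver(msg):
        # Called from the consumer thread: record the message, wake the waiter.
        with cond:
            state["response"] = msg
            cond.notify_all()

    def run():
        try:
            start_consumer(deliver)
        except ConsumerDied:
            pass  # the supervising loop below notices the dead thread

    def spawn():
        thread = threading.Thread(target=run, daemon=True)
        thread.start()
        return thread

    deadline = time.monotonic() + timeout_sec
    thread = spawn()
    with cond:
        while state["response"] is None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("Timed out waiting for response")
            # Wake up periodically so a dead consumer is noticed early
            # instead of only when the full timeout expires.
            cond.wait(min(poll_sec, remaining))
            if state["response"] is None and not thread.is_alive():
                state["restarts"] += 1
                thread = spawn()
    return state["response"], state["restarts"]
```

The point of the design is that the deadline covers the whole wait, so a lost connection costs one restart rather than the entire timeout.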

jcajka pushed a commit to jcajka/coreos-assembler that referenced this issue Mar 24, 2020
platform/unprivqemu: Drop restrict=yes

jlebon commented Sep 21, 2023

We haven't seen this in a while. It might predate the move to fedora-messaging? Anyway, let's close and we can always reopen.

@jlebon jlebon closed this as completed Sep 21, 2023