windows: system connection: unexpected ports #22844

Closed
edsantiago opened this issue May 29, 2024 · 6 comments · Fixed by #23154
Labels
flakes Flakes from Continuous Integration locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. machine windows issue/bug on Windows


@edsantiago
Member

edsantiago commented May 29, 2024

  start machine with conflict on SSH port
...
[FAILED] Expected
      <[]string | len:4, cap:4>: ["50443", "50443", "50229", "50229"]    <<<<< 50443 is in there
  to contain element matching
      <string>: 50443
  In [It] at: C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo[/pkg/machine/e2e/start_test.go:113](https://github.com/containers/podman/blob/1093ebb72be3117c5312b702ea89ecfcab5d3452/pkg/machine/e2e/start_test.go#L113) @ 05/28/24 12:52:44.223

Maybe it's a windows ^M thing?
My misunderstanding. The error message means "I want 50443 and ONLY 50443, nothing else". The 50229 is what's causing the problem.

  • windows : machine-hyperv podman windows rootless host sqlite
    • PR test/system: make some tests faster part 1 #22821
      • 05-28 08:59 in podman machine start C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:16 start machine with conflict on SSH port C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:94
    • PR update golangci-lint to v1.59.0 #22815
      • 05-27 07:03 in podman machine start C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:16 start machine with conflict on SSH port C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:94
    • PR libpod: wait another interval for healthcheck #22764
      • 05-27 08:07 in podman machine start C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:16 start machine with conflict on SSH port C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:94
    • PR [v5.1] applehv: Rosetta support #22757
      • 05-20 11:21 in podman machine start C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:16 start machine with conflict on SSH port C:/Users/Administrator/AppData/Local/cirrus-ci-build/repo/pkg/machine/e2e/start_test.go:94
machine-hyperv(4) podman(4) windows(4) rootless(4) host(4) sqlite(4)
@edsantiago edsantiago added flakes Flakes from Continuous Integration machine labels May 29, 2024
@Luap99
Member

Luap99 commented Jun 25, 2024

> Maybe it's a windows ^M thing?

Hmm, yeah, a non-printable character sounds plausible, but why would that flake? I would expect such things to be consistent.

Still seeing this? I guess I can throw up a PR to trim spaces from the string, and then we can see if it still reproduces in the future.

@Luap99
Member

Luap99 commented Jun 25, 2024

Actually, I have now looked at the code, and the issue is not an invisible character. The code matches:
Expect(connectionPorts).To(HaveEach(inspectPort))

This means all connections should have the same port. Per the code there, we only created a single machine, so why do we still have connections from another machine in there? This sounds more like an issue with incomplete cleanup.
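
To make the difference concrete, here is a minimal sketch in plain Go of the two checks being confused. `haveEach` and `containsElement` are hypothetical stand-ins written for illustration, not Gomega's implementation. As far as I understand Gomega, `HaveEach` also fails on an empty collection, which the sketch mirrors.

```go
package main

import "fmt"

// haveEach is a hypothetical stand-in for the semantics of Gomega's
// HaveEach matcher: true only if got is non-empty and every element
// equals want. (The empty-slice behavior is an assumption.)
func haveEach(got []string, want string) bool {
	if len(got) == 0 {
		return false
	}
	for _, g := range got {
		if g != want {
			return false
		}
	}
	return true
}

// containsElement is the weaker check: at least one element equals want.
func containsElement(got []string, want string) bool {
	for _, g := range got {
		if g == want {
			return true
		}
	}
	return false
}

func main() {
	// The ports reported in the failure message above.
	ports := []string{"50443", "50443", "50229", "50229"}
	fmt.Println("contains 50443:", containsElement(ports, "50443"))
	fmt.Println("each is 50443: ", haveEach(ports, "50443"))
}
```

With the ports from the failure, the "contains" check passes while the "each" check does not, which is why the leftover 50229 connections fail the test even though 50443 is present.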

Other tests in the log also failed with:

  Error: unable to connect to Podman socket: failed to read identity "C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\podman_test2399464602\\.local\\share\\containers\\podman\\machine\\machine": open C:\Users\ADMINI~1\AppData\Local\Temp\podman_test2399464602\.local\share\containers\podman\machine\machine: The system cannot find the path specified.

So I assume something more general is broken in the Windows testing, and I would think that it relates directly to #22843.

@edsantiago
Member Author

Still happening as of last week. This is an infrequent one:

@Luap99 Luap99 added the windows issue/bug on Windows label Jun 25, 2024
@edsantiago edsantiago changed the title windows: expected array [X Y Z] to contain X windows: system connection: unexpected ports Jun 27, 2024
@edsantiago
Member Author

Can someone instrument

newMachineEvent(events.Init, events.Event{Name: initOpts.Name})
fmt.Println("Machine init complete")
so it spits out the system connection info?

@Luap99
Member

Luap99 commented Jul 1, 2024

I can, but what is the point? We can add podman system connection list to the tests' BeforeEach/AfterEach if you want to debug this. But given that all logs so far show at least one other failure earlier in the run, I am pretty sure this issue is caused by the failing cleanup in the other tests.

Most likely the best fix is to use the PODMAN_CONNECTIONS_CONF env variable to point each test at a new tmp file as connection storage; that way we can ensure connections never leak between tests.

@Luap99
Member

Luap99 commented Jul 1, 2024

Looking at it, there are more problems in teardown().

One weird failure is not shown at all because it is not considered fatal:

  C> podman.exe machine rm --force ac24eebe6478
  Error: failure advancing enumeration (2147749924)

Luap99 added a commit to Luap99/libpod that referenced this issue Jul 1, 2024
Currently all podman machine rm errors in AfterEach were ignored.
This means some leaked and caused issues later on, see containers#22844.

To fix it, first rework the logic to only remove machines when needed at
the place where they are created using DeferCleanup(). However,
DeferCleanup() does not work well together with AfterEach() as it always
runs AfterEach() before DeferCleanup(). As AfterEach() deletes the dir,
the podman machine rm call cannot be done afterwards.

As such, migrate all cleanup to use DeferCleanup() and, while I have to
touch this, fix the code to remove the per-file duplication and define
the setup/cleanup once in the global scope.

Signed-off-by: Paul Holzinger <[email protected]>
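
The ordering problem described in the commit message can be illustrated with a toy model in plain Go. This is not Ginkgo itself: the `spec` type and `runExample` are hypothetical, and the reverse-order execution of cleanups is an assumption about how deferred callbacks are run. The only behavior taken from the commit message is that AfterEach runs before DeferCleanup.

```go
package main

import "fmt"

// spec is a toy model (not Ginkgo) of one test: AfterEach hooks run first,
// callbacks registered via a DeferCleanup-like list run afterwards.
type spec struct {
	afterEach []func()
	cleanups  []func()
	log       []string
}

// run executes the body, then AfterEach hooks, then deferred cleanups.
func (s *spec) run(body func()) {
	body()
	for _, f := range s.afterEach {
		f()
	}
	// Assumption: deferred cleanups run in reverse registration order.
	for i := len(s.cleanups) - 1; i >= 0; i-- {
		s.cleanups[i]()
	}
}

// runExample mixes both mechanisms, as the old test code effectively did.
func runExample() []string {
	s := &spec{}
	dirExists := true
	s.afterEach = append(s.afterEach, func() {
		dirExists = false // AfterEach deletes the test directory
		s.log = append(s.log, "AfterEach: removed test dir")
	})
	s.run(func() {
		s.log = append(s.log, "It: machine init")
		s.cleanups = append(s.cleanups, func() {
			if dirExists {
				s.log = append(s.log, "DeferCleanup: machine rm ok")
			} else {
				s.log = append(s.log, "DeferCleanup: machine rm failed, dir is gone")
			}
		})
	})
	return s.log
}

func main() {
	for _, line := range runExample() {
		fmt.Println(line)
	}
}
```

In this model the machine removal runs only after the directory is gone, which is the reason the commit moves all cleanup, including the directory removal, to DeferCleanup().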
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Oct 1, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Oct 1, 2024
2 participants