Error when passing network id for connecting a network to a container #9451
Comments
This doesn't look REST API related. It looks like the network connect call is corrupting the state (specifically network results blocks). Please provide more details on the test container.
OK I see the problem. My network ID magic works only on the libpod side. Network connect checks if the network exists, and this passes because it is a valid ID, but when we pass the ID as the network name to OCICNI it fails. At that point we have already added the network to the state, which causes problems with podman inspect and network ls because the state and the CNI information no longer match. I think the same problem exists for a plain podman run --network ID ...
@Luap99 thanks for the quick fix. I copied your latest files for testing, it is working for most cases. However, it failed in our program that uses
The following is the output:
@Luap99 Btw, if you run the above script one command at a time, it works well.
@linggao Thanks for the script. I cannot reproduce it, but that's somewhat expected: this is a race condition, and I see it a lot in our CI. I already opened a PR, cri-o/ocicni#85, which should fix this problem.
The libpod network logic knows about network IDs, but OCICNI does not, so we cannot pass a network ID to OCICNI. Instead, we need to make sure we only use network names internally. This also matters for libpod, since only network names are stored in the state; if an ID were added there, the same network could accidentally be added twice. Fixes containers#9451 Signed-off-by: Paul Holzinger <[email protected]>
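The idea in this commit message can be illustrated with a small sketch. It is not the libpod code (which is Go); the function names are hypothetical, and the ID scheme (hex SHA-256 of the network name) is an assumption made only so the example is self-contained. The point is that every user-supplied reference is resolved to the canonical name before anything is stored or handed to the CNI layer.

```python
import hashlib
from typing import Sequence, Set


def network_id(name: str) -> str:
    """Assumed ID scheme for this sketch: hex SHA-256 of the network name."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def normalize_network_name(name_or_id: str, known_names: Sequence[str]) -> str:
    """Resolve a network name, full ID, or unique ID prefix to the name."""
    if not name_or_id:
        raise LookupError("empty network reference")
    if name_or_id in known_names:
        return name_or_id
    matches = [n for n in known_names if network_id(n).startswith(name_or_id)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"network {name_or_id!r} not found")
    raise LookupError(f"network ID prefix {name_or_id!r} is ambiguous")


def connect(state: Set[str], name_or_id: str, known_names: Sequence[str]) -> None:
    """Record the connection under the canonical name only.

    Normalization happens before the state is touched, so a failed lookup
    leaves the state unchanged and an ID can never create a duplicate entry.
    """
    state.add(normalize_network_name(name_or_id, known_names))
```

With this shape, connecting by ID and then by name leaves a single entry in the state, which is the duplicate case the commit message warns about.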
@Luap99 thanks for the info. Just to let you know, the error happens every time on my VM with the script. I have RHEL 8.3 with the latest podman code + this PR. Should I upgrade cri-o? How do I do it?
I see in your podman info output that you only have two cores. The problem is that the cri-o library updates the network list asynchronously with fsnotify. On a slow system this might happen after we already try to use the new net. You cannot update the OCICNI library, as it is compiled directly into podman.
@Luap99 I saw this PR is merged.
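For callers who hit this race before a fixed build is available, one possible client-side mitigation is to poll until the new network is visible before using it. The sketch below is only an illustration of that idea, not what the ocicni PR does; `wait_until` and its parameters are hypothetical names, and the predicate would be whatever check the caller already has (for example, looking for the network in the list returned by the API).

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll predicate() until it returns True or the timeout expires.

    Returns True as soon as the predicate holds, False on timeout.
    The predicate is always evaluated at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would create the network, call `wait_until` with a check for that network, and only then connect the container. Polling hides the race rather than removing it, so it is a stopgap at best.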
also see #9449
@Luap99 thanks a lot for all of your help. I built podman with the ocicni fix you made, and the initial tests in our code look very good.
@linggao Open a BZ on the Red Hat Bugzilla, targeted against Podman on RHEL 8.4, for this issue. We can use that to justify the backport. For reference, upstream 3.0.2 is probably landing sometime next week (Wednesday/Thursday seems likely?) but RHEL 8.4 will be staying on 3.0.1 with selective backports.
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
podman REST API is not compatible with docker REST API for /networks/{id or name}/connect and /networks/{id or name}/disconnect.
Our code connects a network to a container with the docker REST API https://docs.docker.com/engine/api/v1.41/#operation/NetworkConnect through godockerclient. This API accepts either the network id or the network name. However, when the same API is called on /var/run/podman/podman.sock with a network id, not only does an error occur, but the network and the container can also no longer be queried.
Steps to reproduce the issue:
podman network create foo-a
podman run --name test --network foo-a -d alpine sleep 1000
podman network create foo-b
get network id for foo-b
curl -sSLw "%{http_code}" --unix-socket /var/run/podman/podman.sock http://localhost/networks | jq
podman network connect {network-id-for-foo-b} test
or
read -d '' sdef <<EOF
{
"Container":"{id-for-container-test}",
"EndpointConfig": {
}
}
EOF
echo "$sdef" | curl -sLX POST --data @- -H "Content-Type: application/json" -H "Accept: application/json" --unix-socket /var/run/podman/podman.sock http://localhost/networks/{network-id-for-foo-b}/connect
podman inspect test
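On an affected version, a client can sidestep the problem by resolving the ID to a name itself and connecting by name. The following is a minimal sketch under stated assumptions: GET /networks returns Docker-style objects with "Id" and "Name" keys, the socket path is the one used in the steps above, and the helper names (`name_for_network`, `connect_request`, `connect_by_name`) are hypothetical.

```python
import http.client
import json
import socket
import urllib.parse


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def name_for_network(networks, name_or_id):
    """Return the name of the network whose Name or Id equals name_or_id."""
    for net in networks:
        if name_or_id in (net.get("Name"), net.get("Id")):
            return net["Name"]
    raise LookupError(f"network {name_or_id!r} not found")


def connect_request(network_name, container):
    """Build the path and JSON body for the network connect call."""
    path = "/networks/" + urllib.parse.quote(network_name, safe="") + "/connect"
    body = json.dumps({"Container": container, "EndpointConfig": {}})
    return path, body


def connect_by_name(socket_path, name_or_id, container):
    """Resolve name_or_id to a name, then POST the connect request."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/networks")
        networks = json.loads(conn.getresponse().read())
        path, body = connect_request(
            name_for_network(networks, name_or_id), container)
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        return resp.status, resp.read().decode()
    finally:
        conn.close()
```

For example, `connect_by_name("/var/run/podman/podman.sock", "{network-id-for-foo-b}", "test")` would send the same POST as the curl command above, but with the network name in the URL.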
Describe the results you received:
# echo "$sdef" | curl -sLX POST --data @- -H "Content-Type: application/json" -H "Accept: application/json" --unix-socket /var/run/podman/podman.sock http://localhost/networks/e7953c3a62f54803cbfbbe9db231d8895b12b06f4d493d8d56c333bebb3b6e09/connect
{"cause":"CNI network \"e7953c3a62f54803cbfbbe9db231d8895b12b06f4d493d8d56c333bebb3b6e09\" not found","message":"CNI network \"e7953c3a62f54803cbfbbe9db231d8895b12b06f4d493d8d56c333bebb3b6e09\" not found","response":500}
# podman inspect test
Error: network inspection mismatch: asked to join 2 CNI network(s) [e7953c3a62f54803cbfbbe9db231d8895b12b06f4d493d8d56c333bebb3b6e09 foo-a], but have information on 1 network(s): internal libpod error
# curl -sSLw "%{http_code}" --unix-socket /var/run/podman/podman.sock http://localhost/networks | jq
{
"cause": "internal libpod error",
"message": "network inspection mismatch: asked to join 2 CNI network(s) [e7953c3a62f54803cbfbbe9db231d8895b12b06f4d493d8d56c333bebb3b6e09 foo-a], but have information on 1 network(s): internal libpod error",
"response": 500
}
Describe the results you expected:
The podman REST API and CLI should accept either the network id or the network name when connecting a network to a container or disconnecting it from one.
This is how docker behaves, and the podman REST API claims to be compatible with the docker REST API.
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
Output of podman info --debug:
Package info (e.g. output of rpm -q podman or apt list podman):
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
RHEL 8.3