better error message required in case error in CNI plugin. #2909

Closed
kunalkushwaha opened this issue Apr 12, 2019 · 17 comments · Fixed by #5008
Labels
do-not-close kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. stale-issue


@kunalkushwaha
Collaborator

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug

Description

When a CNI config file contains an error, creating a container on that CNI network fails with only the message CNI network "<network-name>" not found. This message is confusing for end users because it does not mention that the config file failed to load.

A better error message would help users understand where the error actually is.

Raised by podman end user:

Steps to reproduce the issue:

  1. Create a CNI config file that contains an error.
$ cat /etc/cni/net.d/77-ipvlan.conflist 
{
    "cniVersion": "0.3.0",
    "name": "myvlan",
    "plugins": [
      {
        "type": "ipvlan",
        "master": "enp1s0", #some comments
        "ipam": {
            "type": "host-local",
            "subnet": "10.88.0.0/16",
            "routes": [
                { "dst": "0.0.0.0/0" }
            ]
        }
     }
    ]
}
  2. Create a container with the myvlan network.
$ sudo podman  run --rm -it --network=myvlan docker.io/library/alpine sh
ERRO[0000] CNI network "myvlan" not found               
Error: error configuring network namespace for container baf2fda585db0cc2c874a4bfc00bca3dd27e8402a18ec8e157124240ca887f29: CNI network "myvlan" not found
  3. To see the full details of the error, the user has to run podman create with the --log-level=debug option.
DEBU[0000] overlay test mount indicated that metacopy is not being used
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
WARN[0000] Error loading CNI config list file /etc/cni/net.d/77-ipvlan.conflist: error parsing configuration list: invalid character '#' looking for beginning of object key string
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
WARN[0000] Error loading CNI config file /etc/cni/net.d/97-podman-macvlan.conf: error parsing configuration: missing 'type'
INFO[0000] Found CNI network lo (type=loopback) at /etc/cni/net.d/99-loopback.conf
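
The behaviour requested in this report can be sketched as follows. This is a minimal illustration in Python, not podman's or ocicni's actual code: the function name find_network, the CNI_EXTENSIONS set, and the wording of the extended message are all assumptions made for the example. It looks a network up by name and, when the lookup fails, appends the config files that could not be parsed to the "not found" message instead of hiding them behind debug logging.

```python
import json
from pathlib import Path

# Assumed set of file extensions to scan; hypothetical for this sketch.
CNI_EXTENSIONS = {".conf", ".conflist", ".json"}


def find_network(conf_dir, name):
    """Return the parsed config whose "name" matches, or raise LookupError.

    The LookupError message lists every config file that failed to load,
    so a broken file is visible without enabling debug logging.
    """
    failures = []
    for path in sorted(Path(conf_dir).iterdir()):
        if path.suffix not in CNI_EXTENSIONS:
            continue
        try:
            conf = json.loads(path.read_text())
        except (OSError, ValueError) as exc:
            # Remember which file was broken and why, then keep scanning.
            failures.append(f"{path}: {exc}")
            continue
        if isinstance(conf, dict) and conf.get("name") == name:
            return conf

    message = f'CNI network "{name}" not found'
    if failures:
        message += "; these config files could not be loaded:\n  "
        message += "\n  ".join(failures)
    raise LookupError(message)
```

With the 77-ipvlan.conflist file from step 1 in the scanned directory, a lookup of myvlan would raise an error that names that file alongside the parser's complaint. The exact parser wording would differ from the Go message shown in the debug log above.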

Describe the results you received:

ERRO[0000] CNI network "myvlan" not found               
Error: error configuring network namespace for container baf2fda585db0cc2c874a4bfc00bca3dd27e8402a18ec8e157124240ca887f29: CNI network "myvlan" not found

Describe the results you expected:
A better error message indicating where the error is.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

$ podman version
Version:            1.3.0-dev
RemoteAPI Version:  1
Go Version:         go1.12.3
Git Commit:         cb2b019d5debadbe29cba59e93130bd8c562771a-dirty
Built:              Fri Apr 12 10:07:18 2019
OS/Arch:            linux/amd64

Output of podman info --debug:

$ podman info --debug                                                                       
debug:
  compiler: gc
  git commit: cb2b019d5debadbe29cba59e93130bd8c562771a-dirty
  go version: go1.12.3
  podman version: 1.3.0-dev
host:
  BuildahVersion: 1.7.2
  Conmon:
    package: Unknown
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.14.0-dev, commit: f02fc40ed55504247af4fbf09fd8577d315a6c73'
  Distribution:
    distribution: elementary
    version: "5.0"
  MemFree: 6248939520
  MemTotal: 24985653248
  OCIRuntime:
    package: 'cri-o-runc: /usr/bin/runc'
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 2140925952
  SwapTotal: 2147479552
  arch: amd64
  cpus: 8
  hostname: kunal-HP-dev
  kernel: 4.15.0-46-generic
  os: linux
  rootless: true
  uptime: 167h 43m 36.61s (Approximately 6.96 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /home/kunal/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/kunal/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 0
  RunRoot: /run/user/1000/run
  VolumePath: /home/kunal/.local/share/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 12, 2019
@kunalkushwaha
Collaborator Author

It may not be clean to show CNI config error messages at this point, since all CNI configs are loaded during container create.

If we had an additional command like podman network ls, which lists all valid networks found by scanning the CNI configs, then showing error messages for invalid configurations there would be easy and not out of context.

This may also help end users who are not familiar with CNI see which networks are available on the system without looking at the CNI config path.
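
A rough sketch of what such a listing could do, written in Python purely for illustration. The function name list_networks, the extension set, and the return shape are assumptions for this example and do not reflect how podman implements any command. It scans a config directory and reports valid networks separately from files that failed to load.

```python
import json
from pathlib import Path


def list_networks(conf_dir):
    """Scan conf_dir and return (networks, errors).

    networks: list of (network name, file path) for configs that parsed.
    errors:   list of (file path, reason) for configs that did not.
    """
    networks, errors = [], []
    for path in sorted(Path(conf_dir).iterdir()):
        # Assumed extensions; hypothetical for this sketch.
        if path.suffix not in {".conf", ".conflist", ".json"}:
            continue
        try:
            conf = json.loads(path.read_text())
            if not isinstance(conf, dict) or "name" not in conf:
                raise ValueError("missing 'name'")
        except (OSError, ValueError) as exc:
            errors.append((str(path), str(exc)))
            continue
        networks.append((conf["name"], str(path)))
    return networks, errors
```

A caller could print the valid networks first and then one line per broken file, which keeps the parse errors in a context where the user is already asking about network configuration.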

@mheon
Member

mheon commented Apr 12, 2019

Hm. The CNI plugins load and try to parse the network, fail, and this is only considered a warning? Part of me thinks we should upgrade that to an error, in which case we fail much earlier (when first trying to load the runtime) and more descriptively (an explicit error about parsing a config)

@kunalkushwaha
Collaborator Author

The issue with treating the parse error as an error and stopping container creation is that container create would fail even if that CNI network is not involved in the command.

For example, with the above config file, even a plain sudo podman run --rm -it - docker.io/library/alpine sh would fail.
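
The trade-off being discussed can be made concrete with a small Python sketch. Everything here is hypothetical (the load_configs function, its strict flag, and the ConfigLoadError class are invented for illustration and are not part of ocicni or podman). In lenient mode a broken file becomes a warning and the remaining networks stay usable; in strict mode a single broken file aborts loading, regardless of which network the command asked for.

```python
import json
from pathlib import Path


class ConfigLoadError(Exception):
    """Raised in strict mode when any config file fails to parse."""


def load_configs(conf_dir, strict=False):
    """Parse CNI-style JSON config files in conf_dir.

    Returns (configs, warnings), where configs maps network name to the
    parsed config. With strict=True the first unparsable file raises
    ConfigLoadError instead of being recorded as a warning.
    """
    configs, warnings = {}, []
    for path in sorted(Path(conf_dir).iterdir()):
        # Assumed extensions; hypothetical for this sketch.
        if path.suffix not in {".conf", ".conflist", ".json"}:
            continue
        try:
            conf = json.loads(path.read_text())
        except (OSError, ValueError) as exc:
            problem = f"in {path}: {exc}"
            if strict:
                raise ConfigLoadError(problem) from exc
            warnings.append(problem)
            continue
        if isinstance(conf, dict) and "name" in conf:
            configs[conf["name"]] = conf
    return configs, warnings
```

In this sketch, strict loading fails early and descriptively, but it also fails for commands that only need the default network, which is the concern raised above.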

@mheon
Member

mheon commented Apr 12, 2019

I don't know if that's a bad thing... It's an invalid configuration file and we can't know that we won't be using it.

@kunalkushwaha
Collaborator Author

I don't have a very strong opinion on the error. I can create a PR for this.

Would listing the available networks also be helpful? e.g. podman network ls?

@mheon
Member

mheon commented Apr 15, 2019

If we're going to start adding podman network commands (and I'm certainly not opposed to doing so), network ls is definitely a good starting point.

@kunalkushwaha
Collaborator Author

I have created a PR, cri-o/ocicni#30, to return an error instead of logging a warning during InitCNI(). Since it will affect both CRI-O and podman, we need to hear the CRI-O team's views too.

@rhatdan
Member

rhatdan commented Aug 5, 2019

@kunalkushwaha Are you still interested in this?

@kunalkushwaha
Collaborator Author

@rhatdan Sorry for the lack of progress. I will follow up again with the ocicni maintainers.

I still think CNI library behaviour should be the same across implementations, so this will help end users.

@karl-tpio

I just got bit by this. I would prefer that podman implement something akin to podman network so that marketing like this alias podman=docker post will actually be true.

root@lab:/etc/cni/net.d# podman network ls
Error: unrecognized command `podman network`
Try 'podman --help' for more information.

versus

https://docs.docker.com/engine/reference/commandline/network_ls/

@mheon
Member

mheon commented Aug 27, 2019

Will be in 1.5.2

@ryanj

ryanj commented Sep 9, 2019

Just hit this issue with FCoS IoT and podman-1.4.5-dev... was using a systemd unit to manage podman.

I was testing to make sure that systemd could help gracefully handle a reboot cycle, but my /var/lib/cni/networks/podman/last_reserved_ip.0 file ended up pointed to a stale ip address, which was no longer valid (given the new subnet scope, post-reboot)

@github-actions

github-actions bot commented Nov 4, 2019

This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.

@vrothberg
Member

Any updates on this issue?

@rhatdan
Member

rhatdan commented Nov 4, 2019

@mheon Is this fixed in 1.5.2?

@mheon
Member

mheon commented Nov 4, 2019

No. That's podman network create. CNI error messages are still not landed.

baude added a commit to baude/podman that referenced this issue Jan 28, 2020
if one of the cni conf files is badly formatted or cannot be loaded, we now display the error as well as the filename.

Fixes: containers#2909
Signed-off-by: Brent Baude <[email protected]>
@baude
Member

baude commented Jan 28, 2020

I believe #5008 will make it easier to find out that you have a bad CNI conf file. For example, I purposely made a bad conf file named foo:

$ sudo podman run --network foo --rm -it alpine ls
ERRO[0000] CNI network "foo" not found                  
Error: error configuring network namespace for container 007f4d86f280dc2d13daf630ea78ef8a29bb92d9bc4f4140749a13e62d734e1f: CNI network "foo" not found
$ sudo podman network ls
Error: in /etc/cni/net.d/foo.conflist: error parsing configuration list: invalid character '{' looking for beginning of object key string
$ sudo podman network inspect foo
Error: in /etc/cni/net.d/foo.conflist: error parsing configuration list: invalid character '{' looking for beginning of object key string

@baude baude self-assigned this Jan 28, 2020
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023