[BUG] If the memory config is provided to a cluster, then adding new nodes does not work #893

Closed
toddnni opened this issue Dec 14, 2021 · 1 comment

toddnni commented Dec 14, 2021

What did you do

  • How was the cluster created?

    • k3d cluster create --servers-memory 2048MiB
  • What did you do afterwards?

    • List the clusters:
    % k3d cluster list 
    NAME          SERVERS   AGENTS   LOADBALANCER
    k3s-default   1/1       0/0      true
    
    • Try to add a node:
    % k3d node create failing
    INFO[0000] Adding 1 node(s) to the runtime local cluster 'k3s-default'... 
    INFO[0000] Using the k3d-tools node to gather environment information 
    INFO[0000] Starting new tools node...                   
    INFO[0000] Starting Node 'k3d-k3s-default-tools'        
    INFO[0000] HostIP: using network gateway 172.24.0.1 address 
    FATA[0001] failed to add 1 node(s) to the runtime local cluster 'k3s-default': failed to add one or more nodes: failed to run node 'k3d-failing-0': failed to create node 'k3d-failing-0': runtime failed to create node 'k3d-failing-0': failed to create container for node 'k3d-failing-0': docker failed to create container 'k3d-failing-0': Error response from daemon: Duplicate mount point: /proc/meminfo 
    

What did you expect to happen

The node should be added to the existing cluster.

The error seems to happen in the following cases:

  1. You create a cluster with a server memory limit and no agents, then try to add a new node.
  2. You create a cluster with agents that have a memory limit, then try to add a new node.
  3. You add a node with a memory limit to an existing cluster, then try to add another node.

The first case is the one shown above; examples for the other two follow below.

However, the error does not happen if you:

a. Create a cluster with a server memory limit and agents without memory limits.
b. Create a cluster without any memory limits (the default use case).

This seems to be related to the /proc/meminfo volume cleanup not triggering properly (see https://github.com/rancher/k3d/blob/main/pkg/client/node.go#L118): the trace log contains no "Dropping copied volume mount ..." lines.
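To illustrate why Docker would reject such a node, here is a minimal sketch. It assumes (this is a reading of the symptom, not confirmed k3d behaviour) that the new node inherits the /proc/meminfo mount of the node it is copied from and then gets a second one of its own. The function name `duplicate_mount_targets` and all host paths are hypothetical; only the container-side target /proc/meminfo comes from the error message.

```shell
#!/bin/sh
# Print every container-side mount target that occurs more than once.
# Input: one volume spec per line in Docker's "source:target[:mode]" form.
duplicate_mount_targets() {
  awk -F: 'NF >= 2 { seen[$2]++ }
           END { for (t in seen) if (seen[t] > 1) print t }' | sort
}

# Hypothetical volume list of a node copied from a memory-limited node.
# The host-side paths are invented for illustration.
copied_volumes='k3d-k3s-default-images:/k3d/images
/tmp/example/existing-node/meminfo:/proc/meminfo:ro'

# Hypothetical mount added again for the new node itself.
new_volume='/tmp/example/new-node/meminfo:/proc/meminfo:ro'

# Two specs share the target /proc/meminfo, so it is reported once.
printf '%s\n%s\n' "$copied_volumes" "$new_volume" | duplicate_mount_targets
# prints: /proc/meminfo
```

If the copied /proc/meminfo entry were dropped before the new one is added, the function would print nothing, which matches the expectation that the cleanup step should log "Dropping copied volume mount ..." for it.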

Screenshots or terminal output

Working case

% k3d cluster create      
INFO[0000] Prep: Network  
INFO[0000] Created network 'k3d-k3s-default'  
INFO[0000] Created volume 'k3d-k3s-default-images'  
INFO[0000] Starting new tools node...  
INFO[0000] Starting Node 'k3d-k3s-default-tools'  
INFO[0001] Creating node 'k3d-k3s-default-server-0'  
INFO[0001] Creating LoadBalancer 'k3d-k3s-default-serverlb' 
INFO[0001] Using the k3d-tools node to gather environment information 
INFO[0001] HostIP: using network gateway 172.25.0.1 address 
INFO[0001] Starting cluster 'k3s-default'  
INFO[0001] Starting servers...  
INFO[0001] Starting Node 'k3d-k3s-default-server-0'  
INFO[0006] All agents already running.  
INFO[0006] Starting helpers...  
INFO[0006] Starting Node 'k3d-k3s-default-serverlb'  
INFO[0013] Injecting '172.25.0.1 host.k3d.internal' into /etc/hosts of all nodes... 
INFO[0013] Injecting records for host.k3d.internal and for 2 network members into CoreDNS configmap... 
INFO[0014] Cluster 'k3s-default' created successfully!  
INFO[0014] You can now use it like this:  
kubectl cluster-info
% k3d node create works
INFO[0000] Adding 1 node(s) to the runtime local cluster 'k3s-default'... 
INFO[0000] Using the k3d-tools node to gather environment information 
INFO[0000] Starting new tools node...  
INFO[0000] Starting Node 'k3d-k3s-default-tools'  
INFO[0000] HostIP: using network gateway 172.25.0.1 address 
INFO[0000] Starting Node 'k3d-works-0'  
INFO[0009] Successfully created 1 node(s)!

But if we then add one node with a memory limit and try to add another one, it fails (error case 3):

% k3d node create works2 --memory 1024MiB
INFO[0000] Adding 1 node(s) to the runtime local cluster 'k3s-default'... 
INFO[0000] Using the k3d-tools node to gather environment information 
INFO[0000] Starting new tools node...  
INFO[0000] Starting Node 'k3d-k3s-default-tools'  
INFO[0000] HostIP: using network gateway 172.25.0.1 address 
INFO[0001] Starting Node 'k3d-works2-0'  
INFO[0009] Successfully created 1 node(s)!  
% k3d node create fails                    
INFO[0000] Adding 1 node(s) to the runtime local cluster 'k3s-default'... 
INFO[0000] Using the k3d-tools node to gather environment information 
INFO[0000] Starting new tools node...  
INFO[0000] Starting Node 'k3d-k3s-default-tools'  
INFO[0000] HostIP: using network gateway 172.25.0.1 address 
FATA[0001] failed to add 1 node(s) to the runtime local cluster 'k3s-default': failed to add one or more nodes: failed to run node 'k3d-fails-0': failed to create node 'k3d-fails-0': runtime failed to create node 'k3d-fails-0': failed to create container for node 'k3d-fails-0': docker failed to create container 'k3d-fails-0': Error response from daemon: Duplicate mount point: /proc/meminfo 

Or it fails right away if we create a cluster with an agent memory limit (error case 2):

% k3d cluster create --servers-memory=2048MiB --agents=1 --agents-memory=1024MiB
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-k3s-default'
INFO[0000] Created volume 'k3d-k3s-default-images'
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-k3s-default-tools'
INFO[0001] Creating node 'k3d-k3s-default-server-0'
INFO[0001] Creating node 'k3d-k3s-default-agent-0'
INFO[0002] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0002] Using the k3d-tools node to gather environment information
INFO[0002] HostIP: using network gateway 172.27.0.1 address
INFO[0002] Starting cluster 'k3s-default'
INFO[0002] Starting servers...
INFO[0002] Starting Node 'k3d-k3s-default-server-0'
INFO[0007] Starting agents...
INFO[0007] Starting Node 'k3d-k3s-default-agent-0'
INFO[0020] Starting helpers...
INFO[0020] Starting Node 'k3d-k3s-default-serverlb'
INFO[0027] Injecting '172.27.0.1 host.k3d.internal' into /etc/hosts of all nodes...
INFO[0027] Injecting records for host.k3d.internal and for 3 network members into CoreDNS configmap...
INFO[0028] Cluster 'k3s-default' created successfully!
INFO[0028] You can now use it like this:
kubectl cluster-info
% k3d node create fail1
INFO[0000] Adding 1 node(s) to the runtime local cluster 'k3s-default'...
INFO[0000] Using the k3d-tools node to gather environment information
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-k3s-default-tools'
INFO[0000] HostIP: using network gateway 172.27.0.1 address
FATA[0001] failed to add 1 node(s) to the runtime local cluster 'k3s-default': failed to add one or more nodes: failed to run node 'k3d-fail1-0': failed to create node 'k3d-fail1-0': runtime failed to create node 'k3d-fail1-0': failed to create container for node 'k3d-fail1-0': docker failed to create container 'k3d-fail1-0': Error response from daemon: Duplicate mount point: /proc/meminfo
% k3d node create fail2 --memory=1024MiB
INFO[0000] Adding 1 node(s) to the runtime local cluster 'k3s-default'...
INFO[0000] Using the k3d-tools node to gather environment information
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-k3s-default-tools'
INFO[0000] HostIP: using network gateway 172.27.0.1 address
FATA[0001] failed to add 1 node(s) to the runtime local cluster 'k3s-default': failed to add one or more nodes: failed to run node 'k3d-fail2-0': failed to create node 'k3d-fail2-0': runtime failed to create node 'k3d-fail2-0': failed to create container for node 'k3d-fail2-0': docker failed to create container 'k3d-fail2-0': Error response from daemon: Duplicate mount point: /proc/meminfo

Which OS & Architecture

  • Linux / amd64
  • Ubuntu 20.04

Which version of k3d

% k3d version
k3d version v5.2.1
k3s version v1.21.7-k3s1 (default)

Which version of docker

% docker version
Client:
 Version:           20.10.7
 ...
toddnni added the "bug" label on Dec 14, 2021
iwilltry42 added this to the v5.2.3 milestone on Dec 17, 2021
iwilltry42 self-assigned this on Dec 17, 2021

iwilltry42 (Member) commented

Hi @toddnni, thanks for this amazingly well-done bug report :)
It will be fixed in the next patch release, v5.2.3 👍

iwilltry42 added the "DONE" label (issue solved, but not closed yet due to pending release) on Dec 17, 2021
iwilltry42 changed the milestone from v5.2.3 to v5.3.0 on Dec 20, 2021