Skip to content

Commit

Permalink
Add troubleshooting advice when running Quick Start with CAPD
Browse files Browse the repository at this point in the history
  • Loading branch information
oscr committed Jul 20, 2022
1 parent 2fc8bcd commit 690f501
Show file tree
Hide file tree
Showing 4 changed files with 37 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/book/src/introduction.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ Started by the Kubernetes Special Interest Group (SIG) [Cluster Lifecycle](https

## Getting started

* [Quick start](./user/quick-start.md)
* [Quick Start](./user/quick-start.md)
* [Concepts](./user/concepts.md)
* [Developer guide](./developer/guide.md)
* [Contributing](./CONTRIBUTING.md)
Expand Down
2 changes: 1 addition & 1 deletion docs/book/src/tasks/certs/using-custom-certificates.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ Each certificate must be stored in a single secret named one of:
| *[cluster name]***-sa** | Key Pair | openssl genrsa -out tls.key 2048 && openssl rsa -in tls.key -pubout -out tls.crt |


<aside class="note warn">
<aside class="note warning">

<h1>CA Key Age</h1>

Expand Down
6 changes: 5 additions & 1 deletion docs/book/src/user/quick-start.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,11 @@ a target [management cluster] on the selected [infrastructure provider].

**Minimum [kind] supported version**: v0.14.0

Note for macOS users: you may need to [increase the memory available](https://docs.docker.com/docker-for-mac/#resources) for containers (recommend 6Gb for CAPD).
**Help with common issues can be found in the [Troubleshooting Guide](./troubleshooting.md).**

Note for macOS users: you may need to [increase the memory available](https://docs.docker.com/docker-for-mac/#resources) for containers (recommend 6 GB for CAPD).

Note for Linux users: you may need to [increase `ulimit` and `inotify` when using Docker (CAPD)](./troubleshooting.md#cluster-api-with-docker----too-many-open-files).

</aside>

Expand Down
30 changes: 30 additions & 0 deletions docs/book/src/user/troubleshooting.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,36 @@ provisioning might be stuck:
* Run [docker system prune --volumes](https://docs.docker.com/engine/reference/commandline/system_prune/) to prune dangling images, containers, volumes and networks.


## Machines fail to start

In the example below we can see that the Machine `capi-quickstart-6587k-xtvnz` has failed to start. The reason provided is `BootstrapFailed`.

```shell
clusterctl describe cluster capi-quickstart
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/capi-quickstart False Warning ScalingUp 57s Scaling up control plane to 3 replicas (actual 2)
├─ClusterInfrastructure - DockerCluster/capi-quickstart-n5w87 True 110s
├─ControlPlane - KubeadmControlPlane/capi-quickstart-6587k False Warning ScalingUp 57s Scaling up control plane to 3 replicas (actual 2)
│ ├─Machine/capi-quickstart-6587k-fgc6m True 81s
│ └─Machine/capi-quickstart-6587k-xtvnz False Warning BootstrapFailed 52s 1 of 2 completed
└─Workers
└─MachineDeployment/capi-quickstart-md-0-5whtj False Warning WaitingForAvailableMachines 110s Minimum availability requires 3 replicas, current 0 available
└─3 Machines... False Info Bootstrapping 77s See capi-quickstart-md-0-5whtj-5d8c9746c9-f8sw8, capi-quickstart-md-0-5whtj-5d8c9746c9-hzxc2, ...
```

To investigate why a machine fails to start you can inspect the logs using `docker logs <MACHINE-NAME>`. For example:

```shell
docker logs capi-quickstart-6587k-xtvnz
(...)
Failed to create control group inotify object: Too many open files
Failed to allocate manager object: Too many open files
[!!!!!!] Failed to allocate manager object.
Exiting PID 1...
```

To resolve this specfic error please read [Cluster API with Docker - "too many open files"](#cluster-api-with-docker----too-many-open-files).

## Cluster API with Docker - "too many open files"
When creating many nodes using Cluster API and Docker infrastructure, either by creating large Clusters or a number of small Clusters, the OS may run into inotify limits which prevent new nodes from being provisioned.
If the error `Failed to create inotify object: Too many open files` is present in the logs of the Docker Infrastructure provider this limit is being hit.
Expand Down

0 comments on commit 690f501

Please sign in to comment.