diff --git a/src/markdown-pages/add-ons/rook.md b/src/markdown-pages/add-ons/rook.md
index 0e8132ae..9ecc8478 100644
--- a/src/markdown-pages/add-ons/rook.md
+++ b/src/markdown-pages/add-ons/rook.md
@@ -10,12 +10,6 @@ addOn: "rook"
 
 The [Rook](https://rook.io/) add-on creates and manages a Ceph cluster along with a storage class for provisioning PVCs. It also runs the Ceph RGW object store to provide an S3-compatible store in the cluster.
 
-By default the cluster uses the filesystem for storage. Each node in the cluster will have a single OSD backed by a directory in `/opt/replicated/rook`. Nodes with a Ceph Monitor also utilize `/var/lib/rook`.
-
-**Note**: At minimum, 10GB of disk space should be available to `/var/lib/rook` for the Ceph Monitors and other configs. We recommend a separate partition to prevent a disruption in Ceph's operation as a result of `/var` or the root partition running out of space.
-
-**Note**: All disks used for storage in the cluster should be of similar size. A cluster with large discrepancies in disk size may fail to replicate data to all available nodes.
-
 The [EKCO](/docs/add-ons/ekco) add-on is recommended when installing Rook. EKCO is responsible for performing various operations to maintain the health of a Ceph cluster.
 
 ## Advanced Install Options
@@ -36,11 +30,16 @@ flags-table
 
 ## Block Storage
 
-For production clusters, Rook should be configured to use block devices rather than the filesystem.
-Enabling block storage is required with version 1.4.3+. Therefore, the `isBlockStorageEnabled` option will always be set to true when using version 1.4.3+.
-The following spec enables block storage for the Rook add-on and automatically uses disks matching the regex `/sd[b-z]/`.
-Rook will start an OSD for each discovered disk, which could result in multiple OSDs running on a single node.
-Rook will ignore block devices that already have a filesystem on them.
+For Rook versions 1.4.3 and later, block storage is required.
+For Rook versions earlier than 1.4.3, block storage is recommended in production clusters.
+
+You can enable and disable block storage for Rook versions earlier than 1.4.3 with the `isBlockStorageEnabled` field in the kURL spec.
+
+When the `isBlockStorageEnabled` field is set to `true`, or when using Rook versions 1.4.3 and later, Rook starts an OSD for each discovered disk.
+This can result in multiple OSDs running on a single node.
+Rook ignores block devices that already have a filesystem on them.
+
+The following provides an example of a kURL spec with block storage enabled for Rook:
 
 ```yaml
 spec:
@@ -50,8 +49,27 @@ spec:
     blockDeviceFilter: sd[b-z]
 ```
 
-The Rook add-on will wait for a disk before continuing.
-If you have attached a disk to your node but the installer is still waiting at the Rook add-on installation step, refer to the [troubleshooting guide](https://rook.io/docs/rook/v1.0/ceph-common-issues.html#osd-pods-are-not-created-on-my-devices) for help with diagnosing and fixing common issues.
+In the example above, the `isBlockStorageEnabled` field is set to `true`.
+Additionally, `blockDeviceFilter` instructs Rook to use only block devices that match the specified regex.
+For more information about the available options, see [Advanced Install Options](#advanced-install-options) above.
+
+The Rook add-on waits for a disk before continuing with installation.
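+
+For example, you can confirm that the node has an attached disk with no filesystem on it by listing the block devices (a quick check; device names such as `sdb` vary by environment):
+
+```bash
+# List block devices along with any filesystem detected on them.
+# A disk with an empty FSTYPE column (for example, sdb) has no filesystem
+# and is eligible for Rook to use as an OSD.
+lsblk -f
+```
+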
+If you attached a disk to your node, but the installer is waiting at the Rook add-on installation step, see [OSD pods are not created on my devices](https://rook.io/docs/rook/v1.0/ceph-common-issues.html#osd-pods-are-not-created-on-my-devices) in the Rook documentation for troubleshooting information.
+
+## Filesystem Storage
+
+By default, for Rook versions earlier than 1.4.3, the cluster uses the filesystem for Rook storage.
+However, block storage is recommended for Rook in production clusters.
+For more information, see [Block Storage](#block-storage) above.
+
+When using the filesystem for storage, each node in the cluster has a single OSD backed by a directory in `/opt/replicated/rook`.
+Nodes with a Ceph Monitor also use `/var/lib/rook`.
+
+At minimum, 10GB of disk space must be available to `/var/lib/rook` for the Ceph Monitors and other configs.
+We recommend a separate partition to prevent a disruption in Ceph's operation as a result of `/var` or the root partition running out of space.
+
+**Note**: All disks used for storage in the cluster should be of similar size.
+A cluster with large discrepancies in disk size may fail to replicate data to all available nodes.
 
 ## Shared Filesystem
 
diff --git a/src/markdown-pages/install-with-kurl/system-requirements.md b/src/markdown-pages/install-with-kurl/system-requirements.md
index 1b038c21..9dcebf2b 100644
--- a/src/markdown-pages/install-with-kurl/system-requirements.md
+++ b/src/markdown-pages/install-with-kurl/system-requirements.md
@@ -22,8 +22,9 @@ title: "System Requirements"
 
 * 4 AMD64 CPUs or equivalent per machine
 * 8 GB of RAM per machine
-* 40 GB of Disk Space per machine.
-  * **Note**: When [Rook](/docs/add-ons/rook) is enabled, 10GB of the total 40GB should be available to `/var/lib/rook`
+* 40 GB of Disk Space per machine
+* The Rook add-on version 1.4.3 and later requires block storage on each node in the cluster.
+  For more information about how to enable block storage for Rook, see [Block Storage](/docs/add-ons/rook#block-storage) in _Rook Add-On_.
 * TCP ports 2379, 2380, 6443, 10250, 10251 and 10252 open between cluster nodes
   * **Note**: When [Flannel](/docs/add-ons/flannel) is enabled, UDP port 8472 open between cluster nodes
   * **Note**: When [Weave](/docs/add-ons/weave) is enabled, TCP port 6783 and UDP port 6783 and 6784 open between cluster nodes
@@ -38,7 +39,7 @@ For more information see [kURL Advanced Install Options](/docs/install-with-kurl
 ## Networking Requirements
 
 ### Firewall Openings for Online Installations
-The following domains need to be accessible from servers performing online kURL installs. 
+The following domains need to be accessible from servers performing online kURL installs.
 IP addresses for these services can be found in [replicatedhq/ips](https://github.com/replicatedhq/ips/blob/master/ip_addresses.json).
 
 | Host | Description |
@@ -50,9 +51,9 @@ IP addresses for these services can be found in [replicatedhq/ips](https://githu
 No outbound internet access is required for airgapped installations.
 
 ### Host Firewall Rules
-The kURL install script will prompt to disable firewalld. 
+The kURL install script will prompt to disable firewalld.
 Note that firewall rules can affect communications between containers on the **same** machine, so it is recommended to disable these rules entirely for Kubernetes.
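+
+For example, you can check whether firewalld is active on a host and disable it before running the installer (a sketch; the commands assume a systemd-based host and root privileges):
+
+```bash
+# Check whether firewalld is currently running on this host.
+systemctl is-active firewalld
+
+# Stop firewalld and prevent it from starting again on boot.
+systemctl disable --now firewalld
+```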
 
-Firewall rules can be added after or preserved during an install, but because installation parameters like pod and service CIDRs can vary based on local networking conditions, there is no general guidance available on default requirements. 
+Firewall rules can be added after or preserved during an install, but because installation parameters like pod and service CIDRs can vary based on local networking conditions, there is no general guidance available on default requirements. See [Advanced Options](/docs/install-with-kurl/advanced-options) for installer flags that can preserve these rules.
 
 The following ports must be open between nodes for multi-node clusters:
@@ -103,15 +104,15 @@ In addition to the networking requirements described in the previous section, op
 
 ### Control Plane HA
 
-To operate the Kubernetes control plane in HA mode, it is recommended to have a minimum of 3 primary nodes. 
-In the event that one of these nodes becomes unavailable, the remaining two will still be able to function with an etcd quorom. 
+To operate the Kubernetes control plane in HA mode, it is recommended to have a minimum of 3 primary nodes.
+In the event that one of these nodes becomes unavailable, the remaining two will still be able to function with an etcd quorum.
 As the cluster scales, dedicating these primary nodes to control-plane only workloads using the `noSchedule` taint should be considered.
 This will affect the number of nodes that need to be provisioned.
 
 ### Worker Node HA
 
 The number of required secondary nodes is primarily a function of the desired application availability and throughput.
-By default, primary nodes in kURL also run application workloads. 
+By default, primary nodes in kURL also run application workloads.
 At least 2 nodes should be used for data durability for applications that use persistent storage (i.e. databases) deployed in-cluster.
 
 ### Load Balancers
@@ -125,7 +126,7 @@ graph TB
     A -->|Port 6443| D[Primary Node]
 ```
 
-Highly available cluster setups that do not leverage EKCO's [internal load balancing capability](/docs/add-ons/ekco#internal-load-balancer) require a load balancer to route requests to healthy nodes. 
+Highly available cluster setups that do not leverage EKCO's [internal load balancing capability](/docs/add-ons/ekco#internal-load-balancer) require a load balancer to route requests to healthy nodes.
 The following requirements need to be met for load balancers used on the control plane (primary nodes):
 1. The load balancer must be able to route TCP traffic, as opposed to Layer 7/HTTP traffic.
 1. The load balancer must support hairpinning, i.e. nodes referring to eachother through the load balancer IP.
@@ -134,7 +135,7 @@ The following requirements need to be met for load balancers used on the control
 1. The load balancer should target each primary node on port 6443.
 1. In accordance with the above firewall rules, port 6443 should be open on each primary node.
 
-The IP or DNS name and port of the load balancer should be provided as an argument to kURL during the HA setup. 
+The IP or DNS name and port of the load balancer should be provided as an argument to kURL during the HA setup.
 See [Highly Available K8s](/docs/install-with-kurl/#highly-available-k8s-ha) for more install information.
 
 For more information on configuring load balancers in the public cloud for kURL installs see [Public Cloud Load Balancing](/docs/install-with-kurl/public-cloud-load-balancing).
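+
+For example, once the HA setup is complete, a quick way to confirm that the load balancer routes TCP traffic to a healthy Kubernetes API server on port 6443 (including hairpinned requests from the primary nodes themselves) is to query the API server health endpoint through the load balancer. This is a sketch: the address is a placeholder, and it assumes the default configuration, which leaves the health endpoints readable without authentication.
+
+```bash
+# Replace load-balancer.example.com with your load balancer's DNS name or IP.
+# Run this from one of the primary nodes to also exercise hairpinning.
+# Expect the response "ok"; -k skips TLS certificate verification.
+curl -k https://load-balancer.example.com:6443/healthz
+```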