Add CoreDHCP (and Tftp) Container(s) to Quickstart #78

Merged
merged 19 commits into from
Nov 7, 2024
238 changes: 238 additions & 0 deletions quickstart/DHCP.md
@@ -0,0 +1,238 @@
# Deploying CoreDHCP on a "Real" System

<!-- Text width is 80, only use spaces and use 4 spaces instead of tabs -->
<!-- vim: set et sta tw=80 ts=4 sw=4 sts=0: -->

The quickstart rather hastily sets up CoreDHCP with assumed network parameters,
but it would be useful to know how to configure it for an actual system. This
document serves to walk through how to configure CoreDHCP with the
[coresmd](https://github.com/OpenCHAMI/coresmd) plugins.

## Purpose

SMD is meant to be the source of truth for nodes/BMCs in the cluster, so the
goal of CoreDHCP + coresmd is to match MAC addresses requesting an IP address to
interfaces stored in SMD and serve the matching IP address. However, for
unknown MAC addresses to become known to SMD, they need to be added, for
example, by network discovery tools like
[Magellan](https://github.com/OpenCHAMI/magellan). To be discoverable at the
network layer, coresmd provides functionality for providing unknown MAC
addresses with temporary IP addresses so they can be discovered. Once they are
discovered and added to SMD, they can get a more "permanent" IP address from
coresmd.

## Methodology

Coresmd differentiates between *known* MAC addresses (handled by the `coresmd`
plugin itself) and *unknown* MAC addresses (handled by the `bootloop` plugin or
by CoreDHCP's `file` plugin, depending on whether a fixed MAC-to-IP mapping is
needed). The general flow for a device getting a long-term IP address from
scratch is as follows:

1. Unknown MAC gets assigned an IP with a short lease.
    - This can be an available IP from a pool (`bootloop` plugin) or a fixed IP
      (`file` plugin).
    - If left unknown, device will continually request a new IP and get a
      short-lived one until the MAC becomes known.
1. MAC with short-leased IP gets added to SMD.
    - This happens outside the scope of DHCP.
    - How this happens can depend on the device type:
        - **BMC:** Using Magellan. **NOTE:** Modern versions of Magellan add
          node interfaces to SMD if discovered via Redfish.
        - **Node:** POSTing to SMD using `curl` or the Ochami CLI tool.
1. Known MAC gets assigned the IP assigned to it in SMD once the short-leased IP
   address expires, but with a longer lease time.
    - This happens via the `coresmd` plugin itself.

The first step in the above can be handled by either coresmd's `bootloop` plugin
or CoreDHCP's `file` plugin, or via a combination of both. The next two sections
describe the uses for these plugins and how they work while the section after
describes how the `coresmd` plugin itself works.
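
As a rough illustration of the flow above, the following Python sketch tries
the three plugins in order for a single MAC address. It is not coresmd code:
the function name `assign`, the lookup tables, and the lease constants are all
hypothetical and exist only to show the ordering.

```python
# Hypothetical sketch of the known/unknown MAC flow; not coresmd's real code.

LONG_LEASE = "1h"    # assumed lease for MACs known to SMD
SHORT_LEASE = "10m"  # assumed lease for MACs unknown to SMD


def assign(mac, smd_ips, file_ips, free_pool):
    """Return (ip, lease, plugin) for a MAC, or None if nothing can be offered.

    smd_ips:   MAC -> IP for interfaces known to SMD (coresmd plugin)
    file_ips:  MAC -> IP from the static mapping file (file plugin)
    free_pool: list of unassigned temporary IPs (bootloop plugin)
    """
    mac = mac.lower()
    if mac in smd_ips:
        # Known MAC: long-lived address taken from SMD.
        return smd_ips[mac], LONG_LEASE, "coresmd"
    if mac in file_ips:
        # Unknown MAC with a fixed temporary address.
        return file_ips[mac], SHORT_LEASE, "file"
    if free_pool:
        # Unknown MAC with no fixed address: take any free pool address.
        return free_pool[0], SHORT_LEASE, "bootloop"
    return None
```

A MAC that starts out in neither table receives a pool address with the short
lease; once it is added to `smd_ips`, the next call returns the SMD address
with the long lease.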

### Unknown MAC Addresses: The `file` Plugin

This plugin is used when it *does* matter which MAC address gets which IP
address. It is paired with CoreDHCP's `lease_time` directive to set how long the
temporary IPs should last. This plugin is maintained by CoreDHCP.

The `file` plugin is pretty simple: it hands out the IP address assigned to the
MAC address sending the DHCPDISCOVER and renews this IP once it expires.

### Unknown MAC Addresses: The `bootloop` Plugin

This plugin is used when it does not matter which MAC address gets which IP
address. Often, this is as a catch-all for MAC addresses not in an assignment
list, e.g. MAC addresses not caught by the `file` plugin above.

As stated, the `bootloop` plugin is designed to assign available IPs from a pool
to unknown MAC addresses without the guarantee that specific IP addresses get
assigned to certain MAC addresses. While this plugin works with any device that
speaks DHCP, important behavioral differences are present between devices that
are able to network boot (e.g. ethernet interfaces on a node) and devices that
are not able to network boot (e.g. BMCs). The difference is how requests to
renew IP addresses are handled. When devices that can boot try to renew their
IP address, they are served an iPXE script that reboots them so they are forced
to renew their IP address. When devices that cannot boot try to renew their IP
address, their request is responded to with a DHCPNAK, which, according to [RFC
2131](https://datatracker.ietf.org/doc/html/rfc2131#section-3.2), causes the
device to reinitiate the entire DHCP handshake.

Technically, all DHCPDISCOVERs from MAC addresses that haven't been assigned an
IP address are responded to with a DHCPOFFER with the temporary IP address and
the rebooting iPXE script. Devices that can boot execute this iPXE script while
devices that cannot do not. So, when a non-booting device tries to renew this IP
address with a DHCPREQUEST, the response is a DHCPNAK so that it will send a
DHCPDISCOVER.
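
The exchange described above can be modeled with a small state machine. The
Python sketch below is a toy model under stated assumptions, not the plugin's
implementation: the class name `BootloopSketch`, the in-memory dictionaries
(the real plugin stores assignments in a database file), and the choice to
return an address to the pool on DHCPNAK are all hypothetical.

```python
import ipaddress


class BootloopSketch:
    """Toy model of the bootloop behavior described above (not coresmd code)."""

    def __init__(self, pool_start, pool_end):
        first = int(ipaddress.IPv4Address(pool_start))
        last = int(ipaddress.IPv4Address(pool_end))
        # Inclusive pool of free temporary addresses.
        self.free = [str(ipaddress.IPv4Address(n)) for n in range(first, last + 1)]
        self.offered = {}  # MAC -> IP offered but not yet acknowledged
        self.bound = {}    # MAC -> IP acknowledged with a short lease

    def handle(self, msg_type, mac):
        """Return (reply_type, ip) for a DISCOVER or REQUEST from mac."""
        if msg_type == "DISCOVER":
            if mac in self.bound:
                # Device restarted the handshake: offer the same address again.
                self.offered[mac] = self.bound.pop(mac)
            if mac not in self.offered:
                if not self.free:
                    return ("NONE", None)  # pool exhausted, no reply
                self.offered[mac] = self.free.pop(0)
            # The real OFFER also carries the rebooting iPXE script.
            return ("OFFER", self.offered[mac])
        if msg_type == "REQUEST":
            if mac in self.offered:
                # First REQUEST after an OFFER: confirm the temporary IP.
                self.bound[mac] = self.offered.pop(mac)
                return ("ACK", self.bound[mac])
            # Renewal attempt: NAK so the device sends a new DISCOVER.
            ip = self.bound.pop(mac, None)
            if ip is not None:
                self.free.append(ip)  # assumption: address goes back to the pool
            return ("NAK", None)
        raise ValueError(f"unsupported message type: {msg_type}")
```

In this model a non-booting device cycles through DISCOVER, REQUEST, and a
NAKed renewal until its MAC becomes known to SMD, at which point the `coresmd`
plugin would answer instead.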

### Known MAC Addresses: The `coresmd` Plugin

This plugin is used to assign IP addresses based on data in SMD.

A cache in memory is maintained containing SMD Component and EthernetInterface
data, which is refreshed at a configured interval. The refresh runs in a
separate goroutine.

When a DHCP request reaches the plugin, it checks the cache for 1) whether the
MAC address exists as an EthernetInterface, 2) whether there is an IP address
for this interface, and 3) whether there is a corresponding Component for this
interface. If all three checks pass, the IP address corresponding to the
EthernetInterface structure is assigned to the device. This could be a node NIC
or a BMC.
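
The three checks can be sketched in Python as follows. This is only an
illustration: the cache layout and the field names `ip` and `component_id` are
assumptions made for the example and do not reflect SMD's actual schema or the
plugin's Go data structures.

```python
def lookup_ip(eth_interfaces, components, mac):
    """Return the IP to serve for mac, or None if any of the checks fails.

    eth_interfaces: MAC -> {"ip": str, "component_id": str} (assumed layout)
    components:     set of component IDs present in the cache (assumed layout)
    """
    iface = eth_interfaces.get(mac.lower())
    if iface is None:
        return None  # check 1: MAC is not a known EthernetInterface
    ip = iface.get("ip")
    if not ip:
        return None  # check 2: interface has no IP address
    if iface.get("component_id") not in components:
        return None  # check 3: no corresponding Component
    return ip
```

A `None` result corresponds to the request being passed on to the next plugin
in the chain.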

## Preparation

### (REQUIRED) TFTP

Since CoreDHCP does not include a TFTP server or plugin (as far as is known at
this writing), one is required that contains the following files at the TFTP
root:

- **ipxe.efi** --- UEFI iPXE bootloader for amd64 systems
- **undionly.kpxe** --- Legacy bootloader for x86-based systems
- **reboot.ipxe** --- The reboot iPXE script which contains:
  ```ipxe
  #!ipxe
  reboot
  ```

### (OPTIONAL) File for `file` Plugin

If using the `file` plugin, you will need a plaintext file that contains the
MAC-to-IP mapping. For example:

```
de:ca:fc:0f:fe:ee 172.16.0.101
de:ad:be:ee:ee:ef 172.16.0.102
```
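
Before deploying a mapping file, it can help to sanity-check it. The following
Python sketch parses the format shown above and rejects malformed lines. The
function name `parse_mapping` is hypothetical, and skipping blank lines and
`#` comments is an assumption of this sketch rather than documented plugin
behavior.

```python
import ipaddress
import re

# Colon-separated MAC address, lowercase after normalization.
MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def parse_mapping(text):
    """Parse 'MAC IP' lines into a dict, raising ValueError on bad input."""
    mapping = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 'MAC IP', got {raw!r}")
        mac, ip = fields[0].lower(), fields[1]
        if not MAC_RE.match(mac):
            raise ValueError(f"line {lineno}: invalid MAC address {mac!r}")
        # ip_address() raises ValueError for malformed addresses.
        mapping[mac] = str(ipaddress.ip_address(ip))
    return mapping
```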

## Writing a Configuration File

The configuration file is YAML-formatted. The general format is:

```yaml
server4:
  plugins:
    - plugin1: arg1 arg2
    - plugin2: arg1 arg2
    ...
```

... where `plugin1` and `plugin2` are plugin names in the plugin list, each
followed by space-separated arguments.

### Part 1: Server Configuration

The first part of this file should be plugins that configure basic server
settings, such as the IP of the DHCP server and optional DNS servers. These
settings should be configured *before* the coresmd configuration, since CoreDHCP
sends DHCP packets to be processed sequentially, *in order*, through these
plugins.

Let's look at an example server configuration:

```yaml
server4:
  plugins:
    - server_id: 172.16.0.253
    - dns: 1.1.1.1 8.8.8.8
    - router: 172.16.0.254
    - netmask: 255.255.255.0
```

- **server_id:** (*REQUIRED*) This is the "identity" of the DHCP server to
  distinguish it from any other servers that might be listening on the same
  network. Usually this is just the IP address the server is listening on.
- **dns:** (*OPTIONAL*) A space-separated list of DNS servers to use for names
  and domains.
- **router:** (*REQUIRED*) The IP address of the network gateway for routing
  packets. This can be the same as the IP address CoreDHCP is listening on if
  that machine acts as a gateway.
- **netmask:** (*OPTIONAL*) The network mask used with IP addresses served by
  the `file` and `bootloop` plugins, if used. This is not needed if one is
  *only* using the `coresmd` plugin.

### Part 2: CoreSMD Configuration

The next part of the configuration file corresponds to the place where any of
the coresmd/file/bootloop plugins are configured. These need to be *below* the
server config above.

```yaml
server4:
  plugins:
    ...
    - coresmd: https://foobar.openchami.cluster http://172.16.0.253:8081 /root_ca/root_ca.crt 30s 1h
    - lease_time: 10m
    - file: /etc/coredhcp/hostsfile
    - bootloop: /tmp/coredhcp.db 5m 172.16.0.156 172.16.0.200
```

- **coresmd:** (*REQUIRED*) Check if MAC address in request matches any
  component in SMD. Pass request through if not.

  Arguments:
    - **SMD Base URI:** (*https://foobar.openchami.cluster*) Base URI for where
      SMD is listening (usually behind API proxy), usually with TLS enabled.
    - **Boot Script Base URI:** (*http://172.16.0.253:8081*) Base URI for where
      BSS is listening to fetch boot scripts from, usually *without* TLS. This
      is a separate argument because chances are that the CA certificate is not
      baked into the iPXE bootloader and thus cannot perform proper certificate
      validation.
    - **Path to CA Certificate:** (*/root_ca/root_ca.crt*) Path to certificate
      authority certificate for validation of connections to SMD.
    - **Cache Update Interval:** (*30s*) Amount of time in between cache
      refreshes.[^intervals]
    - **Known Device Lease Duration:** (*1h*) Amount of time a *known* device's
      IP is valid for.[^intervals]
- **lease_time:** (*OPTIONAL*) Assign lease time for *unknown* nodes. This is
  *required* if using the `file` or `bootloop` plugins.

  Arguments:
    - **Unknown Device Lease Duration:** (*10m*) Amount of time an *unknown*
      device's IP is valid for.[^intervals]
- **file:** (*OPTIONAL*) Assign specific IP addresses to specific MAC addresses
  based on mapping in file. Typically, this comes right after `coresmd` since
  some MACs that are unknown to SMD need to be assigned a specific IP address.
  If the MAC isn't in the list, it gets passed to the "catch-all" `bootloop`
  plugin below.

  Arguments:
    - **Map File Path:** (*/etc/coredhcp/hostsfile*) Path to text file that maps
      MAC addresses to IP addresses.
- **bootloop:** (*OPTIONAL*) Assign available IP addresses from a pool to
  unknown MAC addresses. This is normally the last plugin in the file because
  it is usually used as a catch-all: the MAC was not known by SMD and was not
  listed in the map file.

  Arguments:
    - **Storage DB Path:** (*/tmp/coredhcp.db*) Path to sqlite3 file used for
      storing IP addresses that have been assigned. This file need not exist and
      will be created by the plugin upon initialization (assuming permissions
      are correct!).
    - **Short Lease Duration:** (*5m*) Amount of time an IP address handed out
      from the pool is valid for.[^intervals]
    - **Pool Start IP:** (*172.16.0.156*) Starting IP address (inclusive) of the
      pool of available IP addresses to hand out.
    - **Pool End IP:** (*172.16.0.200*) Ending IP address (inclusive) of the
      pool of available IP addresses to hand out.
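
Since the pool bounds are inclusive and must sit on the same network as the
server, a quick check with Python's standard `ipaddress` module can catch
mistakes before CoreDHCP is started. This is a minimal sketch; the helper name
`check_pool` is hypothetical and is not part of CoreDHCP or coresmd.

```python
import ipaddress


def check_pool(server_id, netmask, pool_start, pool_end):
    """Validate a bootloop pool and return how many addresses it contains."""
    # strict=False lets a host address (the server's) define its network.
    network = ipaddress.IPv4Network(f"{server_id}/{netmask}", strict=False)
    first = ipaddress.IPv4Address(pool_start)
    last = ipaddress.IPv4Address(pool_end)
    if first > last:
        raise ValueError("pool start is after pool end")
    if first not in network or last not in network:
        raise ValueError(f"pool is not inside {network}")
    # Both ends are inclusive, hence the +1.
    return int(last) - int(first) + 1
```

With the example values from this section,
`check_pool("172.16.0.253", "255.255.255.0", "172.16.0.156", "172.16.0.200")`
returns 45.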

[^intervals]: Interval strings are parsed via Go's
[time.ParseDuration](https://pkg.go.dev/time#ParseDuration) function. Check
there for valid strings.
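
To get a feel for what the interval strings used in this document mean, the
Python sketch below converts strings such as `30s`, `10m`, and `1h` to seconds.
It deliberately handles only whole-number `h`, `m`, and `s` components; Go's
`time.ParseDuration` accepts more (for example fractions and sub-second units),
so this is an approximation for illustration, not a reimplementation. The name
`to_seconds` is hypothetical.

```python
import re

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text):
    """Convert strings like '30s', '10m', or '1h30m' to a number of seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input with anything left over, e.g. '10ms' or '5 minutes'.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unsupported interval string: {text!r}")
    return sum(int(num) * _UNIT_SECONDS[unit] for num, unit in parts)
```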
32 changes: 15 additions & 17 deletions quickstart/README.md
@@ -12,6 +12,17 @@ This quickstart makes a few assumptions about the target operating system and is
* x86_64 - Some of the containers involved are built and tested for alternative operating systems and architectures, but the solution as a whole is only tested with x86 containers
* Dedicated System - The docker compose setup assumes that it can take control of several TCP ports and interact with the host network for DHCP and TFTP. It is tested on a dedicated virtual machine
* Local Name Registration - The quickstart bootstraps a Certificate Authority and issues an SSL certificate with a predictable name. For access, you will need to add that name/IP to /etc/hosts on all clients or make it resolvable through your site DNS
* DHCP Network Configuration:
  * Server/Gateway IP Address: __192.168.0.254__
  * Range of IPs for Unknown MAC Addresses: __192.168.0.150__ to __192.168.0.253__ (inclusive)
    * These IPs are given to MAC addresses unknown to SMD with a short lease time.
  * Network Mask: __255.255.255.0__
  * DNS Servers: __1.1.1.1, 8.8.8.8__
  * Duration to renew information from SMD: __30 seconds__
  * "Long" lease time: __1 hour__
    * This duration is used in leases for devices known by SMD.
  * "Short" lease time: __5 minutes__
    * This duration is used in leases for devices unknown by SMD.

## Start Here

@@ -22,25 +33,20 @@ This quickstart makes a few assumptions about the target operating system and is
```
1. Create the secrets file and choose a name for your system. We use `foobar` in our example.
- __Note__ The certificates for the system use the name you provide in this file. It's not easy to change.
- __Note__ The script attempts to figure out which ip address is most likely to be your system ip. If it is unsuccessful, `LOCAL_IP=` will be empty and you'll need to update it manually
- __Note__ The full url will be https://foobar.openchami.cluster which you should set manually in /etc/hosts and point to the same ip address as `LOCAL_IP` in `.env`.
- __Note__ The `generate-configs.sh` script accepts an optional second argument that allows a custom domain to be specified. If not specified, it defaults to "openchami.cluster", which is used throughout these instructions.
- __Note__ The full url will be https://foobar.openchami.cluster which you should set manually in /etc/hosts and point to the same ip address as `LOCAL_IP` in `.env`, which should be 192.168.0.254.

```bash
# Create the secrets in the .env file. Do not share them with anyone.
./generate-configs.sh foobar
# Confirm that LOCAL_IP has a value and matches what you want the interface to OpenCHAMI to be. We do our best to guess what your primary interface is.
grep LOCAL_IP .env
./generate-configs.sh
```
If you have problems with this step, check to make sure that the main IP address of your host is in `.env` as `LOCAL_IP`.
1. Update your /etc/hosts to point your system name to your local ip (this is important for valid certs)
1. Update your /etc/hosts to point `foobar.openchami.cluster` to 192.168.0.254 (this is important for valid certs).
1. Start the main services
```bash
docker compose -f base.yml -f postgres.yml -f jwt-security.yml -f haproxy-api-gateway.yml -f openchami-svcs.yml -f autocert.yml up -d
docker compose -f base.yml -f postgres.yml -f jwt-security.yml -f haproxy-api-gateway.yml -f openchami-svcs.yml -f autocert.yml -f tftp.yml -f coredhcp.yml up -d
```
__If this step produces an error like: `Error response from daemon: invalid IP address in add-host: ""` it means you're missing the LOCAL_IP in step 2.__
You can fix it by destroying everything, editing `.env` manually and starting over. The command to destroy is the same as the command to create, just replace `up -d` with `down --volumes`

1. Use the running system to download your certs and create your access token(s)
```bash
# Assuming you're using bash as your shell, you can use the included functions to simplify interactions with your new OpenCHAMI system.
@@ -54,14 +60,6 @@ This quickstart makes a few assumptions about the target operating system and is
curl --cacert cacert.pem -H "Authorization: Bearer $ACCESS_TOKEN" https://foobar.openchami.cluster/hsm/v2/State/Components
# This should respond with an empty set of Components: {"Components":[]}
```
1. Create a token that can be used by the dnsmasq-loader which reads from smd. This activates our automatic dns/dhcp system. The command automatically adds it to .env
```bash
echo "DNSMASQ_ACCESS_TOKEN=$(gen_access_token)" >> .env
```
1. Use docker-compose to bring up your dnsmasq contianers. The only difference between this command and the one above is the addition of the `dnsmasq.yml` file. Docker compose needs to know about all the files to follow dependencies.
```bash
docker compose -f base.yml -f postgres.yml -f jwt-security.yml -f haproxy-api-gateway.yml -f openchami-svcs.yml -f autocert.yml -f dnsmasq.yml up -d
```


## cloud-init Server Setup
1 change: 1 addition & 0 deletions quickstart/configs/.gitignore
@@ -1 +1,2 @@
opaal.yaml
coredhcp.yaml
25 changes: 25 additions & 0 deletions quickstart/configs/coredhcp-template.yaml
@@ -0,0 +1,25 @@
server4:
  plugins:
    #
    # Base CoreDHCP config
    #

    - server_id: 192.168.0.254
    - dns: 1.1.1.1 8.8.8.8
    - router: 192.168.0.254
    - netmask: 255.255.255.0

    #
    # CoreSMD config
    #

    # Args: ochami_base_url boot_script_base_url ca_cert_path cache_update_interval long_lease_time
    - coresmd: <BASE_URL> http://192.168.0.254:8081 /root_ca/root_ca.crt 30s 1h

    # Optionally include the file plugin here if it matters which IPs get assigned to which
    # MACs. Otherwise, unknown MACs get passed to the bootloop "catch-all" plugin below.
    #
    #- file: /etc/coredhcp/hostsfile

    # Args: storage_path short_lease_time ip_pool_start ip_pool_end
    - bootloop: /tmp/coredhcp.db 5m 192.168.0.150 192.168.0.253
22 changes: 22 additions & 0 deletions quickstart/coredhcp.yml
@@ -0,0 +1,22 @@
services:
  coredhcp:
    image: ghcr.io/openchami/coresmd:v0.0.5
    container_name: coredhcp
    hostname: coredhcp
    network_mode: host
    cap_add:
      - NET_ADMIN
    volumes:
      - ./configs/coredhcp.yaml:/etc/coredhcp/config.yaml:ro
      - step-root-ca:/root_ca/:ro
    command:
      - "-L"
      - "debug"
    healthcheck:
      test: pgrep coredhcp
      interval: 5s
      timeout: 10s
      retries: 60
    depends_on:
      smd:
        condition: service_healthy