-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Container fixes #826
base: main
Are you sure you want to change the base?
Container fixes #826
Conversation
b19064e
to
5afa473
Compare
I have only looked at documentation changes (container.md and networking.md). Looks great! Found two typos as commented above. |
A sane interface name is at least two characters long, and in Linux the interface name (using ip link) is at most 15 characthers long. Signed-off-by: Joachim Wiberg <[email protected]>
Signed-off-by: Joachim Wiberg <[email protected]>
Container support in Infix was released with v24.02, so this change may unfortunately break a few use-cases out there. Regrettable as this is, the default behavior, including how containers are started after boot, break other use-cases that were considered more important. As of this commit: - all containers in Infix run in read-only mode, use volumes and mounts for persistence across reboot/stop/start/upgrade - all containers are now "recreated" at boot or related config changes, this ensures an OCI image embedded in the Infix image, /lib/oci/, is always used as the base for a running container Fixes #823 Signed-off-by: Joachim Wiberg <[email protected]>
On unclean shutdowns Frr leaves a lot of per-thread message buffers in /var/tmp/frr/<daemon>[-<instance>].<pid>/* See https://docs.frrouting.org/en/latest/setup.html Signed-off-by: Joachim Wiberg <[email protected]>
Running 'shred' on files stored on eMMC is pointless since the writes are spread out over other sectors rather than overwriting the content of the files as it was supposed to on old rotating media. Signed-off-by: Joachim Wiberg <[email protected]>
To be able to handle container restarts, incl. restart policy, at runtime, most of the container data lives in /var/lib/containers, which on most systems is backed by a persistent store. As of issue #823 we no longer keep a writable layer for containers, nor should we cache container state across reboots, all containers are recreated at boot. This task cleans up any lingering state. Signed-off-by: Joachim Wiberg <[email protected]>
- Reduce the amount of queues: 3 -> 1 - Simplify post hook - Refine execd The resulting simplification of infix_containers_post_hook(), and touching execd, also ensure container environment variable changes are propagated. Fixes #822 Signed-off-by: Joachim Wiberg <[email protected]>
- Anonymous FTP, or URL encoded ftp://user:hostname@addr/oci.tar.gz - HTTP/HTTPS fetched with curl, optional credentials support - Verify download against an optional sha256 checksum Ensure the unpacked directory name does not contain a ':', it is a restricted character and cannot be part of the file name. If this syntax is used we retain it as the name and retag it after load. Fix #801 Signed-off-by: Joachim Wiberg <[email protected]>
Issue #815 detail issues found running the Clixon Controllar and Cisco Yangsuite. The errors and warnings listed are very similar to pyang, which the undersigned has, the following changes fixes the pyang errors: - relocate 'feature containers' to submodule - drop already deviated ospf:database deviations - drop unused imports Signed-off-by: Joachim Wiberg <[email protected]>
Should be inverted to a --verbose or --debug flag instead. After this change we still see the full 'podman create ...' command, with all the optionas and arguments. Signed-off-by: Joachim Wiberg <[email protected]>
When an Infix device is connected to a LAN where the gateway has yet to connect to the Internet, the container script will fail pulling images from any remote server. Nov 16 12:48:13 infix container[3490]: Error: initializing source docker://ghcr.io/kernelkit/curios:edge: pinging container registry ghcr.io: Get "https://ghcr.io/v2/": dial tcp: lookup ghcr.io on 127.0.0.1:53: read udp 127.0.0.1:55422->127.0.0.1:53: i/o timeout Nov 16 12:48:13 infix container[3641]: Error: failed creating container fw, please check the configuration. Nov 16 12:48:13 infix execd[3490]: /run/containers/queue/S01-fw.sh failed, exit code: 1 Since execd until now only retries on netlink/inotify events, or manual SIGUSR1, jobs would get stuck even though Internet connectivity had been established. This patch fixes that with the addition of a retry timer which runs while there are pending jobs in the queue. Signed-off-by: Joachim Wiberg <[email protected]>
Disable the default "podman pull" retry value. We use execd to retry "podman create" on failure. Wihtout this change, a single container can block start of other containers by 3 * 20 seconds. Now we only block max 20 seconds before we try starting the next container. Modern versions of podman (>= 5.0) have this --retry option, but it does not have CNI, so this is a temporary workaround. Signed-off-by: Joachim Wiberg <[email protected]>
Instead of using $HOME, which may be a ramdisk, use /var/tmp which podman also uses by default. Also, make sure to clean up after ourselves. Signed-off-by: Joachim Wiberg <[email protected]>
This patch allows running the configure script manually to create and delete containers. The normal flow via confd has additional handling to ensure containers are started/stopped on inictl reload. Signed-off-by: Joachim Wiberg <[email protected]>
Aptly named to match the same command in the CLI used to edit binary YANG leaves, and also an example of how to script these types of settings. Signed-off-by: Joachim Wiberg <[email protected]>
This patch adds latest symlinks to the curiOS containers to make system upgrades easier. I.e., a user can now reference the bundled image with: set image oci-archive:/lib/oci/curios-httpd-latest.tar.gz So that when they upgrade to the latest Infix, which might include an update of curiOS httpd, they will get a seamless upgrade also of the container(s) running. Signed-off-by: Joachim Wiberg <[email protected]>
Signed-off-by: Joachim Wiberg <[email protected]>
Signed-off-by: Joachim Wiberg <[email protected]>
- No more default writable layer - Don't mention read-only (deprecated, and always on now) - Use ghfm note highlights Signed-off-by: Joachim Wiberg <[email protected]>
Signed-off-by: Joachim Wiberg <[email protected]>
Signed-off-by: Joachim Wiberg <[email protected]>
2b82a48
to
1014d1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HUGE refactor, great work! 🚀
I just have some small comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mightily impressed! Really great work!
🎉 🎉 🎉
leaf checksum { | ||
description "Checksum for ftp/http/https OCI archives (sha256sum)"; | ||
type string; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a pattern
to tighten up the allowed inputs?
If we call the leaf sha256sum
, we could more easily add other algorithms in the future (potentially we could wrap it in a choice
as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, that would be better, I'll have a look!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal:
container checksum {
description "Checksum for ftp/http/https OCI archives.";
choice checksum {
leaf md5 {
description "MD5 checksum for the archive.";
type string {
length "32";
pattern "[a-fA-F0-9]{32}";
}
}
leaf sha256 {
description "SHA256 checksum for the archive.";
type string {
length "64";
pattern "[a-fA-F0-9]{64}";
}
}
leaf sha512 {
description "SHA512 checksum for the archive.";
type string {
length "128";
pattern "[a-fA-F0-9]{128}";
}
}
}
}
Just learned that
case
is optional in YANG 🤯
This will get a CLI session like this:
admin@example:/config/container/web> set image ftp://host/archive.tar.gz
admin@example:/config/container/web> set checksum sha256 asdsadsadsadsadsa
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! That looks great 👍
|
||
xpath=$1 | ||
if [ -z "$xpath" ]; then | ||
echo "Usage: text-editor \"/full/xpath/to/binary/leaf\"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
text-editor
seems a bit too generic for what this does, no? sr-bin-edit
? Even in a future where admin/exec is implemented in bash
, users would still use something like today's CLI for the config context, right? There text-editor
can be YANG-type-aware version, i.e., handle binaries, strings, lists etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, was aiming for recognition between CLI and shell, but you're right it's too generic in this case. (Unlike, e.g., the copy
command for instance.)
Not sure about the name sr-bin-edit
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some discussions in another forum we decided on the following:
- Convert the
cfg
alias into a shell script, - Add verb cases in a case/esac for operations like:
edit
: to edit binary nodesimport/export
: to workaround sysrepocfg's oddities in the option parsercopy
: between datastores
- with a default case that falls back to the old
exec sysrepocfg -f json "$@"
at every boot. It can be extended with writable volumes that survive | ||
both container and host system restarts, and upgrades. Another option | ||
is [Content Mounts](#content-mounts), for truly persistent content. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"survive" tripped my swinglish-detector. Maybe something like:
at every boot. When non-volatile storage is needed, data stored in volumes will be persisted until explicitly removed from the configuration, i.e., across host and container reboots and upgrades.
Re: content mounts: are they really "truly persistent"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"survive" tripped my swinglish-detector. Maybe something like:
at every boot. When non-volatile storage is needed, data stored in volumes will be persisted until explicitly removed from the configuration, i.e., across host and container reboots and upgrades.
Much better, thanks!
Re: content mounts: are they really "truly persistent"?
Well it's the content data stored in startup-config
... ? But I can try to find a better wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this:
The second creates a read-only container that is automatically started
at every boot. When non-volatile storage is needed, data stored in a
volume is persisted until explicitly removed from the configuration,
i.e., across host and container reboots and upgrades.Another option is Content Mounts, where the content
of a file mounted into the container is kept along with the container
configuration in the device'sstartup-config
.
Description
execd
with a timer callback while the queue is non-emptycontainer
helper script, retainpodman create
commandNote
Everyone, some helpful highlighting (like this one!) has been added to
containers.md
andnetworking.md
, please review. See available highlights here -> https://github.com/orgs/community/discussions/16925Checklist
Tick relevant boxes, this PR is-a or has-a: