-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sled-agent] NTP zone config set up via zone-setup CLI #5440
[sled-agent] NTP zone config set up via zone-setup CLI #5440
Conversation
illumos-utils/src/chronyd.rs
Outdated
pub fn start_daemon(file: &str) -> Result<(), ExecutionError> { | ||
let mut cmd = std::process::Command::new(PFEXEC); | ||
// TODO: This doesn't seem to be working. I think `execute()` | ||
// doesn't like the "&", and it immediately exits after running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The &
is a shell construct, but here you're not using a shell, you're just making arguments to pass to an execve(2) call ultimately. It seems like the original method script is doing this because we're using the -d
flag for chrony, which has the positive effect of using stdout for logging (which we want) but the negative effect of then not daemonising itself.
What I would do here is use the exec() method on Rust's Command object, to replace this process with chrony. I would then change the service duration to child for this service manifest, so that SMF expects we'll keep running forever without detaching.
Also no pfexec should be necessary, we can arrange the correct set of credentials and privileges in the SMF service's method context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option, particularly since I think it makes sense to keep the method script for the stop and refresh methods so they can properly manage the contract using the framework from /lib/svc/share/smf_include.sh
, is to make a new chrony setup service that does the configuration, and make the existing chrony service dependant on that. That could all still be done within the one manifest - an example of a manifest with multiple service instances is https://github.com/omniosorg/cloud-init/blob/illumos/smf/cloud-init.xml; in this case all of the instances share the stop and refresh methods, but that isn't mandatory.
This would possibly be better from a privilege separation perspective too. The current service manifest starts the service as the root user, but without full root privileges - it actually starts by applying the basic
set, removing some, and then adding a few which are required for the NTP daemon. These aren't needed for the setup script which effectively just needs to write to a couple of files - for example it definitely doesn't need/shouldn't have sys_time
.
<exec_method type="method" name="start" exec="/var/svc/method/svc-site-ntp %m"
timeout_seconds="60">
<method_context security_flags="aslr">
<method_credential user="root" group="root" privileges="basic,!file_link_any,!proc_info,!proc_session,file_chown_self,file_dac_search,file_dac_write,net_privaddr,proc_lock_memory,proc_priocntl,proc_setid,sys_time" />
</method_context>
</exec_method>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks both! 🙇♀️
Another option, particularly since I think it makes sense to keep the method script for the stop and refresh methods so they can properly manage the contract using the framework from /lib/svc/share/smf_include.sh, is to make a new chrony setup service that does the configuration, and make the existing chrony service dependant on that.
Hmm, this makes sense. I'm assuming that I will the be able to safely remove generate_config_file
from the refresh method and it will automatically generate the config file as the ntp service now depends on the chrony setup service?
refresh)
generate_config_file && stop_daemon
# SMF will restart the service since the contract is now empty.
;;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving configuration file generation to a separate transient duration service makes sense to me. I am unclear on the value of the refresh method here, such as it is. If you're just going to kill the program, just restart the service? Then you probably don't need any of the shell at all, I suspect you could just execute chrony directly with the child duration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I looked at the cloud-init service bundle and it's probably not exactly the example I would follow here. The configuration generation and the actual time daemon are notionally different services, rather than instances of the same service. If you're going to co-locate them in a bundle they should be separate services at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhhh!!! gotcha. @citrus-it, what do you think? Ha! looks like we both answered at the same time 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @citrus-it !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, in the end it was not from OmniOS - there there is just a simple kill. Same in OpenIndiana.
I don't remember why I added this in the service delivered from omicron, I presumably came across an instance where using kill as the stop method put the service into maintenance with an escaped process. It's usually to work around the race between killing the contract and the contract member forking - such as happens frequently with sendmail. I don't think chrony
forks much once it's up and running though, especially as we don't configure it to invoke notification scripts. In fact we remove the ability to fork/exec at all from one of its processes.
Long story short, we can probably lose the method script entirely and just have:
<exec_method type="method"
name="start"
exec="/usr/sbin/chronyd -d &"
timeout_seconds="600">
...
<exec_method type="method"
name="stop"
exec=":kill"
timeout_seconds="60" />
and a dependency on the new chrony-configure
service that is responsible for setting things up before chrony starts.
The refresh method was originally for forcing a regeneration of config files and restarting the daemon, after the control plane had updated the various config/
properties after a change, and issued a refresh.
This isn't wired up today, but it would still be good to arrange things so that this can work as expected, because we will one day support operators changing the upstream NTP servers etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation @citrus-it 🙇♀️! It's interesting to learn more about Illumos.
Long story short, we can probably lose the method script entirely and just have:
sweet! I'll adapt the NTP service and corresponding new chrony-setup service then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the help both! I've set the PR as ready to review :)
I've tested locally on a Helios machine: chrony-setup log
Chrony generated config
chrony.logadm.conf permissions
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is now zone-setup/src/bin/zone-setup.rs. Nothing really changed apart from adding the new chrony-setup command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @karencfv, this is looking really clean and I think splitting the services like this is going to work well.
I have left a few suggestions and comments, hopefully they are useful and make sense.
<service_fmri value='svc:/milestone/multi-user:default' /> | ||
</dependency> | ||
|
||
<exec_method type='method' name='start' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but you could choose to reduce privileges here to just what is needed for the service. It's only writing out some config files so there are a lot it doesn't need. I'm happy to help with that if you want to consider it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Not sure which permissions these would be though. I'm guessing file_chown_self
, file_dac_search
and file_dac_write
; any others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ppriv
utility has a handy debugging mode to show you which privileges are missing.
root@oxz_ntp_cdc8cbd0:~# ppriv -De chown root:sys /etc/logadm.d/chrony.logadm.conf
chown[641328]: missing privilege "file_chown" (euid = 0, syscall = 16) needed at secpolicy_vnode_chown+0x33
chown: /etc/logadm.d/chrony.logadm.conf: Not owner
So we need at least basic,file_chown
. Since /etc/inet
and /etc/inet/chrony*
are owned by root, that might actually be all.
The _dac_
privileges allow access to things you don't own.
Basic expands out to a few too, so:
root@oxz_ntp_cdc8cbd0:~# ppriv $$
641141: -bash
flags = PRIV_AWARE
E: basic,file_chown
I: basic,file_chown
is actually:
root@oxz_ntp_cdc8cbd0:~# ppriv -v $$
641141: -bash
flags = PRIV_AWARE
E: basic_test,file_chown,file_link_any,file_read,file_write,net_access,proc_exec,proc_fork,proc_info,proc_secflags,proc_session
I: basic_test,file_chown,file_link_any,file_read,file_write,net_access,proc_exec,proc_fork,proc_info,proc_secflags,proc_session
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for doing this part!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As always, thanks for the detailed information. That utility is really handy!
<exec_method type='method' name='start' | ||
exec='/opt/oxide/zone-setup-cli/bin/zone-setup chrony-setup -b %{config/boundary} -s %{config/server} -a %{config/allow}' | ||
timeout_seconds='0' /> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ties into my comment on the dependency's restart_on
value below.
What is the eventual plan for the control plane dynamically reconfiguring this service? We will at some point allow an operator to set a new config/server at least through the API.
If, for example, the control plane will update the config/server
property, and then issue a refresh
of this service, the refresh method should also run the same command as start to regenerate the files. In this case, I would set the restart_on
property of the chrony-setup
dependency in the ntp:default
service to refresh
(see comment below), so that a refresh of the chrony-setup service triggers a restart of chrony.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll change the file
zone-setup/src/bin/zone-setup.rs
Outdated
.arg( | ||
arg!(-f --file <String> "Chrony configuration file") | ||
.default_value(CHRONY_CONFIG_FILE) | ||
.value_parser(parse_chrony_conf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it looks like indentation has changed here, and below for -b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ha! looks like cargo fmt
can't be trusted fully 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It gets extremely confused by macros with artisanal syntax like those from the clap crate.
zone-setup/src/bin/zone-setup.rs
Outdated
Ok(()) | ||
} | ||
|
||
async fn set_permissions_for_logadm_config() -> Result<(), CmdError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be an async function. Same for some of the others below but I won't comment on all of them.
zone-setup/src/bin/zone-setup.rs
Outdated
|
||
if old_file.clone().is_some_and(|f| f != new_config) { | ||
info!(&log, "Chrony configuration file has changed"; | ||
"old configuration file" => ?old_file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add new_config
to this log statement? It will be useful if we ever need to go back through logs and look at what was written by earlier service starts.
zone-setup/src/bin/zone-setup.rs
Outdated
let mut contents = contents.replace( | ||
"@ALLOW@", | ||
&allow.expect("Chrony allowed address not supplied").to_string(), | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd perhaps do this above this block, not just for boundaries. I know that the @ALLOW@
token is only in the boundary template today, but we should probably treat all variables as usable in all templates.
smf/ntp/manifest/manifest.xml
Outdated
|
||
<exec_method type="method" name="start" | ||
exec='/usr/sbin/chronyd -d &' | ||
timeout_seconds="600"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the 600 came from me (I copied it over from OmniOS) but I think it's excessive
(I've opened omniosorg/omnios-build#3542 over there to reduce it down to 60).
The history of it is probably from older NTP services (pre-chrony) that did a one-off system clock set before starting the daemon, and therefore could take a while to complete. Chrony should never take this long and the service failing fairly promptly if it is not going to come up is generally desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB: Because you're using the ampersand, the timeout is almost irrelevant; the shell will put it in the background and exit immediately, whether it started correctly or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ha! good to know
zone-setup/src/bin/zone-setup.rs
Outdated
)) | ||
})?; | ||
|
||
chown(LOGADM_CONFIG_FILE, Some(0), Some(3)).map_err(|err| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know these UIDs shouldn't change, but you should ideally look up root
and sys
One option is { git = "https://github.com/papertigers/rust-users", branch = "illumos" }
as the users
crate appears to be abandoned - a PR to add illumos support hasn't been integrated despite being opened over two years ago.
You could also fork this to the oxidecomputer org (Cc @papertigers) and use it from there.
build% rg users Cargo.toml
9:users = { git = "https://github.com/papertigers/rust-users", branch = "illumos", version = "0.11.0" }
build% cat src/main.rs
fn main() {
println!("{:?}", users::get_user_by_name("root"));
println!("{:?}", users::get_group_by_name("sys"));
}
build% cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.05s
Running `target/debug/usertest`
Some(User(0, root))
Some(Group(3, sys))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gather there is at least one new crate, e.g., uzers, which is a fork with more recent updates. Not sure if that's considered the best current one in the ecosystem or if there is another more prominent one. It's possible that the nix crate is also fine here, though I don't think I've used it myself.
contents | ||
} else { | ||
for s in servers { | ||
let str_line = format!("server {} iburst minpoll 0 maxpoll 4\n", s); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed feelings about splitting this out to here as now some of the configuration values are in the template and some are in this code.
The original script used the appropriate @SERVER@
line from the template and then replicated it for each configured server, which kept everything in one place and the templates were effectively complete.
Would you consider putting the lines back in the template and extracting them here to use as the pattern for each server/pool line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking a bit about this one. I don't love tweaking a template's contents other than replacing values. What do you think about removing the templates altogether and creating the files entirely through the CLI like I've done for these lines?
Or is it convention in Illumos to have templates in the /etc/inet directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's an OS thing, per se, just something we were doing for our (Oxide's) NTP stuff earlier. I think it's probably fine to do either of generate the whole file in this program and just write it out, or use the template completely except for the values like @SERVER@
that we cannot know in advance. The half-way state is probably what isn't as ideal.
Co-authored-by: Andy Fiddaman <[email protected]>
Thanks for the through review @citrus-it 🙇♀️ ! I'll address the comments shortly. I'll be making the journey back to NZ soon, so it may take a few days to update the PR. |
Thanks for the reviews and help @jclulow @citrus-it ! I think I've addressed all comments. For now I've used the "uzers" crate to retrieve uid and gid. It seems fine to me, but if anyone has a preferred crate I should use please let me know! I've tested the changes in a Helios machine: chrony-setup logroot@oxz_ntp_6b1c83e7:~# cat /var/svc/log/oxide-chrony-setup:default.log | looker
[ Apr 18 04:54:34 Enabled. ]
[ Apr 18 04:54:34 Rereading configuration. ]
[ Apr 18 04:54:34 Rereading configuration. ]
[ Apr 18 04:54:35 Executing start method ("/opt/oxide/zone-setup-cli/bin/zone-setup chrony-setup -b true -s 0.pool.ntp.org -a fd00:1122:3344:100::/56"). ]
note: configured to log to "/dev/stderr"
servers: ["0.pool.ntp.org"]
file: /etc/inet/chrony.conf
allow: Some("fd00:1122:3344:100::/56")
boundary: true
04:54:35.408Z INFO zone-setup: Chrony configuration file has changed
new configuration file = "#\\n# Configuration file for a boundary NTP server - one which communicates with\\n# NTP servers outside the rack.\\n#\\n\\ndriftfile /var/lib/chrony/drift\\nntsdumpdir /var/lib/chrony\\ndumpdir /var/lib/chrony\\npidfile /var/run/chrony/chronyd.pid\\nlogdir /var/log/chrony\\n\\nlog measurements statistics tracking\\n\\nallow fe80::/10\\nallow fd00:1122:3344:100::/56\\n\\n# Enable local reference mode, which keeps us operating as an NTP server that\\n# appears synchronised even if there are currently no active upstreams. When\\n# in this mode, we report as stratum 10 to clients.\\nlocal stratum 10\\n\\n# makestep <threshold> <limit>\\n# We allow chrony to step the system clock during the first three time updates\\n# if we are more than 0.1 seconds out.\\nmakestep 0.1 3\\n\\n# When a leap second occurs we slew the clock over approximately 37 seconds.\\nleapsecmode slew\\nmaxslewrate 2708.333\\n\\npool 0.pool.ntp.org iburst maxdelay 0.1 maxsources 16\\n"
old configuration file = Some("### COMMENTS\\n# Any of the following lines are comments (you have a choice of\\n# comment start character):\\n# a comment\\n% a comment\\n! a comment\\n; a comment\\n#\\n# Below, the '!' form is used for lines that you might want to\\n# uncomment and edit to make your own chrony.conf file.\\n#\\n#######################################################################\\n#######################################################################\\n### SPECIFY YOUR NTP SERVERS\\n# Most computers using chrony will send measurement requests to one or\\n# more 'NTP servers'. You will probably find that your Internet Service\\n# Provider or company have one or more NTP servers that you can specify.\\n# Failing that, there are a lot of public NTP servers. There is a list\\n# you can access at http://support.ntp.org/bin/view/Servers/WebHome or\\n# you can use servers from the pool.ntp.org project.\\n\\n! server foo.example.net iburst\\n! server bar.example.net iburst\\n! server baz.example.net iburst\\n\\npool 0.omnios.pool.ntp.org iburst\\n\\n#######################################################################\\n### AVOIDING POTENTIALLY BOGUS CHANGES TO YOUR CLOCK\\n#\\n# To avoid changes being made to your computer's gain/loss compensation\\n# when the measurement history is too erratic, you might want to enable\\n# one of the following lines. The first seems good with servers on the\\n# Internet, the second seems OK for a LAN environment.\\n\\n! maxupdateskew 100\\n! maxupdateskew 5\\n\\n# If you want to increase the minimum number of selectable sources\\n# required to update the system clock in order to make the\\n# synchronisation more reliable, uncomment (and edit) the following\\n# line.\\n\\n! minsources 2\\n\\n# If your computer has a good stable clock (e.g. it is not a virtual\\n# machine), you might also want to reduce the maximum assumed drift\\n# (frequency error) of the clock (the value is specified in ppm).\\n\\n! maxdrift 100\\n\\n# By default, chronyd allows synchronisation to an unauthenticated NTP\\n# source (i.e. specified without the nts and key options) if it agrees with\\n# a majority of authenticated NTP sources, or if no authenticated source is\\n# specified. If you don't want chronyd to ever synchronise to an\\n# unauthenticated NTP source, uncomment the first from the following lines.\\n# If you don't want to synchronise to an unauthenticated NTP source only\\n# when an authenticated source is specified, uncomment the second line.\\n# If you want chronyd to ignore authentication in the source selection,\\n# uncomment the third line.\\n\\n! authselectmode require\\n! authselectmode prefer\\n! authselectmode ignore\\n\\n#######################################################################\\n### FILENAMES ETC\\n# Chrony likes to keep information about your computer's clock in files.\\n# The 'driftfile' stores the computer's clock gain/loss rate in parts\\n# per million. When chronyd starts, the system clock can be tuned\\n# immediately so that it doesn't gain or lose any more time. You\\n# generally want this, so it is uncommented.\\n\\ndriftfile /var/lib/chrony/drift\\n\\n# If you want to enable NTP authentication with symmetric keys, you will need\\n# to uncomment the following line and edit the file to set up the keys.\\n\\n! keyfile /etc/inet/chrony.keys\\n\\n# If you specify an NTP server with the nts option to enable authentication\\n# with the Network Time Security (NTS) mechanism, or enable server NTS with\\n# the ntsservercert and ntsserverkey directives below, the following line will\\n# allow the client/server to save the NTS keys and cookies in order to reduce\\n# the number of key establishments (NTS-KE sessions).\\n\\nntsdumpdir /var/lib/chrony\\n\\n# If chronyd is configured to act as an NTP server and you want to enable NTS\\n# for its clients, you will need a TLS certificate and private key. Uncomment\\n# and edit the following lines to specify the locations of the certificate and\\n# key.\\n\\n! ntsservercert /etc/.../foo.example.net.crt\\n! ntsserverkey /etc/.../foo.example.net.key\\n\\n# chronyd can save the measurement history for the servers to files when\\n# it exits. This is useful:\\n#\\n# 1. If you stop chronyd and restart it with the '-r' option (e.g. after\\n# an upgrade), the old measurements will still be relevant when chronyd\\n# is restarted. This will reduce the time needed to get accurate\\n# gain/loss measurements.\\n#\\n# Uncomment the following line to use this.\\n\\n! dumpdir /var/lib/chrony\\n\\n# chronyd writes its process ID to a file. If you try to start a second\\n# copy of chronyd, it will detect that the process named in the file is\\n# still running and bail out. If you want to change the path to the PID\\n# file, uncomment this line and edit it. The default path is shown.\\n\\npidfile /var/run/chrony/chronyd.pid\\n\\n# If the system timezone database is kept up to date and includes the\\n# right/UTC timezone, chronyd can use it to determine the current\\n# TAI-UTC offset and when will the next leap second occur.\\n\\n! leapsectz right/UTC\\n\\n# This directive specifies the location of the Samba ntp_signd socket\\n# when it is running as a Domain Controller (DC). If chronyd is\\n# compiled with this feature, responses to MS-SNTP clients will be\\n# signed by the smbd daemon.\\n\\n! ntpsigndsocket /var/lib/samba/ntp_signd\\n\\n#######################################################################\\n### INITIAL CLOCK CORRECTION\\n# This option is useful to quickly correct the clock on start if it's\\n# off by a large amount. The value '1.0' means that if the error is less\\n# than 1 second, it will be gradually removed by speeding up or slowing\\n# down your computer's clock until it is correct. If the error is above\\n# 1 second, an immediate time jump will be applied to correct it. The\\n# value '3' means the step is allowed only in the first three updates of\\n# the clock. Some software can get upset if the system clock jumps\\n# (especially backwards), so be careful!\\n\\n! makestep 1.0 3\\n\\n#######################################################################\\n### LEAP SECONDS\\n# A leap second is an occasional one-second correction of the UTC\\n# time scale. By default, chronyd tells the kernel to insert/delete\\n# the leap second, which makes a backward/forward step to correct the\\n# clock for it. As with the makestep directive, this jump can upset\\n# some applications. If you prefer chronyd to make a gradual\\n# correction, causing the clock to be off for a longer time, uncomment\\n# the following line.\\n\\n! leapsecmode slew\\n\\n#######################################################################\\n### LOGGING\\n# If you want to log information about the time measurements chronyd has\\n# gathered, you might want to enable the following lines. You probably\\n# only need this if you really enjoy looking at the logs, you want to\\n# produce some graphs of your system's timekeeping performance, or you\\n# need help in debugging a problem.\\n\\n! logdir /var/log/chrony\\n! log measurements statistics tracking\\n\\n# If you have real time clock support enabled (see below), you might want\\n# this line instead:\\n\\n! log measurements statistics tracking rtc\\n\\n#######################################################################\\n### ACTING AS AN NTP SERVER\\n# You might want the computer to be an NTP server for other computers.\\n#\\n# By default, chronyd does not allow any clients to access it. You need\\n# to explicitly enable access using 'allow' and 'deny' directives.\\n#\\n# e.g. to enable client access from the 192.168.*.* class B subnet,\\n\\n! allow 192.168/16\\n\\n# .. but disallow the 192.168.100.* subnet of that,\\n\\n! deny 192.168.100/24\\n\\n# You can have as many allow and deny directives as you need. The order\\n# is unimportant.\\n\\n# If you want to present your computer's time for others to synchronise\\n# with, even if you don't seem to be synchronised to any NTP servers\\n# yourself, enable the following line. The value 10 may be varied\\n# between 1 and 15. You should avoid small values because you will look\\n# like a real NTP server. The value 10 means that you appear to be 10\\n# NTP 'hops' away from an authoritative source (atomic clock, GPS\\n# receiver, radio clock etc).\\n\\n! local stratum 10\\n\\n# Normally, chronyd will keep track of how many times each client\\n# machine accesses it. The information can be accessed by the 'clients'\\n# command of chronyc. You can disable this facility by uncommenting the\\n# following line. This will save a bit of memory if you have many\\n# clients and it will also disable support for the interleaved mode.\\n\\n! noclientlog\\n\\n# The clientlog size is limited to 512KB by default. If you have many\\n# clients, you might want to increase the limit.\\n\\n! clientloglimit 4194304\\n\\n# By default, chronyd tries to respond to all valid NTP requests from\\n# allowed addresses. If you want to limit the response rate for NTP\\n# clients that are sending requests too frequently, uncomment and edit\\n# the following line.\\n\\n! ratelimit interval 3 burst 8\\n\\n#######################################################################\\n### REPORTING BIG CLOCK CHANGES\\n# Perhaps you want to know if chronyd suddenly detects any large error\\n# in your computer's clock. This might indicate a fault or a problem\\n# with the server(s) you are using, for example.\\n#\\n# The next option causes a message to be written to syslog when chronyd\\n# has to correct an error above 0.5 seconds (you can use any amount you\\n# like).\\n\\n! logchange 0.5\\n\\n# The next option will send email to the named person when chronyd has\\n# to correct an error above 0.5 seconds. (If you need to send mail to\\n# several people, you need to set up a mailing list or sendmail alias\\n# for them and use the address of that.)\\n\\n! mailonchange [email protected] 0.5\\n\\n#######################################################################\\n### COMMAND ACCESS\\n# The program chronyc is used to show the current operation of chronyd\\n# and to change parts of its configuration whilst it is running.\\n\\n# By default chronyd binds to the loopback interface. Uncomment the\\n# following lines to allow receiving command packets from remote hosts.\\n\\n! bindcmdaddress 0.0.0.0\\n! bindcmdaddress ::\\n\\n# Normally, chronyd will only allow connections from chronyc on the same\\n# machine as itself. This is for security. If you have a subnet\\n# 192.168.*.* and you want to be able to use chronyc from any machine on\\n# it, you could uncomment the following line. (Edit this to your own\\n# situation.)\\n\\n! cmdallow 192.168/16\\n\\n# You can add as many 'cmdallow' and 'cmddeny' lines as you like. The\\n# syntax and meaning is the same as for 'allow' and 'deny', except that\\n# 'cmdallow' and 'cmddeny' control access to the chronyd's command port.\\n\\n# Rate limiting can be enabled also for command packets. (Note,\\n# commands from localhost are never limited.)\\n\\n! cmdratelimit interval -4 burst 16\\n\\n\\n")
04:54:35.408Z INFO zone-setup: Setting mode 444 and root:sys ownership to logadm fragment file
logadm config = "/etc/logadm.d/chrony.logadm.conf"
04:54:35.408Z INFO zone-setup: Updating logadm
logadm config = "/etc/logadm.d/chrony.logadm.conf"
[ Apr 18 04:54:35 Method "start" exited with status 0. ] Chrony generated configroot@oxz_ntp_6b1c83e7:~# cat /etc/inet/chrony.conf
#
# Configuration file for a boundary NTP server - one which communicates with
# NTP servers outside the rack.
#
driftfile /var/lib/chrony/drift
ntsdumpdir /var/lib/chrony
dumpdir /var/lib/chrony
pidfile /var/run/chrony/chronyd.pid
logdir /var/log/chrony
log measurements statistics tracking
allow fe80::/10
allow fd00:1122:3344:100::/56
# Enable local reference mode, which keeps us operating as an NTP server that
# appears synchronised even if there are currently no active upstreams. When
# in this mode, we report as stratum 10 to clients.
local stratum 10
# makestep <threshold> <limit>
# We allow chrony to step the system clock during the first three time updates
# if we are more than 0.1 seconds out.
makestep 0.1 3
# When a leap second occurs we slew the clock over approximately 37 seconds.
leapsecmode slew
maxslewrate 2708.333
pool 0.pool.ntp.org iburst maxdelay 0.1 maxsources 16 chrony.logadm.conf permissionsroot@oxz_ntp_6b1c83e7:~# ls -l /etc/logadm.d/chrony.logadm.conf
-r--r--r-- 1 root sys 135 Apr 18 04:54 /etc/logadm.d/chrony.logadm.conf Running command as if it was an internal NTP zoneroot@oxz_ntp_6b1c83e7:~# /opt/oxide/zone-setup-cli/bin/zone-setup chrony-setup -b false -s some-internal-ntp-server -a
note: configured to log to "/dev/stderr"
servers: ["some-internal-ntp-server"]
file: /etc/inet/chrony.conf
allow: None
boundary: false
{"msg":"Chrony configuration file has changed","v":0,"name":"zone-setup","level":30,"time":"2024-04-18T05:07:29.378769822Z","hostname":"oxz_ntp_6b1c83e7-0b27-4d79-b402-6df2bc977334","pid":27408,"new configuration file":"\"#\\n# Configuration file for an internal NTP server - one which communicates with\\n# boundary NTP servers within the rack.\\n#\\n\\ndriftfile /var/lib/chrony/drift\\nntsdumpdir /var/lib/chrony\\ndumpdir /var/lib/chrony\\npidfile /var/run/chrony/chronyd.pid\\nlogdir /var/log/chrony\\n\\nlog measurements statistics tracking\\n\\n# makestep <threshold> <limit>\\n# We allow chrony to step the system clock if we are more than a day out,\\n# regardless of how many clock updates have occurred since boot.\\n# The boundary NTP servers are configured with local reference mode, which\\n# means that if they start up without external connectivity, they will appear\\n# as authoritative servers even if they are advertising January 1987\\n# (which is the default system clock on a gimlet after boot).\\n# This configuration allows a one-off adjustment once RSS begins and the\\n# boundary servers are synchronised, after which the clock will advance\\n# monotonically forwards.\\nmakestep 86400 -1\\n\\n# When a leap second occurs we slew the clock over approximately 37 seconds.\\nleapsecmode slew\\nmaxslewrate 2708.333\\n\\nserver some-internal-ntp-server iburst minpoll 0 maxpoll 4\\n\"","old configuration file":"Some(\"#\\n# Configuration file for a boundary NTP server - one which communicates with\\n# NTP servers outside the rack.\\n#\\n\\ndriftfile /var/lib/chrony/drift\\nntsdumpdir /var/lib/chrony\\ndumpdir /var/lib/chrony\\npidfile /var/run/chrony/chronyd.pid\\nlogdir /var/log/chrony\\n\\nlog measurements statistics tracking\\n\\nallow fe80::/10\\nallow fd00:1122:3344:100::/56\\n\\n# Enable local reference mode, which keeps us operating as an NTP server that\\n# appears synchronised even if there are currently no active upstreams. When\\n# in this mode, we report as stratum 10 to clients.\\nlocal stratum 10\\n\\n# makestep <threshold> <limit>\\n# We allow chrony to step the system clock during the first three time updates\\n# if we are more than 0.1 seconds out.\\nmakestep 0.1 3\\n\\n# When a leap second occurs we slew the clock over approximately 37 seconds.\\nleapsecmode slew\\nmaxslewrate 2708.333\\n\\npool 0.pool.ntp.org iburst maxdelay 0.1 maxsources 16\\n\")"}
{"msg":"Setting mode 444 and root:sys ownership to logadm fragment file","v":0,"name":"zone-setup","level":30,"time":"2024-04-18T05:07:29.38603525Z","hostname":"oxz_ntp_6b1c83e7-0b27-4d79-b402-6df2bc977334","pid":27408,"logadm config":"\"/etc/logadm.d/chrony.logadm.conf\""}
{"msg":"Updating logadm","v":0,"name":"zone-setup","level":30,"time":"2024-04-18T05:07:29.386051601Z","hostname":"oxz_ntp_6b1c83e7-0b27-4d79-b402-6df2bc977334","pid":27408,"logadm config":"\"/etc/logadm.d/chrony.logadm.conf\""} Chrony generated internal NTP configroot@oxz_ntp_6b1c83e7:~# cat /etc/inet/chrony.conf
#
# Configuration file for an internal NTP server - one which communicates with
# boundary NTP servers within the rack.
#
driftfile /var/lib/chrony/drift
ntsdumpdir /var/lib/chrony
dumpdir /var/lib/chrony
pidfile /var/run/chrony/chronyd.pid
logdir /var/log/chrony
log measurements statistics tracking
# makestep <threshold> <limit>
# We allow chrony to step the system clock if we are more than a day out,
# regardless of how many clock updates have occurred since boot.
# The boundary NTP servers are configured with local reference mode, which
# means that if they start up without external connectivity, they will appear
# as authoritative servers even if they are advertising January 1987
# (which is the default system clock on a gimlet after boot).
# This configuration allows a one-off adjustment once RSS begins and the
# boundary servers are synchronised, after which the clock will advance
# monotonically forwards.
makestep 86400 -1
# When a leap second occurs we slew the clock over approximately 37 seconds.
leapsecmode slew
maxslewrate 2708.333
server some-internal-ntp-server iburst minpoll 0 maxpoll 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for the changes and consolidating everything in one file.
illumos-utils/src/zone.rs
Outdated
@@ -17,6 +17,7 @@ use crate::dladm::{EtherstubVnic, VNIC_PREFIX_BOOTSTRAP, VNIC_PREFIX_CONTROL}; | |||
use crate::{execute, PFEXEC}; | |||
use omicron_common::address::SLED_PREFIX; | |||
|
|||
pub const CHRONYD: &str = "/usr/sbin/chronyd"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this line any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! no, I'll update
<service_fmri value='svc:/milestone/multi-user:default' /> | ||
</dependency> | ||
|
||
<exec_method type='method' name='start' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for doing this part!
Overview
This PR repurposes the zone-network CLI into a zone-setup CLI, in order to remove as many zone start-up scripts as possible. This is also in preparation to use this zone-setup CLI with the self assembling switch zone.
Related: #1898
Todo