[sled-agent] Self assembling switch zone #5593

Merged
merged 60 commits into from
Jul 22, 2024
Changes from 3 commits
Commits
d0a1123
Update the zone setup CLI to take several static addresses during com…
karencfv Apr 22, 2024
444e9fb
Set MGS service
karencfv Apr 22, 2024
b82297b
Set Wicket service (almost)
karencfv Apr 22, 2024
9378f6f
Merge branch 'main' into switch-zone-self-assembling
karencfv Apr 22, 2024
b83a46c
switch zone setup
karencfv Apr 23, 2024
5062a38
Set up wicket and support users via zone-setup CLI
karencfv Apr 23, 2024
68029f1
switch setup commands
karencfv Apr 24, 2024
be0f793
modify zone setup start method
karencfv Apr 29, 2024
223fd48
Set dendrite service
karencfv Apr 29, 2024
c233daa
Set tfport service
karencfv Apr 29, 2024
11dd38d
Set lldpd service
karencfv Apr 29, 2024
80c3724
Set pumpkind service
karencfv Apr 29, 2024
ead54fb
Set up Mgd
karencfv Apr 30, 2024
04b5037
Set up mg ddm service
karencfv Apr 30, 2024
4873d47
set up link local links during switch zone set up
karencfv May 1, 2024
a5d4dc1
Set up switch zone networking configuration
karencfv May 1, 2024
ac5bbbb
Clean up
karencfv May 2, 2024
c792fc7
Switch zone set up should not depend on common networking set up
karencfv May 3, 2024
89ac45a
Fix manifest and some fixes for the switch zone set up command
karencfv May 7, 2024
95ca659
Get bootstrap address working
karencfv May 9, 2024
73f0d04
enable all services from the start
karencfv May 10, 2024
5c6d64d
It works! 😭
karencfv May 10, 2024
29b1da4
all services successful on a4x2 testbed
karencfv May 15, 2024
be38d08
restart switch services' instances after updating properties
karencfv May 16, 2024
26137dc
refresh is enough
karencfv May 16, 2024
0f9f1d3
Merge branch 'main' into switch-zone-self-assembling
karencfv Jun 6, 2024
64b8e09
small fixes after merge
karencfv Jun 6, 2024
181bebe
Add sp-sim to PropertyGroupBuilder
karencfv Jun 6, 2024
91204f9
Update dendrite hashes
karencfv Jun 10, 2024
f509c43
First round of clean up
karencfv Jun 10, 2024
0d1ef88
fmt
karencfv Jun 10, 2024
845b07a
Clean up bootstrap related code
karencfv Jun 11, 2024
0fcd764
Clean up switch zone setup
karencfv Jun 11, 2024
ab75734
extract switch zone user
karencfv Jun 11, 2024
dafe3da
fmt and add files
karencfv Jun 11, 2024
2ea71b8
Add logs from PR #5853
karencfv Jun 11, 2024
06e1614
Tidy up
karencfv Jun 12, 2024
6221c39
remove switch zone setup bash script entirely
karencfv Jun 12, 2024
19bd61c
Verify ensure default route loop
karencfv Jun 12, 2024
1bd8e85
Remove more commented out code
karencfv Jun 12, 2024
d2f7ae0
Remove wicket service's dependency on common networking service
karencfv Jun 12, 2024
30eb1ee
Make sure to set all properties on instance FMRI
karencfv Jun 13, 2024
7753318
fix typo
karencfv Jun 16, 2024
86bef4d
Remove unecessary clone
karencfv Jun 17, 2024
ebb6fce
Clean up
karencfv Jun 17, 2024
70be78e
Address comment
karencfv Jun 17, 2024
45678c3
Adding SP sim to property builder isn't necessary and is cleaner on a…
karencfv Jun 18, 2024
17c4931
Modify start for when underlay is not available yet
karencfv Jun 20, 2024
2f14648
Add default route during second run of setting property values
karencfv Jun 25, 2024
9499512
Add some logging
karencfv Jun 26, 2024
5a1d23e
Set bootstrap address and link local in switch zone setup service
karencfv Jun 26, 2024
e3d40fa
Clean up
karencfv Jun 26, 2024
ab6ddf3
Include forwarding bootstrap traffic to switch zone start up service
karencfv Jun 27, 2024
207162b
Clean up
karencfv Jun 27, 2024
36c95ca
Clean up --gateway flag
karencfv Jun 27, 2024
fae95de
noop commit
karencfv Jun 27, 2024
2153e5c
Merge main into switch-zone-self-assembling
karencfv Jul 15, 2024
2f7b313
Update Dendrite hashes
karencfv Jul 15, 2024
1048969
Merge branch 'main' into switch-zone-self-assembling
karencfv Jul 19, 2024
4a269b3
update to latest dendrite commit
karencfv Jul 21, 2024
435 changes: 310 additions & 125 deletions sled-agent/src/services.rs

Large diffs are not rendered by default.

7 changes: 6 additions & 1 deletion smf/mgs/manifest.xml
@@ -4,13 +4,18 @@
<service_bundle type='manifest' name='mgs'>

<service name='oxide/mgs' type='service' version='1'>
<create_default_instance enabled='false' />
<create_default_instance enabled='true' />

<dependency name='multi_user' grouping='require_all' restart_on='none'
type='service'>
<service_fmri value='svc:/milestone/multi-user:default' />
</dependency>

<dependency name='zone_network_setup' grouping='require_all' restart_on='none'
type='service'>
<service_fmri value='svc:/oxide/zone-network-setup:default' />
</dependency>

Comment on lines +14 to +18
Contributor Author

I am unsure whether this dependency is necessary or harmful. Could someone with more knowledge about MGS confirm this either way?

Contributor Author

This dependency worked fine when deploying on Madrid. Is this enough testing to guarantee that it'll be fine on a real rack also?

<!--
Most omicron services run their binary under `ctrun` because they spawn
child processes that should be killed if the service is killed. However,
6 changes: 6 additions & 0 deletions smf/switch_zone_setup/manifest.xml
@@ -11,10 +11,16 @@
<service_fmri value='svc:/milestone/multi-user:default' />
</dependency>

<dependency name='zone_network_setup' grouping='require_all' restart_on='none'
type='service'>
<service_fmri value='svc:/oxide/zone-network-setup:default' />
</dependency>

<exec_method type='method' name='start' exec='/opt/oxide/bin/switch_zone_setup' timeout_seconds='300' />
<exec_method type='method' name='stop' exec=':true' timeout_seconds='3' />

<property_group name='startd' type='framework'>
<!-- TODO: Add propval for baseboard information -->
<propval name='duration' type='astring' value='transient' />
</property_group>

3 changes: 3 additions & 0 deletions smf/switch_zone_setup/switch_zone_setup
@@ -46,4 +46,7 @@ for i in "${!USERS[@]}"; do
fi
done

# TODO: Call /opt/oxide/zone-setup-cli/bin/zone-setup wicket-setup -b BASEBOARD_INFO here
# Eventually we'll want all of the above code to be part of the zone-setup CLI as well

exit $SMF_EXIT_OK
Empty file added smf/wicketd/baseboard.json
Empty file.
17 changes: 16 additions & 1 deletion smf/wicketd/manifest.xml
@@ -4,13 +4,23 @@
<service_bundle type='manifest' name='wicketd'>

<service name='oxide/wicketd' type='service' version='1'>
<create_default_instance enabled='false' />
<create_default_instance enabled='true' />

<dependency name='multi_user' grouping='require_all' restart_on='none'
type='service'>
<service_fmri value='svc:/milestone/multi-user:default' />
</dependency>

<dependency name='zone_network_setup' grouping='require_all' restart_on='none'
type='service'>
<service_fmri value='svc:/oxide/zone-network-setup:default' />
</dependency>

<dependency name='switch_zone_setup' grouping='require_all' restart_on='none'
type='service'>
<service_fmri value='svc:/oxide/switch_zone_setup:default' />
</dependency>

<exec_method type='method' name='start'
exec='ctrun -l child -o noorphan,regent /opt/oxide/wicketd/bin/wicketd run /var/svc/manifest/site/wicketd/config.toml --address %{config/address} --artifact-address %{config/artifact-address} --mgs-address %{config/mgs-address} --nexus-proxy-address %{config/nexus-proxy-address} --baseboard-file %{config/baseboard-file} --read-smf-config &amp;'
timeout_seconds='0' />
@@ -45,6 +55,11 @@
<propval name='mgs-address' type='astring' value='unknown' />
<propval name='nexus-proxy-address' type='astring' value='unknown' />
<propval name='baseboard-file' type='astring' value='unknown' />
<!--
TODO: Remove this baseboard info and send it to
switch_zone_setup service instead?
-->
<propval name='baseboard-info' type='astring' value='unknown' />

<!--
In a standard deployment, this will remain `unknown` until rack setup
109 changes: 93 additions & 16 deletions zone-setup/src/bin/zone-setup.rs
@@ -22,12 +22,14 @@ use uzers::{get_group_by_name, get_user_by_name};
pub const HOSTS_FILE: &str = "/etc/inet/hosts";
pub const CHRONY_CONFIG_FILE: &str = "/etc/inet/chrony.conf";
pub const LOGADM_CONFIG_FILE: &str = "/etc/logadm.d/chrony.logadm.conf";
pub const WICKET_BASEBOARD_FILE: &str = "/var/svc/manifest/site/wicketd/baseboard.json";
pub const ROOT: &str = "root";
pub const SYS: &str = "sys";

pub const COMMON_NW_CMD: &str = "common-networking";
pub const OPTE_INTERFACE_CMD: &str = "opte-interface";
pub const CHRONY_SETUP_CMD: &str = "chrony-setup";
pub const WICKET_SETUP_CMD: &str = "wicket-setup";

fn parse_ip(s: &str) -> anyhow::Result<IpAddr> {
if s == "unknown" {
@@ -72,6 +74,14 @@ fn parse_chrony_conf(s: &str) -> anyhow::Result<String> {
s.parse().map_err(|_| anyhow!("ERROR: Invalid chrony configuration file"))
}

fn parse_wicket_conf(s: &str) -> anyhow::Result<String> {
if s == "" {
return Err(anyhow!("ERROR: Missing baseboard configuration file"));
};

s.parse().map_err(|_| anyhow!("ERROR: Invalid baseboard configuration file"))
}

fn parse_boundary(s: &str) -> anyhow::Result<bool> {
s.parse().map_err(|_| anyhow!("ERROR: Invalid boundary input"))
}
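A note on the parser added above, offered as a sketch rather than a change to the PR: `str::parse::<String>()` cannot fail, so the `map_err` branch in `parse_wicket_conf` should never fire, and the only real check is the empty-string test. The hypothetical `parse_non_empty` below (not a function from this PR) shows the same validation using only the standard library, with a plain `String` error standing in for `anyhow::Error`.

```rust
/// Hypothetical stand-in for `parse_wicket_conf`, written against `std` only.
/// Rejects empty input and otherwise returns the value as an owned `String`.
fn parse_non_empty(s: &str) -> Result<String, String> {
    // `is_empty()` is the idiomatic spelling of `s == ""`.
    if s.is_empty() {
        return Err("ERROR: Missing baseboard configuration file".to_string());
    }

    // Converting `&str` to `String` is infallible, so no error mapping is needed.
    Ok(s.to_owned())
}
```

Calling `parse_non_empty("")` returns an error, while any non-empty input is passed through unchanged.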
@@ -113,11 +123,14 @@ async fn do_run() -> Result<(), CmdError> {
.value_parser(parse_ipv6),
)
.arg(
arg!(
-s --static_addr <Ipv6Addr> "static_addr"
)
Arg::new("static_addrs")
.short('s')
.long("static_addrs")
.num_args(1..)
.value_delimiter(' ')
.value_parser(parse_ipv6)
.help("List of static addresses separated by a space")
.required(true)
.value_parser(parse_ipv6),
),
)
.subcommand(
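To illustrate what the new `static_addrs` argument accepts, here is a minimal sketch of the same idea without clap: one space-delimited string is split and each piece is parsed as an IPv6 address. The helper name `parse_static_addrs` and its error strings are assumptions made for this example; in the PR itself the splitting is delegated to clap's `value_delimiter(' ')` and the per-value parsing to `parse_ipv6`.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: parse a space-separated list of IPv6 addresses.
/// Fails if any entry is not a valid IPv6 address or if the list is empty.
fn parse_static_addrs(raw: &str) -> Result<Vec<Ipv6Addr>, String> {
    let addrs = raw
        .split(' ')
        // Ignore empty pieces produced by repeated spaces.
        .filter(|part| !part.is_empty())
        .map(|part| {
            part.parse::<Ipv6Addr>()
                .map_err(|_| format!("ERROR: Invalid IPv6 address: {part}"))
        })
        .collect::<Result<Vec<_>, _>>()?;

    // Mirrors the intent of `.required(true)` with `.num_args(1..)`.
    if addrs.is_empty() {
        return Err("ERROR: At least one static address is required".to_string());
    }

    Ok(addrs)
}
```

For example, `parse_static_addrs("fd00::1 fd00::2")` yields two addresses, while an empty string or a malformed entry yields an error.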
@@ -145,6 +158,23 @@ async fn do_run() -> Result<(), CmdError> {
.value_parser(parse_ip),
),
)
.subcommand(
Command::new(WICKET_SETUP_CMD)
.about("Sets up Wicket configuration")
.arg(
arg!(
-b --baseboard_file <STRING> "baseboard_file"
)
.default_value(WICKET_BASEBOARD_FILE)
.value_parser(parse_wicket_conf),
)
.arg(
arg!(
-i --baseboard_info <STRING> "baseboard_info"
)
.value_parser(parse_wicket_conf),
),
)
.subcommand(
Command::new(CHRONY_SETUP_CMD)
.about("Sets up Chrony configuration for NTP zone")
@@ -187,6 +217,42 @@ async fn do_run() -> Result<(), CmdError> {
chrony_setup(matches, log.clone()).await?;
}

if let Some(matches) = matches.subcommand_matches(WICKET_SETUP_CMD) {
wicket_setup(matches, log.clone()).await?;
}

Ok(())
}

async fn wicket_setup(
matches: &ArgMatches,
log: Logger,
) -> Result<(), CmdError> {
let file: &String = matches.get_one("baseboard_file").unwrap();
let info: &String = matches.get_one("baseboard_info").unwrap();

info!(&log, "Generating baseboard.json file"; "baseboard file" => ?WICKET_BASEBOARD_FILE);

let mut config_file = OpenOptions::new()
.write(true)
.create(true)
.truncate(true)
.open(file)
.map_err(|err| {
CmdError::Failure(anyhow!(
"Could not create baseboard configuration file {}: {}",
file,
err
))
})?;
config_file.write(info.as_bytes()).map_err(|err| {
CmdError::Failure(anyhow!(
"Could not write to baseboard configuration file {}: {}",
file,
err
))
})?;

Ok(())
}
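Two details of `wicket_setup` may be worth a second look. `Write::write` is allowed to write only part of the buffer, whereas `write_all` keeps going until every byte is written. Also, `baseboard_info` has no default value and is not marked required in the hunk above, so the `unwrap()` on it would presumably panic when the flag is omitted. The sketch below shows the file-writing half with `write_all`, using only the standard library; `write_baseboard_file` is a hypothetical name, and `std::io::Error` stands in for `CmdError`.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper with the same shape as the file handling in
/// `wicket_setup`: create or truncate `path`, then write `info` into it.
fn write_baseboard_file(path: &Path, info: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // `write_all` retries until the whole buffer is written, unlike `write`,
    // which may return after a partial write.
    file.write_all(info.as_bytes())
}
```

A caller would pass the value of `--baseboard_file` as `path` and the value of `--baseboard_info` as `info`, after checking that the latter is actually present.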

@@ -417,7 +483,10 @@ async fn common_nw_set_up(
log: Logger,
) -> Result<(), CmdError> {
let datalink: &String = matches.get_one("datalink").unwrap();
let static_addr: &Ipv6Addr = matches.get_one("static_addr").unwrap();
let static_addrs = matches
.get_many::<Ipv6Addr>("static_addrs")
.unwrap()
.collect::<Vec<_>>();
let gateway: Ipv6Addr = *matches.get_one("gateway").unwrap();
let zonename = zone::current().await.map_err(|err| {
CmdError::Failure(anyhow!(
@@ -436,26 +505,34 @@ async fn common_nw_set_up(
Ipadm::set_interface_mtu(&datalink)
.map_err(|err| CmdError::Failure(anyhow!(err)))?;

info!(&log, "Ensuring static and auto-configured addresses are set on the IP interface"; "data link" => ?datalink, "static address" => ?static_addr);
Ipadm::create_static_and_autoconfigured_addrs(&datalink, static_addr)
.map_err(|err| CmdError::Failure(anyhow!(err)))?;
for addr in &static_addrs {
info!(&log, "Ensuring static and auto-configured addresses are set on the IP interface"; "data link" => ?datalink, "static address" => ?addr);
Ipadm::create_static_and_autoconfigured_addrs(&datalink, addr)
.map_err(|err| CmdError::Failure(anyhow!(err)))?;
}

info!(&log, "Ensuring there is a default route"; "gateway" => ?gateway);
Route::ensure_default_route_with_gateway(Gateway::Ipv6(gateway))
.map_err(|err| CmdError::Failure(anyhow!(err)))?;

info!(&log, "Populating hosts file for zone"; "zonename" => ?zonename);
write(
HOSTS_FILE,
format!(
r#"
let mut hosts_contents = String::from(
r#"
::1 localhost loghost
127.0.0.1 localhost loghost
{static_addr} {zonename}.local {zonename}
"#,
);

for addr in static_addrs.clone() {
let s = format!(
r#"{addr} {zonename}.local {zonename}
"#
),
)
.map_err(|err| CmdError::Failure(anyhow!(err)))?;
);
hosts_contents.push_str(s.as_str())
}

write(HOSTS_FILE, hosts_contents)
.map_err(|err| CmdError::Failure(anyhow!(err)))?;

Ok(())
}
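The hosts-file change above amounts to emitting one `{addr} {zonename}.local {zonename}` line per static address after the two fixed localhost entries. As a sketch of that logic in isolation, the hypothetical `hosts_contents` function below builds the same text with `std` only. The function name is an assumption for illustration; the PR builds the string inline in `common_nw_set_up`.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: build hosts-file contents for a zone that has
/// one or more static addresses.
fn hosts_contents(zonename: &str, static_addrs: &[Ipv6Addr]) -> String {
    // Fixed localhost entries. The leading newline matches the raw string
    // literal in the diff, which starts on the line after `r#"`.
    let mut contents =
        String::from("\n::1 localhost loghost\n127.0.0.1 localhost loghost\n");

    // One entry per static address, borrowing the slice instead of cloning it.
    for addr in static_addrs {
        contents.push_str(&format!("{addr} {zonename}.local {zonename}\n"));
    }

    contents
}
```

Iterating over a borrowed slice avoids the `static_addrs.clone()` seen in the hunk, which a later commit in this PR ("Remove unecessary clone") appears to address.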