Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better cleanup of networking resources #1352

Merged
merged 5 commits into from
Jul 8, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 6 additions & 8 deletions docs/how-to-run.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ Any additional prerequisite software may be installed with the following script:
$ ./tools/install_prerequisites.sh
----

=== Make me a Gimlet!
=== Make (or unmake) me a Gimlet!

The sled agent expects to manage a real Gimlet. However, until those are built,
developers generally make do with something else, usually a commodity machine.
Expand All @@ -35,13 +35,11 @@ file-based ZFS vdevs and ZFS zpools on top of those, and a couple of VNICs. The
vdevs model the actual U.2s that will be in a Gimlet, and the VNICs model the
two Chelsio NIC ports.

You can clean up these resources with `./tools/destroy_virtual_hardware.sh`. Be
aware that this is a best effort script. If the sled agent or other Omicron
zones are still running, the zpools cannot be deleted. The script will warn you
about the things it cannot remove, so you can do so yourself. Also note that all
the networking bits are temporary, so a reboot should always clear them.

Both scripts must be run as root, e.g, `pfexec ./tools/create_virtual_hardware.sh`.
You can clean up these resources with `./tools/destroy_virtual_hardware.sh`.
This script requires Omicron be uninstalled, e.g., with `pfexec
./target/release/omicron-package uninstall`, and a warning will be printed if
that is not the case. The script will then remove the file-based vdevs and the
VNICs created by `create_virtual_hardware.sh`.

== Deploying Omicron

Expand Down
3 changes: 3 additions & 0 deletions package/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,9 @@ reqwest = { version = "0.11", default-features = false, features = ["rustls-tls"
ring = "0.16"
serde = { version = "1.0", features = [ "derive" ] }
serde_derive = "1.0"
slog = { version = "2.5", features = [ "max_level_trace", "release_max_level_debug"] }
slog-async = "2.7"
slog-term = "2.9"
smf = "0.2"
tar = "0.4"
tempfile = "3.3"
Expand Down
67 changes: 45 additions & 22 deletions package/src/bin/omicron-package.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ use clap::{Parser, Subcommand};
use futures::stream::{self, StreamExt, TryStreamExt};
use indicatif::{MultiProgress, ProgressBar, ProgressStyle};
use omicron_package::{parse, BuildCommand, DeployCommand};
use omicron_sled_agent::cleanup_networking_resources;
use omicron_sled_agent::zone;
use omicron_zone_package::config::Config;
use omicron_zone_package::config::ExternalPackage;
Expand All @@ -17,6 +18,10 @@ use omicron_zone_package::package::Package;
use omicron_zone_package::package::Progress;
use rayon::prelude::*;
use ring::digest::{Context as DigestContext, Digest, SHA256};
use slog::info;
use slog::o;
use slog::Drain;
use slog::Logger;
use std::env;
use std::fs::create_dir_all;
use std::path::{Path, PathBuf};
Expand Down Expand Up @@ -298,6 +303,7 @@ fn do_install(
config: &Config,
artifact_dir: &Path,
install_dir: &Path,
log: &Logger,
) -> Result<()> {
create_dir_all(&install_dir).map_err(|err| {
anyhow!("Cannot create installation directory: {}", err)
Expand All @@ -323,10 +329,11 @@ fn do_install(

let src = tarfile.as_path();
let dst = install_dir.join(src.strip_prefix(artifact_dir)?);
println!(
"Installing Service: {} -> {}",
src.to_string_lossy(),
dst.to_string_lossy()
info!(
log,
"Installing service";
"src" => %src.to_string_lossy(),
"dst" => %dst.to_string_lossy(),
);
std::fs::copy(&src, &dst)?;
Ok(())
Expand All @@ -346,10 +353,11 @@ fn do_install(
for service_name in global_zone_service_names {
let tar_path = install_dir.join(format!("{}.tar", service_name));
let service_path = install_dir.join(service_name);
println!(
"Unpacking {} to {}",
tar_path.to_string_lossy(),
service_path.to_string_lossy()
info!(
log,
"Unpacking service tarball";
"tar_path" => %tar_path.to_string_lossy(),
"service_path" => %service_path.to_string_lossy(),
);

let tar_file = std::fs::File::open(&tar_path)?;
Expand All @@ -366,8 +374,9 @@ fn do_install(
.join(&package.service_name)
.join("pkg")
.join("manifest.xml");
println!(
"Installing bootstrap service from {}",
info!(
log,
"Installing boostrap service from {}",
manifest_path.to_string_lossy()
);
smf::Config::import().run(&manifest_path)?;
Expand Down Expand Up @@ -425,7 +434,11 @@ fn remove_all_unless_already_removed<P: AsRef<Path>>(path: P) -> Result<()> {
Ok(())
}

fn remove_all_except<P: AsRef<Path>>(path: P, to_keep: &[&str]) -> Result<()> {
fn remove_all_except<P: AsRef<Path>>(
path: P,
to_keep: &[&str],
log: &Logger,
) -> Result<()> {
let dir = match path.as_ref().read_dir() {
Ok(dir) => dir,
Err(e) if e.kind() == std::io::ErrorKind::NotFound => return Ok(()),
Expand All @@ -434,9 +447,9 @@ fn remove_all_except<P: AsRef<Path>>(path: P, to_keep: &[&str]) -> Result<()> {
for entry in dir {
let entry = entry?;
if to_keep.contains(&&*(entry.file_name().to_string_lossy())) {
println!(" Keeping: '{}'", entry.path().to_string_lossy());
info!(log, "Keeping: '{}'", entry.path().to_string_lossy());
} else {
println!(" Removing: '{}'", entry.path().to_string_lossy());
info!(log, "Removing: '{}'", entry.path().to_string_lossy());
if entry.metadata()?.is_dir() {
remove_all_unless_already_removed(entry.path())?;
} else {
Expand All @@ -447,27 +460,32 @@ fn remove_all_except<P: AsRef<Path>>(path: P, to_keep: &[&str]) -> Result<()> {
Ok(())
}

fn do_uninstall(
async fn do_uninstall(
config: &Config,
artifact_dir: &Path,
install_dir: &Path,
log: &Logger,
) -> Result<()> {
println!("Removing all Omicron zones");
info!(log, "Removing all Omicron zones");
uninstall_all_omicron_zones()?;
println!("Uninstalling all packages");
info!(log, "Uninstalling all packages");
uninstall_all_packages(config);
println!("Removing artifacts in: {}", artifact_dir.to_string_lossy());
info!(log, "Removing artifacts in: {}", artifact_dir.to_string_lossy());

const ARTIFACTS_TO_KEEP: &[&str] =
&["clickhouse", "cockroachdb", "xde", "console-assets", "downloads"];
remove_all_except(artifact_dir, ARTIFACTS_TO_KEEP)?;
remove_all_except(artifact_dir, ARTIFACTS_TO_KEEP, log)?;

println!(
info!(
log,
"Removing installed objects in: {}",
install_dir.to_string_lossy()
);
const INSTALLED_OBJECTS_TO_KEEP: &[&str] = &["opte"];
remove_all_except(install_dir, INSTALLED_OBJECTS_TO_KEEP)?;
remove_all_except(install_dir, INSTALLED_OBJECTS_TO_KEEP, log)?;

cleanup_networking_resources(log).await?;

Ok(())
}

Expand Down Expand Up @@ -545,6 +563,11 @@ async fn main() -> Result<()> {
let args = Args::try_parse()?;
let config = parse::<_, Config>(&args.manifest)?;

let decorator = slog_term::TermDecorator::new().build();
let drain = slog_term::CompactFormat::new(decorator).build().fuse();
let drain = slog_async::Async::new(drain).build().fuse();
let log = slog::Logger::root(drain, o!());

// Use a CWD that is the root of the Omicron repository.
if let Ok(manifest) = env::var("CARGO_MANIFEST_DIR") {
let manifest_dir = PathBuf::from(manifest);
Expand All @@ -561,13 +584,13 @@ async fn main() -> Result<()> {
artifact_dir,
install_dir,
}) => {
do_install(&config, &artifact_dir, &install_dir)?;
do_install(&config, &artifact_dir, &install_dir, &log)?;
}
SubCommand::Deploy(DeployCommand::Uninstall {
artifact_dir,
install_dir,
}) => {
do_uninstall(&config, &artifact_dir, &install_dir)?;
do_uninstall(&config, &artifact_dir, &install_dir, &log).await?;
}
}

Expand Down
2 changes: 1 addition & 1 deletion sled-agent/src/common/underlay.rs
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ pub fn find_nics() -> Result<Vec<AddrObject>, Error> {
Ok(addr_objs)
}

fn find_chelsio_links() -> Result<Vec<PhysicalLink>, Error> {
pub(crate) fn find_chelsio_links() -> Result<Vec<PhysicalLink>, Error> {
// TODO-correctness: This should eventually be determined by a call to
// `Dladm` to get the real Chelsio links on a Gimlet. These will likely be
// called `cxgbeN`, but we explicitly call them `netN` to be clear that
Expand Down
45 changes: 45 additions & 0 deletions sled-agent/src/illumos/dladm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@

use crate::common::vlan::VlanID;
use crate::illumos::vnic::VnicKind;
use crate::illumos::zone::IPADM;
use crate::illumos::{execute, ExecutionError, PFEXEC};
use omicron_common::api::external::MacAddr;
use serde::{Deserialize, Serialize};
Expand Down Expand Up @@ -153,6 +154,50 @@ impl Dladm {
Ok(EtherstubVnic(ETHERSTUB_VNIC_NAME.to_string()))
}

// Return the name of the IP interface over the etherstub VNIC, if it
// exists.
fn get_etherstub_vnic_interface() -> Result<String, ExecutionError> {
let mut cmd = std::process::Command::new(PFEXEC);
let cmd = cmd.args(&[
IPADM,
"show-if",
"-p",
"-o",
"IFNAME",
ETHERSTUB_VNIC_NAME,
]);
execute(cmd)?;
Ok(ETHERSTUB_VNIC_NAME.to_string())
}

/// Delete the VNIC over the inter-zone comms etherstub.
pub(crate) fn delete_etherstub_vnic() -> Result<(), ExecutionError> {
// It's not clear why, but this requires deleting the _interface_ that's
// over the VNIC first. Other VNICs don't require this for some reason.
if Self::get_etherstub_vnic_interface().is_ok() {
let mut cmd = std::process::Command::new(PFEXEC);
let cmd = cmd.args(&[IPADM, "delete-if", ETHERSTUB_VNIC_NAME]);
execute(cmd)?;
}

if Self::get_etherstub_vnic().is_ok() {
let mut cmd = std::process::Command::new(PFEXEC);
let cmd = cmd.args(&[DLADM, "delete-vnic", ETHERSTUB_VNIC_NAME]);
execute(cmd)?;
}
Ok(())
}

/// Delete the inter-zone comms etherstub.
pub(crate) fn delete_etherstub() -> Result<(), ExecutionError> {
if Self::get_etherstub().is_ok() {
let mut cmd = std::process::Command::new(PFEXEC);
let cmd = cmd.args(&[DLADM, "delete-etherstub", ETHERSTUB_NAME]);
execute(cmd)?;
}
Ok(())
}

/// Returns the name of the first observed physical data link.
pub fn find_physical() -> Result<PhysicalLink, FindPhysicalLinkError> {
let mut command = std::process::Command::new(PFEXEC);
Expand Down
2 changes: 1 addition & 1 deletion sled-agent/src/illumos/zone.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ use crate::illumos::{execute, PFEXEC};
use omicron_common::address::SLED_PREFIX;

const DLADM: &str = "/usr/sbin/dladm";
const IPADM: &str = "/usr/sbin/ipadm";
pub const IPADM: &str = "/usr/sbin/ipadm";
pub const SVCADM: &str = "/usr/sbin/svcadm";
pub const SVCCFG: &str = "/usr/sbin/svccfg";
pub const ZLOGIN: &str = "/usr/sbin/zlogin";
Expand Down
1 change: 1 addition & 0 deletions sled-agent/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ mod storage_manager;
mod updates;

pub use illumos::zone;
pub use sled_agent::cleanup_networking_resources;

#[cfg(test)]
mod mocks;
Expand Down
Loading