[sled agent] Fakes are better than mocks; get rid of mocks #2422

smklein · 2023-02-24T14:11:53Z

Sled agent uses a lot of mocks to represent interactions with underlying resources. It uses a library named mockall to help with this interaction. We should use fakes instead.

svc interface (sled-agent/src/illumos/svc.rs)
dladm interface (sled-agent/src/illumos/dladm.rs)
fstyp interface (sled-agent/src/illumos/fstyp.rs)
zpool interface (sled-agent/src/illumos/zpool.rs)
zfs interface (sled-agent/src/illumos/zfs.rs)
some free functions (sled-agent/src/illumos/mod.rs)
instances (sled-agent/src/instance.rs)

Background

When the sled agent is provisioning a zone, it "mocks" access to zones. This lets unit tests verify "hey, you would have created a zone here" without actually making any such call.

This is okay-ish for some types of unit tests, where we have a module foo which depends on an interface resource:

foo can be conditionally compiled with cfg(test) to use the mockResource
foo's tests can then use mockall's API to set expectations about calls to the mockResource

This is one such example of a test written in this style.

Problems

This testing strategy really breaks down when it gets nested. Suppose we have a module baz, which depends on foo and bar.

foo's dependency on mockResource still needs to be captured, so baz also needs to set up "expectation" calls in tests
The same applies to bar, and any other modules which contain mocks
This basically leaks implementation details, and makes tests really difficult to write

This is apparent for modules like the storage_manager, where we're interfacing with:

ZFS provisioning
Zpool provisioning
Zone provisioning

and all of them would need to be mocked to actually write tests.
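To make the nesting problem concrete, here is a minimal sketch using a hand-rolled expectation mock. This is not mockall's API; `MockResource`, its methods, and the call names are all made up for illustration. The point is only that a test for baz has to know which calls foo and bar make internally, and in which order.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

/// Hypothetical hand-rolled mock: it holds a queue of expected calls and
/// panics when a call arrives that does not match the next expectation.
pub struct MockResource {
    expected: RefCell<VecDeque<String>>,
}

impl MockResource {
    pub fn new() -> Self {
        Self { expected: RefCell::new(VecDeque::new()) }
    }

    /// Registers the next call the test expects to see.
    pub fn expect(&self, call: &str) {
        self.expected.borrow_mut().push_back(call.to_string());
    }

    /// Invoked by code under test; checks the call against the queue.
    pub fn call(&self, call: &str) {
        let next = self.expected.borrow_mut().pop_front();
        assert_eq!(next.as_deref(), Some(call), "unexpected call");
    }

    /// Number of expectations that were never satisfied.
    pub fn remaining(&self) -> usize {
        self.expected.borrow().len()
    }
}

/// foo's use of the resource is an implementation detail (made-up calls).
pub fn foo(resource: &MockResource) {
    resource.call("create_dataset");
    resource.call("create_zone");
}

/// bar has its own implementation detail.
pub fn bar(resource: &MockResource) {
    resource.call("list_zpools");
}

/// baz only composes foo and bar, yet a test for baz must set up
/// expectations for every call that foo and bar make.
pub fn baz(resource: &MockResource) -> &'static str {
    foo(resource);
    bar(resource);
    "done"
}
```

A test for baz needs three `expect` calls, none of which describe baz itself. If foo later reorders or adds a call, the baz test breaks even though baz did not change.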

Proposal

We should implement fakes wherever these mocks are being used. This will give us a more flexible interface for testing, and make dependencies on "global-ish resources" much more apparent.

Admittedly, doing so will require providing the interfaces to these modules as up-front objects. I propose doing so with Arc-bound trait objects, rather than generics, to keep things relatively simple. The cost of a single Box + Arc's refcounting should be trivial compared with the cost of "provisioning a filesystem" or "managing a zone".
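A rough sketch of what "up-front objects" could look like with Arc-bound trait objects. Everything here is an assumption for illustration: the trait names `ZfsInterface` and `ZoneInterface`, the `StorageManager` shape, the `pool/<zone>` dataset naming, and the in-memory fakes are not taken from the actual sled agent code.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical interface to ZFS dataset provisioning.
pub trait ZfsInterface: Send + Sync {
    fn ensure_dataset(&self, name: &str) -> Result<(), String>;
}

/// Hypothetical interface to zone provisioning.
pub trait ZoneInterface: Send + Sync {
    fn install_zone(&self, name: &str, dataset: &str) -> Result<(), String>;
}

/// Dependencies on "global-ish resources" are explicit constructor
/// arguments instead of being swapped in via cfg(test).
pub struct StorageManager {
    zfs: Arc<dyn ZfsInterface>,
    zones: Arc<dyn ZoneInterface>,
}

impl StorageManager {
    pub fn new(zfs: Arc<dyn ZfsInterface>, zones: Arc<dyn ZoneInterface>) -> Self {
        Self { zfs, zones }
    }

    /// Creates a dataset for the zone, then installs the zone on it.
    pub fn provision(&self, zone: &str) -> Result<(), String> {
        let dataset = format!("pool/{}", zone); // made-up naming scheme
        self.zfs.ensure_dataset(&dataset)?;
        self.zones.install_zone(zone, &dataset)
    }
}

/// In-memory fake: remembers which datasets "exist".
#[derive(Default)]
pub struct FakeZfs {
    pub datasets: Mutex<Vec<String>>,
}

impl ZfsInterface for FakeZfs {
    fn ensure_dataset(&self, name: &str) -> Result<(), String> {
        let mut datasets = self.datasets.lock().unwrap();
        if !datasets.iter().any(|d| d.as_str() == name) {
            datasets.push(name.to_string());
        }
        Ok(())
    }
}

/// In-memory fake: remembers (zone, dataset) pairs that were "installed".
#[derive(Default)]
pub struct FakeZones {
    pub zones: Mutex<Vec<(String, String)>>,
}

impl ZoneInterface for FakeZones {
    fn install_zone(&self, name: &str, dataset: &str) -> Result<(), String> {
        self.zones
            .lock()
            .unwrap()
            .push((name.to_string(), dataset.to_string()));
        Ok(())
    }
}
```

A test would build `Arc::new(FakeZfs::default())` and `Arc::new(FakeZones::default())`, pass clones into `StorageManager::new`, call `provision`, and then inspect the fakes' state afterwards, rather than declaring expected calls up front.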

davepacheco · 2023-02-24T17:11:13Z

Sorry for the ignorant question but what do we mean by "fakes"? I googled it and found https://stackoverflow.com/questions/346372/whats-the-difference-between-faking-mocking-and-stubbing ...which has several different definitions.

smklein · 2023-02-24T17:22:27Z

My intention was "a struct that lies about implementing the functionality, often by just storing a HashMap of provided arguments".

According to that stackoverflow post, I'd say it's closest to:

Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production

For example:

We could define the access to Zpools as a trait, ZpoolInterface. Maybe this contains methods like "get zpool" and "provision zpool".
We can implement the real access to zpools as ZpoolAccess, which implements ZpoolInterface. It actually makes calls, either through a native library, or through the CLI, to provision real zpools.
We can also implement fake access to zpools as FakeZpoolAccess. This just contains a HashMap<Name of Zpool, Metadata about Zpool> which we maintain in memory. "provision zpool" adds an entry to the map, "get zpool" queries the hashmap.
Any clients that depend on interfacing with zpools can act on a dyn ZpoolInterface, to be compatible with either variant.
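A minimal sketch of the zpool example above. The names `ZpoolInterface` and `FakeZpoolAccess` come from the description; the method signatures, the `ZpoolInfo` metadata struct, the error type, and the `ensure_zpool` client are assumptions made up for illustration. The real `ZpoolAccess` is omitted because it would shell out to the CLI or call a native library.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Metadata about a zpool (the single field is hypothetical).
#[derive(Clone, Debug, PartialEq)]
pub struct ZpoolInfo {
    pub size_bytes: u64,
}

/// Access to zpools, implemented by both the real and the fake variant.
pub trait ZpoolInterface: Send + Sync {
    fn provision_zpool(&self, name: &str, size_bytes: u64) -> Result<(), String>;
    fn get_zpool(&self, name: &str) -> Option<ZpoolInfo>;
}

/// Fake access to zpools: a map of name -> metadata kept in memory.
#[derive(Default)]
pub struct FakeZpoolAccess {
    pools: Mutex<HashMap<String, ZpoolInfo>>,
}

impl ZpoolInterface for FakeZpoolAccess {
    /// "provision zpool" adds an entry to the map.
    fn provision_zpool(&self, name: &str, size_bytes: u64) -> Result<(), String> {
        let mut pools = self.pools.lock().unwrap();
        if pools.contains_key(name) {
            return Err(format!("zpool {} already exists", name));
        }
        pools.insert(name.to_string(), ZpoolInfo { size_bytes });
        Ok(())
    }

    /// "get zpool" queries the map.
    fn get_zpool(&self, name: &str) -> Option<ZpoolInfo> {
        self.pools.lock().unwrap().get(name).cloned()
    }
}

/// A client that acts on `dyn ZpoolInterface`, so it is compatible with
/// either the real or the fake implementation.
pub fn ensure_zpool(
    zpools: &dyn ZpoolInterface,
    name: &str,
    size_bytes: u64,
) -> Result<ZpoolInfo, String> {
    if let Some(info) = zpools.get_zpool(name) {
        return Ok(info);
    }
    zpools.provision_zpool(name, size_bytes)?;
    zpools
        .get_zpool(name)
        .ok_or_else(|| format!("zpool {} missing after provisioning", name))
}
```

Unlike a mock, the fake has real (if simplified) behavior: provisioning the same name twice fails, and a second `ensure_zpool` call returns the existing pool. Tests can assert on outcomes without listing the calls they expect.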

…2451)

This PR should make no substantive changes other than changes to visibility, imports, etc., associated with moving:

* `sled_agent::hardware` -> `sled-hardware`
* `sled_agent::illumos` -> `illumos-utils`

The primary motivation of these is to allow `installinator` to also use `sled-hardware` using the same hardware scraping/monitoring logic as sled-agent. `sled_agent::hardware` had dependencies on `sled_agent::illumos`, which led to extracting it too.

A handful of supporting but smaller changes:

* `illumos_utils::running_zone` had a dependency on `sled_agent::opte::Port`, and it didn't seem right to pull all of `opte` out with `illumos_utils`, so I broke this dependency by adding an `OptePort` trait and making `RunningZone` and `InstalledZone` generic over a `Port: OptePort` type.
* I moved `sled_agent::vlan` to `omicron_common::vlan`.
* `sled-agent`'s tests depend on mocks set up in what is now a separate crate (`illumos_utils`). #2422 tracks replacing the mocks altogether; in the meantime, 16c9017 is a workaround that adds a `testing` feature to `illumos_utils` that builds the mocks. `sled-agent` enables that feature when _it_ is being tested, allowing the mocks to exist for its use.

…stead (#3427)

- Create a small fake NexusServer within Sled Agent
- Add some utilities for also creating a transient internal DNS server pointing to the fake Nexus server
- Use both of these servers in the (fairly small) number of Sled Agent tests
- Hopefully, in the future, we can use this fake to better test the Sled Agent's interactions with Nexus

Part of #2422

- Removes all references to `rpool/zones`
- Exclusively stores zone filesystems on the U.2, under an encrypted dataset
- Wipes these datasets from the U.2s when parsing them on sled agent boot
- Tangentially related: Removes a handful of low-quality Sled Agent tests, reliant heavily on mock interfaces (see #2422, though we still have work to improve this test coverage).

Fixes #3533

smklein · 2024-03-07T23:04:16Z

Closing largely in favor of the option proposed in #5226

smklein added the Testing & Analysis (Tests & Analyzers) and Sled Agent (Related to the Per-Sled Configuration and Management) labels Feb 24, 2023

smklein self-assigned this Feb 24, 2023

jgallagher mentioned this issue Feb 28, 2023

Extract sled-hardware and illumos-utils crates from sled-agent #2451

Merged

This was referenced Jun 26, 2023

[sled-agent] Avoid mocking "NexusClient" - use a fake Nexus server instead #3427

Merged

[sled-agent] Create an "Executor", which intercepts requests through std::process::Command #3442

Open

smklein mentioned this issue Jul 11, 2023

[sled agent] Store zone filesystems on U.2s, not the ramdisk #3557

Merged

smklein mentioned this issue Jul 31, 2023

Unit tests for DumpSetup #3788

Merged

smklein mentioned this issue Mar 7, 2024

Sled Agent x Falcon: Use VMMs for Sled Agent testing #5226

Open

9 tasks

smklein closed this as completed Mar 7, 2024
