Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sled agent] Fakes are better than mocks; get rid of mocks #2422

Closed
7 tasks
smklein opened this issue Feb 24, 2023 · 3 comments · May be fixed by #3442
Closed
7 tasks

[sled agent] Fakes are better than mocks; get rid of mocks #2422

smklein opened this issue Feb 24, 2023 · 3 comments · May be fixed by #3442
Assignees
Labels
Sled Agent Related to the Per-Sled Configuration and Management Testing & Analysis Tests & Analyzers

Comments

@smklein
Copy link
Collaborator

smklein commented Feb 24, 2023

Sled agent uses a lot of mocks to represent interactions with underlying resources. It uses a library named mockall to help with this interaction. We should use fakes instead.

  • svc interface (sled-agent/src/illumos/svc.rs)
  • dladm interface (sled-agent/src/illumos/dladm.rs)
  • fstyp interface (sled-agent/src/illumos/fstyp.rs)
  • zpool interface (sled-agent/src/illumos/zpool.rs)
  • zfs interface (sled-agent/src/illumos/zfs.rs)
  • some free functions (sled-agent/src/illumos/mod.rs)
  • instances (sled-agent/src/instance.rs)

Background

When the sled agent is provisioning a zone, it "mocks" access to zones. This enables unit tests to verify "hey, you would have created a zone here", but in reality, make no such call.

This is okay-ish for some types of unit tests, where we have a module foo which depends on a interface resource:

  • foo can be conditionally compiled with cfg(test) to use the mockResource
  • foo's tests can then use mockall's API to set expectations about calls to the mockResource

This is one such example of a test written in this style.

Problems

This testing strategy really breaks down when it gets nested. Suppose we have a module baz, which depends on foo and bar.

  • foo's dependency on mockResource still needs to be captured, so baz also needs to set up "expectation" calls in tests
  • The same applies to bar, and any other modules which contains mocks
  • This basically leaks implementation details, and makes tests really difficult to write

This is apparent for modules like the storage_manager, where we're interfacing with:

  • ZFS provisioning
  • Zpool provisioning
  • Zone provisioning

and all of them would need to be mocked to actually write tests.

Proposal

We should implement fakes wherever these mocks are being used. This will give us a more flexible interface for testing, and make dependencies on "global-ish resources" much more apparent.

Admittedly, doing so will require providing the interfaces to these modules as up-front objects. I propose doing so with Arc-bound trait objects, rather than generics, to keep things relatively simple. The cost of a single Box + Arc's refcounting should be trivial compared with the cost of "provisioning a filesystem" or "managing a zone".

@smklein smklein added Testing & Analysis Tests & Analyzers Sled Agent Related to the Per-Sled Configuration and Management labels Feb 24, 2023
@smklein smklein self-assigned this Feb 24, 2023
@davepacheco
Copy link
Collaborator

davepacheco commented Feb 24, 2023

Sorry for the ignorant question but what do we mean by "fakes"? I googled it and found https://stackoverflow.com/questions/346372/whats-the-difference-between-faking-mocking-and-stubbing ...which has several different definitions.

@smklein
Copy link
Collaborator Author

smklein commented Feb 24, 2023

My intention was "a struct that lies about implementing the functionality, often by just storing a HashMap of provided arguments".

According to that stackoverflow post, I'd say it's closest to:

Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production

For example:

  • We could define the access to Zpools as a trait, ZpoolInterface. Maybe this contains methods like "get zpool" and "provision zpool".
  • We can implement the real access to zpools as ZpoolAccess, which implements ZpoolInterface. It actually makes calls, either through a native library, or through the CLI, to provision real zpools.
  • We can also implement fake access to zpools as FakeZpoolAccess. This just contains a HashMap<Name of Zpool, Metadata about Zpool> which we maintain in memory. "provision zpool" adds an entry to the map, "get zpool" queries the hashmap.
  • Any clients that depend on interfacing with zpools can act on a dyn ZpoolInterface, to be compatible with either variant.

jgallagher added a commit that referenced this issue Mar 2, 2023
…2451)

This PR should make no substantive changes other than changes to
visibility, imports, etc., associated with moving:

* `sled_agent::hardware` -> `sled-hardware`
* `sled_agent::illumos` -> `illumos-utils`

The primary motivation of these is to allow `installinator` to also use
`sled-hardware` using the same hardware scraping/monitoring logic as
sled-agent. `sled_agent::hardware` had dependencies on
`sled_agent::illumos`, which led to extracting it too.

A handful of supporting but smaller changes:

* `illumos_utils::running_zone` had a dependency on
`sled_agent::opte::Port`, and it didn't seem right to pull all of `opte`
out with `illumos_utils`, so I broke this dependency by adding an
`OptePort` trait and making `RunningZone` and `InstalledZone` generic
over a `Port: OptePort` type.
* I moved `sled_agent::vlan` to `omicron_common::vlan`.
* `sled-agent`'s tests depend on mocks set up in what is now a separate
crate (`illumos_utils`). #2422 tracks replacing the mocks altogether; in
the meantime, 16c9017 is a workaround
that adds a `testing` feature to `illumos_utils` that builds the mocks.
`sled-agent` enables that feature when _it_ is being tested, allowing
the mocks to exist for its use.
smklein added a commit that referenced this issue Jul 10, 2023
…stead (#3427)

- Create a small fake NexusServer within Sled Agent
- Add some utilities for also creating a transient internal DNS server
pointing to the fake Nexus server
- Use both of these servers in the (fairly small) number of Sled Agent
tests
- Hopefully, in the future, we can use this fake to better test the Sled
Agent's interactions with Nexus

Part of #2422
smklein added a commit that referenced this issue Jul 14, 2023
- Removes all references to `rpool/zones`
- Exclusively stores zone filesystems on the U.2, under an encrypted
dataset
- Wipes these datasets from the U.2s when parsing them on sled agent
boot
- Tangentially related: Removes a handful of low-quality Sled Agent
tests, reliant heavily on mock interfaces (see #2422 , though we still
have work to improve this test coverage).

Fixes #3533
@smklein
Copy link
Collaborator Author

smklein commented Mar 7, 2024

Closing largely in favor of the option proposed in #5226

@smklein smklein closed this as completed Mar 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sled Agent Related to the Per-Sled Configuration and Management Testing & Analysis Tests & Analyzers
Projects
None yet
2 participants