-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add methods for inspecting Saga DAG #67
Conversation
src/dag.rs
Outdated
// | ||
// This constant helps callers iterate over the saga DAG while safely | ||
// ignoring these indices. | ||
const START_AND_END_NODES: usize = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried this abstraction exposes implementation details that could reasonably change in the future. Dumb examples: we don't have to have "start" and "end" nodes, and we could choose to insert other kinds of internal nodes.
What about providing an iterator over the nodes instead? We could have the iterator skip various internal nodes if we want (e.g., "start", "end").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, this constant is not pub
- I just wanted to be really explicit about the magic "2" number.
If you'd prefer for me to expose an iterator, I can, but it seems kinda equivalent to say "you can access indices [0, N)" (where "N" is what we expose for node count") vs "here is an iterator that returns the integers [0, N)". Either way, the thing exposed in the pub
interface does not expose the start / end nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call about the const
not being pub
. That does help.
It still seems a little brittle to me. I think this change creates an implicit dependency that saga_dag.start_node == saga_dag.graph.node_count() - 2
and saga_dag.end_node == saga_dag.graph.node_count() - 1
, as well as that there are no other internal nodes that we might want to hide from users. Previously we've avoided the assumption about the node indexes -- that's why saga_dag
have start_node
and end_node
fields (rather than assuming what indexes those nodes would be). The indexes these nodes get depends on what order you add them to the graph, which I think changed relatively recently.
Relatedly, I suspect we probably should hide SubsagaStart
nodes from users: they're not actions and they don't produce any output (and I don't think it makes sense for them to fail). This has the advantage that the name
wouldn't need to be an Option
. All of the user-visible node types have names.
It feels like an iterator would be a lot cleaner but maybe I'm missing a downside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, in cf98df1 I implemented this as an iterator and updated the test. LMK what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot!
InternalNode
is only sort of exposed today. I don't think it's supposed to be pub
and it doesn't appear exported by src/lib.rs so I think a consumer couldn't name the type produced by this iterator. That seems not ideal. We also don't really want to expose it -- "internal" here does reflect "internal to Steno", as opposed to Node
.
I'm not sure what'd be better though, short of a whole other "node" type here.
|
Builds on #1835 and oxidecomputer/steno#67 Adds a test for the "instance create" saga's unwind safety by: - Using a saga which attempts to create a network interface, allocate an external IP, and attach a disk - Failing after *every single intermediate node* in the saga - Verifying that no intermediate resources remain after any of the attempted executions
Provides a simple API for instructing a node to "execute twice". This provides a "bare-minimum" helper API for testing idempotency within a saga. When combined with #67 - which was used to test unwind safety - it should be possible to test that all actions / undo actions within a saga are idempotent, at least across being called twice. Part of #31
This PR adds some methods for making DAG access easier.
For example, to test "injecting failures on each node in a saga", one could iterate on nodes from
0..dag.get_node_count()
, and invoke the SEC'ssaga_inject_error
method.Additionally, to query for the name / label information of that node, a caller could invoke
dag.get_node_name(...)
ordag.get_node_label(...)
.I've validated that these methods provide useful info in oxidecomputer/omicron#1843 - let me know if you want me to add more specific tests for anything.