Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order specifications for hooks #4168

Merged
merged 12 commits into from
Nov 9, 2020

Conversation

rustyrussell
Copy link
Contributor

@rustyrussell rustyrussell commented Oct 29, 2020

This allows hooks to specify they must be before/after some other plugins. Either by basename or full pathname.

As our plugin ecosystem grows and gets more sophisticated, such conflicts will inevitably occur where multiple plugins want to ensure they are called in certain order.

We could implement a priority number system, but that's usually just window-dressing on what people really want (though, unlike this, would allow you to specify that you must be first, or last: we could add wildcards if we wanted to).

Getting this in now means that after a few releases, plugin authors will simply be able to assume its existence.

Fixes: #4005 (thanks @m-schmoock for pointing that out!)

It needs to use a non-clashing name for the internal var.

Signed-off-by: Rusty Russell <[email protected]>
@m-schmoock
Copy link
Collaborator

This addresses #4005

Copy link
Contributor

@niftynei niftynei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 0c1d19b

doc/PLUGINS.md Outdated
"after" arrays of plugin names, which control what order they will be
called in. If a plugin name is unknown, it is ignored.

if the plugins names are unknown, they are ignored, otherwise if the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this repeats the last line of the previous paragraph? i'd remove one.

json_tok_full_len(t),
json_tok_full(buffer, t));
name = json_strdup(tmpctx, buffer, nametok);
beforetok = json_get_member(buffer, t, "before");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took me a second to fully grasp the implication of 'before' and 'after'. Renaming them to "run_before" and "run_after" might be more immediately obvious?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more obvious in context, than when reading in a patch, e.g.:

"hooks": [
    {"name": "htlc_accepted",
     "before": ["someplugin.py"],
     "after": ["foo.py"]}
]

@rustyrussell rustyrussell added the Optech Make Me Famous! Look! Look! Look! COOL NEW FEATURE! label Oct 29, 2020
The next patch will use these to order the hooks.

Signed-off-by: Rusty Russell <[email protected]>
Changelog-Added: plugins: hooks can now specify that they must be called 'before' or 'after' other plugins.
@rustyrussell rustyrussell force-pushed the guilt/order-for-hooks branch from 0c1d19b to 7dd4c39 Compare October 30, 2020 01:13
@m-schmoock
Copy link
Collaborator

m-schmoock commented Oct 30, 2020

concept ack

@rustyrussell as you wrote in 4005

3. Secondary sort by command line order, then config order.

It is also very helpful to give the user an option for plugin ordering (unless otherwise specified by the devs via before/after) by checking commandline and config order of plugins. The rationale behind is it the following: Before and after list are to be maintained by the plugin developers. Before that happens, a user typically has the problem first and needs to wait for the devs to to make the changes so his specific setup works. Since the devs will often not know all the plugin combinations out in the wild this can take a while or, in the worst case, never happens.

So I think we might also add the commandline/config order as 'last resort/default' if no before/after ordering has been specified for the plugins. For that we need a stable sorting algorithm and feed it with the initial order of plugins from commandline/config.

@rustyrussell
Copy link
Contributor Author

The current implementation has this effect, in fact. Technically it's not quite a stable sort, as dependencies can pull in things out of order, but we could fix that (I doubt anyone will notice, though).

For example, assume plugins in alphabetical order (they're in directory-sort order usually BTW). A has to be before D, and B has to be before C.

In this case, algorithm will go A B D C. Easy to fix though.

@rustyrussell
Copy link
Contributor Author

OK, fixed to make the default ordering equal to the specified one.

Also, removed a problem where we'd warn about missing dependencies since we evaluated them as we loaded. This was more invasive, but I think simplifies the code a little anyway.

@rustyrussell rustyrussell requested a review from niftynei November 2, 2020 02:39
@m-schmoock
Copy link
Collaborator

Thanks @rustyrussell

Super wired, I stress tested your changes a little, but it seems there is an issue with the default commandline/config order (unless otherwise by before/after specified).

How to reproduce

  • check out ece24e5
  • compile
  • apply a test change, that now should work (see below)
  • run the tests in parallel: py.test --count 50 -x -n 16 -v tests/test_plugin.py -k test_openchannel_hook_chaining

What happens

The flaky test can fail. When we look in the logs of l2 we see that the 'reject for a reason' hook is not called but 'reject by design', which should never happen if correct default order would be applied.

The patch for the test that uses the new feature

diff --git a/tests/test_plugin.py b/tests/test_plugin.py
index d2628cd25..9f35a4dc5 100644
--- a/tests/test_plugin.py
+++ b/tests/test_plugin.py
@@ -606,24 +606,15 @@ def test_openchannel_hook_chaining(node_factory, bitcoind):
     l1, l2 = node_factory.line_graph(2, fundchannel=False, opts=opts)
     l1.fundwallet(10**6)
 
-    hook_msg = "openchannel_hook rejects and says '"
     # 100005sat fundchannel should fail fatal() for l2
     # because hook_accepter.py rejects on that amount 'for a reason'
     with pytest.raises(RpcError, match=r'They sent error channel'):
         l1.rpc.fundchannel(l2.info['id'], 100005)
 
-    # Note: hook chain order is currently undefined, because hooks are registerd
-    #       as a result of the getmanifest call, which may take some random time.
-    #       We need to workaround that fact, so test can be stable
-    correct_order = l2.daemon.is_in_log(hook_msg + "reject for a reason")
-    if correct_order:
-        assert l2.daemon.wait_for_log(hook_msg + "reject for a reason")
-        # the other plugin must not be called
-        assert not l2.daemon.is_in_log("reject on principle")
-    else:
-        assert l2.daemon.wait_for_log(hook_msg + "reject on principle")
-        # the other plugin must not be called
-        assert not l2.daemon.is_in_log("reject for a reason")
+    hook_msg = "openchannel_hook rejects and says '"
+    assert l2.daemon.wait_for_log(hook_msg + "reject for a reason")
+    # the third plugin must now not be called anymore
+    assert not l2.daemon.is_in_log("reject on principle")
 
     # 100000sat is good for hook_accepter, so it should fail 'on principle'
     # at third hook openchannel_reject.py

@niftynei niftynei added this to the v0.9.2 milestone Nov 2, 2020
@rustyrussell
Copy link
Contributor Author

Argh, good catch: final patch actually fixes the problem, and for some reason I didn't push it!

@rustyrussell
Copy link
Contributor Author

And I turned your patch into a commit, credited you and appended. Thanks!

@m-schmoock
Copy link
Collaborator

works like a charm

@m-schmoock
Copy link
Collaborator

m-schmoock commented Nov 3, 2020

@rustyrussell regarding IRC chat:

00:22 < rusty> m-schmoock: how do you feel about getting rid of all those single hooks? Or at least as many of them as possible...

Can you elaborate on this?

I was thinking that it would be sufficient to have the before and after lists on a per plugin level and not per hook. It would reduce (configuration) complexity. Having it on a per hook basis brings the option to have two different plugins with two (same) hooks each order their execution differently depending on what hook is called. This seems a bit overkill. Having plugin order rather than hook order would just prioritize one plugin over another, which will likely do the job.

Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor questions for my own understanding, and a tiny python bug, otherwise a very solid PR 👍

lightningd/plugin_hook.c Outdated Show resolved Hide resolved
lightningd/plugin_hook.c Outdated Show resolved Hide resolved
Comment on lines 411 to 412
before: List[str] = [],
after: List[str] = []) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dangerous: never use a mutable instance as the default argument. They'll be shared between invocations.

https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments

This should be before: List[str] = None and then check for None in the body.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice catch, thanks!

Comment on lines +153 to +160
/* If this was last one manifests were waiting for, handle deps */
if (p->plugin_state == AWAITING_GETMANIFEST_RESPONSE)
check_plugins_manifests(p->plugins);

/* If this was the last one init was waiting for, handle cmd replies */
if (p->plugin_state == AWAITING_INIT_RESPONSE)
check_plugins_initted(p->plugins);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a weird place to perform these checks. We look at a random plugin being destroyed and if that was in a specific state we assume it must have been the last one in this state? We should kick this up to the set of plugins and perform the check there after removing the plugin to be destroyed. No need to replicate the last-one-turns-off-the-light test here :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we've already unlisted the plugin at this point. We could, and probably should simply call check_plugins_xxx() unconditionally here.

I wonder if we should have a plugin_set_state() which calls these internally. Less efficient, but clearer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the payment_continue function that is used to continue in a state machine? I'm all for it ^^

Comment on lines 447 to 449
tal_resize(s, s_num + 1);
memmove(*s + pos+1, *s + pos, sizeof(**s) * (s_num - pos));
(*s)[pos] = n;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow why this is working: we're striking dependencies off of the graph node until we finally find that it has no more, then add it to the list. However, before this change we added it to the end of the list, which was always safe (no dependency could jump over us). After this change we suddenly use insertion sort on the index at a potentially earlier position in the list, meaning we could end up being inserted before one of our dependencies. What am I missing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right!

We should instead gather all the things we are going to add from one edge, and order those then add as a group. That means if B and C depend on A, we append A, then sort (B,C) and append those.

I will write a test which triggers this first...

lightningd/plugin_hook.c Outdated Show resolved Hide resolved
@rustyrussell
Copy link
Contributor Author

I ended up reworking and thus simplifying the algo. It's less efficient (O(N^2)) but nobody cares.

Comment on lines +441 to +444
before: List[str] = None,
after: List[str] = None) -> JsonDecoratorType:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now becomes an Optional[List[str]]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for forgetting this in my first round of comments :-)

Comment on lines 434 to 439
if before:
method.before = before
if after:
method.after = after
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should always set these, otherwise we'd have to check with hasattr whether the hook has that field.

method.before = [] if before else before
method.after = [] if after else after

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, mypy didn't like that, so went for an open-coded variant.

struct hook_node {
bool finished;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the semantics of finished? It's only used in sorting so I'd assume it means that it is in its final position.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included in sorted results. I'll add a comment

if (graph[i].num_incoming != 0)
continue;
if (!best
|| best->hook->plugin->index > graph[i].hook->plugin->index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're never moving the plugin in the graph, so under no circumstance should this ever be true (a later iteration should always have a higher index). Since that's the case I think we can remove the index altogether, and break/return after finding a first best

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's making an assumption that graph is in index order. It (I'm fairly sure) starts that way, but can be perturbed by previous calls to plugin_hook_make_ordered. If plugin B says it has to be before A and after C, we'll go C B A. If B gets removed, we should return to A C.

Does anyone care? Probably not, but this way is clear, at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further thought, if we define it as "minimum perturbation to achieve required ordering" which is a totally reasonable goal, we can indeed do this simplification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Nope. Hooks start in "who returned from getmanifest" order which is not the same as "order specified by config" order.

@m-schmoock
Copy link
Collaborator

Hm, I restarted a travis flake which I couldn't reproduce locally also not by running it 100 times and parallel: tests/test_plugin.py::test_plugin_command

It was complaining about unclean node shutdown, but it succeeded upon first restart/retry.

@m-schmoock
Copy link
Collaborator

@rustyrussell still (sorry for bugging):

regarding IRC chat:

00:22 < rusty> m-schmoock: how do you feel about getting rid of all those single hooks? Or at least as many of them as possible...

Can you elaborate on this?

I was thinking that it would be sufficient to have the before and after lists on a per plugin level and not per hook. It would reduce (configuration) complexity. Having it on a per hook basis brings the option to have two different plugins with two (same) hooks each order their execution differently depending on what hook is called. This seems a bit overkill. Having plugin order rather than hook order would just prioritize one plugin over another, which will likely do the job.

@rustyrussell rustyrussell force-pushed the guilt/order-for-hooks branch from 689c8aa to 7defce9 Compare November 9, 2020 04:25
@rustyrussell
Copy link
Contributor Author

@rustyrussell still (sorry for bugging):

regarding IRC chat:

00:22 < rusty> m-schmoock: how do you feel about getting rid of all those single hooks? Or at least as many of them as possible...

Can you elaborate on this?

We should make it so every hook can be registered by multiple plugins.

I was thinking that it would be sufficient to have the before and after lists on a per plugin level and not per hook. It would reduce (configuration) complexity. Having it on a per hook basis brings the option to have two different plugins with two (same) hooks each order their execution differently depending on what hook is called. This seems a bit overkill. Having plugin order rather than hook order would just prioritize one plugin over another, which will likely do the job.

I agree, at the moment. But hooks are the only place where plugins interact, so I'd rather see explicitly what they care about. If we ever get order conflicts, it's easier to see what they want when they have priorities on specific hooks.

@rustyrussell rustyrussell force-pushed the guilt/order-for-hooks branch from 7defce9 to fe6b1e0 Compare November 9, 2020 05:43
@rustyrussell
Copy link
Contributor Author

Trivial rebase to fold in fixups and add a comment on finished var.

And use that to add simple tests.

Signed-off-by: Rusty Russell <[email protected]>
rustyrussell and others added 7 commits November 9, 2020 19:50
Now both python and c libraries are updated, we can officially
deprecate the old form.

Signed-off-by: Rusty Russell <[email protected]>
Changelog-Deprecated: plugins: hooks should now be specified using objects, not raw names.
…rtup.

We fail this, because we check dependencies as they come in.

Signed-off-by: Rusty Russell <[email protected]>
This means we need to stop at this stage even in the runtime-loaded
case.

Signed-off-by: Rusty Russell <[email protected]>
…hange.

Suggested-by: @mschmook
Signed-off-by: Rusty Russell <[email protected]>
We previously registered hooks up in who-replies-to-getmanifest-first
order, but then if any had dependencies it would scatter that order.

This allows users to manually set dependencies developers have
forgotten by specifying the plugins manually in their configuration or
cmdline.  This was an excellent consideration by @mschmook.

Signed-off-by: Rusty Russell <[email protected]>
I think this is what Travis is having an issue with, but it work
fine locally.

Signed-off-by: Rusty Russell <[email protected]>
@rustyrussell rustyrussell force-pushed the guilt/order-for-hooks branch from fe6b1e0 to 7d41997 Compare November 9, 2020 09:26
@m-schmoock
Copy link
Collaborator

00:22 < rusty> m-schmoock: how do you feel about getting rid of all those single hooks? Or at least as many of them as possible...

Can you elaborate on this?

We should make it so every hook can be registered by multiple plugins.

absolutely yes, I missed the point that you were referring to 'unchained' hooks.

@@ -678,14 +678,25 @@ declares that it'd like to be consulted on what to do next for certain
events in the daemon. A hook can then decide how `lightningd` should
react to the given event.

When hooks are registered, they can optionally specify "before" and
"after" arrays of plugin names, which control what order they will be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General question about 'plugin names': Is this same as for cli plugin start/stop commands, meaning either full path (useless for this feature) or basename? If so, we may note that the plugin name includes its extension, meaning full filename.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, filename or full path name.

@m-schmoock
Copy link
Collaborator

ACK 7d41997

@niftynei niftynei merged commit 4745e7e into ElementsProject:master Nov 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Optech Make Me Famous! Look! Look! Look! COOL NEW FEATURE!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Plugin hook ordering
4 participants