Events triggered by actions #17

neighthan · 2022-06-22T15:25:20Z

In your paper, it says "effects become instantaneously available to events, without the epsilon separation." I assumed this was true for the effects of actions as well as events; is that correct? If so, SMTPlan doesn't seem to model this correctly. E.g. if I run SMTPlan with -c 3 then I get variables like (x)0_0, (x)0_1, and (x)0_2. The effects of actions are all on the (x)0_2 variables, and the preconditions of the actions are required on the (x)0_1 variables. So no matter how large I set c, I can't get events which occur simultaneously (in the same happening) but after actions. Unless I've missed something?

Is there any way to work around this? I could change the separation constraints between happenings to just be di >= 0 instead of di >= 1 / 10. This seems like it could fix things for events that are simultaneous with but after actions (by having the next happening occur immediately, with di == 0), though I haven't tried it, but it would remove the epsilon separation that there's supposed to be for actions ("the values of such instantaneous effects [of actions] can be exploited to support other actions only after a small amount of time"). I haven't read (Fox & Long, 2003); are there caveats to removing this separation? Is it just there for practical reasons (e.g. easier to write planning algorithms) or something more critical?

neighthan · 2022-06-22T16:10:01Z

After a little more thought, I don't think allowing the happenings to have zero durations would solve this problem either (the preconditions of the event would already be satisfied, so there wouldn't be a zero-crossing to force the next happening to occur at the same time as the previous one). I think the easiest solution then would be to have actions occur at the start of happenings instead of at the end. This has its own issues (some valid plans won't be possible anymore because the action might depend on events happening first), but it is safer (invalid plans where events fail to occur after actions won't be possible, at least with a large enough bound on the event cascade). At the cost of longer planning times, I suppose you could also have events both before and after actions within a happening; that would also introduce another hyperparameter, since you'd need to bound the pre- and post-action event cascades.

m312z · 2022-06-23T11:24:31Z

Hello Nathan,

It's been a while since I've looked at the code, but I thought that actions checked for preconditions in the first "layer" of the happening (x)0_0, and applied their effects in the second (x)0_1, and then events chain from there. This is what the paper suggests (H9-10), but the implementation could be wrong there.

I don't think this would cause plans to not be possible for the reason you suggest: if an action requires an event to happen before the action is applied, they would be in separate happenings, with epsilon between them. I found the best place to get a description of the semantics of plan execution was in the VAL papers, where the steps for checking the validity of a happening are described in detail.

One reason for epsilon separation is practical - you cannot observe that event "e" has occurred and simultaneously execute action "a". There will always be some time taken to perform the turnaround of observation/action. For example, if the plan [a1,a2] can only be successfully executed with an epsilon separation of 0.01, but it takes the dispatching code 0.1 seconds to see that the first action has been completed and send the next, then it is not possible to execute unless you could see 0.09 into the future. Setting epsilon to zero would require some amount of precognition.

Best regards,
Michael
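The dispatch-latency arithmetic in the comment above can be written out as a minimal sketch. The function name `required_lookahead` is hypothetical and is not part of SMTPlan or any dispatcher; exact fractions are used only to avoid floating-point noise in the subtraction.

```python
from fractions import Fraction


def required_lookahead(epsilon, dispatch_latency):
    """How far ahead a dispatcher would have to 'see' to honour the plan.

    Hypothetical helper: if the turnaround of observing one action and
    sending the next takes longer than the separation the plan relies on,
    the shortfall is the amount of precognition needed.
    """
    return max(Fraction(0), dispatch_latency - epsilon)


# Values from the comment: epsilon = 0.01, dispatch turnaround = 0.1 seconds.
shortfall = required_lookahead(Fraction(1, 100), Fraction(1, 10))
print(shortfall)         # 9/100
print(float(shortfall))  # 0.09

# With epsilon at least as large as the turnaround, no lookahead is needed.
print(required_lookahead(Fraction(1, 10), Fraction(1, 10)))  # 0
```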

DerekLong101 · 2022-06-23T12:07:47Z

Michael has already commented on this, but I just wanted to add to the discussion of separation of application of actions from the happenings that make them true. Firstly, I'd like to be precise about use of language - happenings never have duration. They are instantaneous state transitions. I find it helpful to think of things in the form of timed automata (which is also the basis of the semantics) - in a finite state machine, the world is always in some state (one of the states of the FSM), and transitions between states take zero time. So how does time pass? By the world resting in some state. I like to imagine this as a third dimension to an FSM laid out in 2 dimensions - the states can be imagined as columns progressing in time, upwards (for convenience). So, a transition will exit from one of these columns at a time (a position up the column) and enter another column at the same level (same time). In timed automata, flow conditions can be active in these states - numeric variables can be changing under some continuous process effect defined by differential equations - and if one of these variables passes a critical threshold it can be that the state is no longer applicable and the state transitions.

So, there are some issues we need to consider. The first is that I would like to be able to say, at any time point, whether a proposition P is true or false. If I apply an action in a state in which P is true, and it deletes P, is P true or false at the instant of application? The resolution Maria Fox and I proposed in the paper (the one you have not read 🙂 ) is that preconditions must be true in an open interval ending at the point of application and then the effect is true at the point of application. So, P is true in an open interval before application of the action and false at the instant of application.

Now, a second issue: if I want to apply an action that has precondition (not P), when can I apply it? You might say "immediately on deletion of P". But the requirement that the precondition be satisfied for an open interval before applying the action means that there must be a non-zero separation between the application of the first action and the application of the second. This can clearly be any value > 0.

However, now a third issue: we defined the language as part of a competition in which we wanted to compare plan quality. To do that, we have to compare plans - and we wanted plans with short total duration. But, now, if I plan to separate the two actions by d and you produce the exact same plan and separate them by d/2, you win, but that decision was arbitrary. We wanted to avoid that problem and so placed a bound on how small d could be. In practical terms, no plan validation will be able to work with arbitrarily small values of d, because numeric accuracy will defeat us (or storage capacity if we use arbitrary precision numbers).

There is a new complication: it is reasonable that there is some choice to be made about how to position actions on the timeline, while respecting this requirement for separation, because we assume that actions represent a choice to act on the part of some executive. But events in the world don't appear to behave like that. At least intuitively, if the condition is met, then the event triggers immediately - no choice. There is no pause while the world decides when to trigger it. That does indeed cause some difficulties - it conflicts with our goal to be able to say what the status of a condition is at any time (and to see how the intuition leads to paradoxes, consider a light that is automatically controlled by a light sensitive sensor - it comes on when the sensor detects that it is dark, but goes off when the sensor detects it is light - if we place the sensor next to the light, what state will it be in?). This paradox and those like it caused us some difficulty (and still do!).

I think that it is interesting that physics identifies Planck time - a shortest measurable unit of time. From a practical point of view, I think we recognise that a paradox like the light is only a paradox in an idealised world - in the real world, sensors do take time to react; lights do take time to switch state. Time passes as the light reacts to the sensor - does the light end up on or off - well probably we end up in a state where the bulb is destroyed by the rapid cycling, or, more likely, the cycling settles to a steady state that is on a period that sustains an average light level corresponding to the threshold of sensitivity for the sensor to say the light is on or off. This is far more complex than we want for our planning models and far too difficult to model. And, most importantly, it simply does not correspond to the way we reason about the world using our (admittedly naive) models of the way that events and processes work, yet that reasoning is what we are really interested in capturing.

This leads to the kind of difficulties I have already alluded to - and we have some fairly arbitrary ways to fix them (e.g. limiting the length of chains of events that cascade from a state change). However, as you can see, there is a fundamental reason why we think events and actions are different and that actions, which are under the control of an executive, are best modelled as requiring the precondition to be true for some non-zero period before they can be executed. If we did not require that, in what sense could one say that the precondition of an action must be satisfied before the action is applied? If it ends up being possible to apply an action that deletes P and another that reasserts P at the same instant, then was P true, false, or undefined at that time? What is the state of P afterwards? How do I ensure that is the case if I actually execute this plan? How would I execute it? I hope these comments help to motivate our thinking in this area.
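The light-and-sensor paradox and the "limit the length of the event chain" fix can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the state is a plain dictionary, events are hypothetical (condition, effect) pairs, and one loop iteration stands in for one layer of a cascade. It is not the PDDL+ semantics nor SMTPlan's encoding.

```python
def run_cascade(state, events, bound):
    """Apply triggered events layer by layer, at most `bound` layers.

    Returns (final_state, settled). `settled` is False when some event
    condition still holds once the bound is exhausted, i.e. the cascade
    was cut off rather than reaching a fixed point.
    """
    for _ in range(bound):
        triggered = [effect for condition, effect in events if condition(state)]
        if not triggered:
            return state, True
        for effect in triggered:
            state = effect(state)
    return state, not any(condition(state) for condition, _ in events)


# Sensor placed next to the light: dark switches it on, light switches it off.
OSCILLATING = [
    (lambda s: not s["lit"], lambda s: {**s, "lit": True}),
    (lambda s: s["lit"], lambda s: {**s, "lit": False}),
]
# A well-behaved variant with only the "switch on when dark" event.
STABLE = OSCILLATING[:1]

print(run_cascade({"lit": False}, STABLE, 3))       # ({'lit': True}, True)
print(run_cascade({"lit": False}, OSCILLATING, 3))  # ({'lit': True}, False)
```

In the oscillating case the final value depends only on whether the bound is odd or even, which is one way to see why such a bound is an arbitrary fix rather than an answer to the paradox.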

neighthan · 2022-07-14T15:16:56Z

Thanks for your responses, Michael and Derek.

To the point of why an epsilon separation is required at all, you explained that well and I see the reasoning behind it for a real system (with sensing and processing delays).

> I don't think this would cause plans to not be possible for the reason you suggest: if an action requires an event to happen before the action is applied, they would be in separate happenings, with epsilon between them.

I was thinking that there may be a case where a plan only works if the action could be taken at the instant the event occurs, but I agree this doesn't really have practical value since you can't ever be that precise, and you could adjust epsilon to however precise you think you could be.

However, I still think there's an issue in the codebase for the preconditions / effects of actions.

> It's been a while since I've looked at the code, but I thought that actions checked for preconditions in the first "layer" of the happening (x)0_0, and applied their effects in the second (x)0_1, and then events chain from there. This is what the paper suggests (H9-10), but the implementation could be wrong there.

This is what I thought as well, which is why I was surprised. My original example is a bit more complicated than would be worth looking at, but here's a simpler one:

domain.pddl

(define (domain test)

(:requirements :strips :numeric-fluents :negative-preconditions)

(:predicates
  (stopped)
)

(:action go
  :parameters ()
  :precondition (stopped)
  :effect (and
    (not (stopped))
  )
)

)

problem.pddl

(define (problem test1) (:domain test)

(:init
  (stopped)
)

(:goal (and
  (not (stopped))
))

)

If I pass these to SMTPlan with 2 happenings and a bound of 3 for the "event cascade" (not sure if that's the right terminology), I get back the following constraints (I parsed the raw smtlib output of SMTPlan with z3 so it's more legible):

t0 == 0,
d0 >= 1/10,
t1 == t0 + d0,
t1 > t0,
d1 >= 1/10,
(stopped)0_0,
(go)0_dur == 0,
(go)1_dur == 0,
Implies((go)0_sta, Not((stopped)0_3)),
Implies((go)0_sta, (stopped)0_2),
Implies((go)1_sta, Not((stopped)1_3)),
Implies((go)1_sta, (stopped)1_2),
Implies((stopped)0_1, Or((stopped)0_0)),
Implies(Not((stopped)0_1), Or(Not((stopped)0_0))),
Implies((stopped)0_2, Or((stopped)0_1)),
Implies(Not((stopped)0_2), Or(Not((stopped)0_1))),
Implies((stopped)0_3, Or((stopped)0_2)),
Implies(Not((stopped)0_3), Or(Not((stopped)0_2), (go)0_sta)),
Implies((stopped)1_1, Or((stopped)1_0)),
Implies(Not((stopped)1_1), Or(Not((stopped)1_0))),
Implies((stopped)1_2, Or((stopped)1_1)),
Implies(Not((stopped)1_2), Or(Not((stopped)1_1))),
Implies((stopped)1_3, Or((stopped)1_2)),
Implies(Not((stopped)1_3), Or(Not((stopped)1_2), (go)1_sta)),
Implies((stopped)1_0, Or((stopped)0_3)),
Implies(Not((stopped)1_0), Or(Not((stopped)0_3))),
Not((stopped)1_3)

You'll notice constraints like Implies((go)0_sta, Not((stopped)0_3)) and Implies((go)0_sta, (stopped)0_2), so the effects of the action seem to be on the last "layer" of the happening and the preconditions on the penultimate layer. Am I doing anything wrong here / is there any easy way to fix this? In my environment, this is allowing the agent to pass through obstacles sometimes because the collision event which should occur after a move action can't trigger in the same happening, and by the next happening the agent has moved past the obstacle's boundary (there's probably a way I can rewrite the environment to work around this, but it seemed to be a discrepancy between the code and the paper too).
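The observation above can be checked without a solver by brute-forcing the happening-0 constraints from the listing. This is a minimal sketch: the Boolean names `s0`..`s3` stand for (stopped)0_0..(stopped)0_3 and `go` for (go)0_sta, and only the constraints quoted in this comment are encoded.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


def models():
    """Enumerate assignments satisfying the happening-0 constraints quoted above."""
    found = []
    for go, s0, s1, s2, s3 in product([False, True], repeat=5):
        ok = all([
            s0,                                        # (stopped)0_0 (initial state)
            implies(go, not s3),                       # effect of go, on layer 3
            implies(go, s2),                           # precondition of go, on layer 2
            implies(s1, s0), implies(not s1, not s0),  # frame axioms, layer 0 to 1
            implies(s2, s1), implies(not s2, not s1),  # frame axioms, layer 1 to 2
            implies(s3, s2),                           # frame axiom, layer 2 to 3
            implies(not s3, (not s2) or go),           # only go can delete stopped
        ])
        if ok:
            found.append({"go": go, "stopped": (s0, s1, s2, s3)})
    return found


for model in models():
    print(model)
# {'go': False, 'stopped': (True, True, True, True)}
# {'go': True, 'stopped': (True, True, True, False)}
```

In the only model where go is applied, stopped changes between layers 2 and 3, so no later layer remains in the same happening for an event to react to the effect.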

m312z · 2022-07-22T08:54:21Z

As a quick check, if you increase the bound on the length of the event chain, is it still the final two "layers"? Just in case it is somehow hard-coded (wrongly) to be 2/3. I think this is a fairly easy fix if you can modify the source code. I'll take a look now.

m312z · 2022-07-22T09:04:40Z

Hello again,

The offending constraints are in this switch statement here:

SMTPlan/SMTPlan/src/EncoderHappening.cpp, line 999 in b085bd7:

case ENC_ACTION_CONDITION:

For at start, end, and overall cases you will see "opt->cascade_bound-2" used to specify the event layer in which the condition should be true.

From here:

SMTPlan/SMTPlan/src/EncoderHappening.cpp, line 1172 in b085bd7:

case ENC_SIMPLE_ACTION_EFFECT:

You can see a similar thing with action effects, using "opt->cascade_bound-1".

If you can't change these, then an alternative would be to modify the output directly (e.g. with a regex) which should be possible since the constraints all have the same form.
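The regex route could look roughly like the sketch below. It makes several assumptions: it works on the z3-printed form shown earlier in this thread (one constraint per string, trailing comma removed) rather than on SMTPlan's raw smtlib output, it handles only propositional constraints of exactly that shape, and `shift_action_layers` is a hypothetical helper. It moves preconditions to layer 0 and effects to layer 1, and moves the action "support" disjunct in the frame axioms from the last layer to layer 1 to match.

```python
import re

# Layered state variable as printed in the thread, e.g. "(stopped)0_3".
VAR = re.compile(r"\((?P<name>[\w-]+)\)(?P<h>\d+)_(?P<k>\d+)")
# Action constraint, e.g. "Implies((go)0_sta, ...)".
ACTION_LINE = re.compile(r"^Implies\(\([\w-]+\)\d+_sta, ")
# Frame axiom, e.g. "Implies(<head>, Or(<body>))".
FRAME_LINE = re.compile(r"^Implies\((?P<head>.+?), Or\((?P<body>.*)\)\)$")
# Action-start support inside a frame axiom body, e.g. ", (go)0_sta".
SUPPORT = re.compile(r", \([\w-]+\)\d+_sta")


def relayer(text, mapping):
    """Rewrite layer indices of state variables according to `mapping`."""
    def repl(m):
        k = int(m["k"])
        return f"({m['name']}){m['h']}_{mapping.get(k, k)}"
    return VAR.sub(repl, text)


def shift_action_layers(lines, bound):
    """Move action preconditions to layer 0 and effects to layer 1."""
    remap = {bound - 1: 0, bound: 1}
    moved = {}   # layer-1 frame axiom head -> supports taken from the last layer
    staged = []
    for line in lines:
        if ACTION_LINE.match(line):
            staged.append(relayer(line, remap))
            continue
        frame = FRAME_LINE.match(line)
        head_var = VAR.search(frame["head"]) if frame else None
        if head_var and int(head_var["k"]) == bound:
            supports = SUPPORT.findall(frame["body"])
            if supports:
                key = relayer(frame["head"], {bound: 1})
                moved.setdefault(key, []).extend(supports)
                body = SUPPORT.sub("", frame["body"])
                line = f"Implies({frame['head']}, Or({body}))"
        staged.append(line)
    out = []
    for line in staged:
        frame = FRAME_LINE.match(line)
        if frame and frame["head"] in moved:
            line = line[:-2] + "".join(moved[frame["head"]]) + "))"
        out.append(line)
    return out


example = [
    "Implies((go)0_sta, Not((stopped)0_3))",
    "Implies((go)0_sta, (stopped)0_2)",
    "Implies(Not((stopped)0_1), Or(Not((stopped)0_0)))",
    "Implies(Not((stopped)0_3), Or(Not((stopped)0_2), (go)0_sta))",
]
for constraint in shift_action_layers(example, bound=3):
    print(constraint)
```

Event supports, durative actions, and numeric effects are not handled here, so the rewritten constraints would need checking against the solver before being relied on.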
