Orchestrator_examples/ring_test.xml hangs #324

Open
m8pple opened this issue Jul 13, 2022 · 2 comments

Comments

@m8pple
Contributor

m8pple commented Jul 13, 2022

When I run Orchestrator_examples/ring_test.xml, it compiles and loads correctly, then appears to hang.

I'm assuming (?) this should work, as it is the example from the Orch_VolII documentation. I also tried Orchestrator_examples/ping_pong_test.xml, and that worked fine.

dt10@byron:~/Orchestrator$ Tests/ReferenceXML/run_app_standard_outputs.exp /home/dt10/Orchestrator_examples/ring_test.xml  60
Relative xml path = /home/dt10/Orchestrator_examples/ring_test.xml
Absolute xml path = /home/dt10/Orchestrator_examples/ring_test.xml
spawn sh -c /home/dt10/Orchestrator/Tests/ReferenceXML/../../orchestrate.sh 2>&1
POETS>load /app = "/home/dt10/Orchestrator_examples/ring_test.xml"
POETS> 21:58:28.01:  20(I) The microlog for the command 'load /engine = "../Config/POETSHardwareOneBox.ocfg"' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_28p0.plog'.
POETS> 21:58:28.02: 140(I) Topology loaded from file ||../Config/POETSHardwareOneBox.ocfg||.
POETS> 21:58:28.02:  23(I) load /app = "/home/dt10/Orchestrator_examples/ring_test.xml"
POETS> 21:58:28.02:  20(I) The microlog for the command 'load /app = "/home/dt10/Orchestrator_examples/ring_test.xml"' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_28p1.plog'.
POETS> 21:58:28.02: 235(I) Application file /home/dt10/Orchestrator_examples/ring_test.xml loading...
POETS> 21:58:28.02:  65(I) Application file /home/dt10/Orchestrator_examples/ring_test.xml loaded in 19 ms.
POETS>tlink /app = *
place /tfill = *
POETS>POETS> 21:58:28.02:  23(I) tlink /app = *
POETS> 21:58:28.02:  20(I) The microlog for the command 'tlink /app = *' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_28p2.plog'.
POETS> 21:58:28.02: 234(I) Typelinking graph instance 'ring_test_instance'...
POETS> 21:58:28.02: 249(I) Successfully typelinked graph instance 'ring_test_instance'.
POETS> 21:58:28.02:  23(I) place /tfill = *
POETS> 21:58:28.02:  20(I) The microlog for the command 'place /tfill = *' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_28p3.plog'.
POETS> 21:58:28.02: 309(I) Attempting to place graph instance 'ring_test_instance' using the 'tfil' method...
POETS> 21:58:28.02: 302(I) Graph instance 'ring_test_instance' placed successfully.
POETS>compose /app = *

POETS>POETS> 21:58:29.03:  23(I) compose /app = *
POETS> 21:58:29.03:  20(I) The microlog for the command 'compose /app = *' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_28p4.plog'.
POETS> 21:58:29.03: 803(I) Composing graph instance 'ring_test_instance'...
POETS> 21:58:29.03: 804(I) Graph instance 'ring_test_instance' composed successfully.
POETS>Graph appears to have loaded and compiled!

Waiting for 15.0 seconds to let HostLink init
deploy /app = *

initialise /app = *

run /app = *

POETS>POETS>POETS>POETS>POETS> 21:58:44.03:  23(I) deploy /app = *
POETS>POETS> 21:58:44.03:  20(I) The microlog for the command 'deploy /app = *' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_44p0.plog'.
POETS> 21:58:44.03: 184(I) Deployment of graph instance 'ring_test_instance' staged. Waiting for Mothership(s) to acknowledge receipt in the background.
POETS> 21:58:44.03:  23(I) initialise /app = *
POETS> 21:58:44.03:  20(I) The microlog for the command 'initialise /app = *' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_44p1.plog'.
POETS> 21:58:44.03: 187(I) Initialisation of graph instance 'ring_test_instance' staged. Waiting for Mothership(s) to acknowledge receipt in the background.
POETS> 21:58:44.03:  23(I) run /app = *
POETS> 21:58:44.03:  20(I) The microlog for the command 'run /app = *' will be written to '../Output/Microlog/Microlog_2022_07_13T21_58_44p2.plog'.
POETS> 21:58:44.03: 188(I) Run of graph instance 'ring_test_instance' staged. Waiting for Mothership(s) to acknowledge receipt in the background.
POETS> 21:58:44.03: 529(I) Mothership (rank 2): Deployment of application 'ring_test::ring_test_instance' (to this Mothership) complete.
POETS> 21:58:44.03: 186(I) Application 'ring_test::ring_test_instance' successfully deployed on all Motherships it is mapped to.
POETS> 21:58:44.03: 530(I) Mothership (rank 2): Initialising fully-defined application 'ring_test::ring_test_instance'.
POETS> 21:58:44.09: 531(I) Mothership (rank 2): Initialisation of application 'ring_test::ring_test_instance' (to this Mothership) complete.
POETS> 21:58:44.09: 186(I) Application 'ring_test::ring_test_instance' ready to start on all Motherships it is mapped to.
POETS> 21:58:44.09: 532(I) Mothership (rank 2): Starting (running) fully-initialised application 'ring_test::ring_test_instance'.
POETS> 21:58:44.09: 186(I) Application 'ring_test::ring_test_instance' running on all Motherships it is mapped to.
POETS>
STATS_fba956f3: load:0.009813, place:0.001493, compile:0.584081, run:60.070602

Timeout while running app, timeout=60.

Orchestrator version:

commit a2af253a39cabed7e401e062556949e51b330741 (HEAD -> development, origin/development, origin/HEAD)
Merge: e61a6f0 8edaf17
Author: Mark Vousden <[email protected]>
Date:   Wed Jul 6 11:29:13 2022 +0100

    Merge pull request #318 from POETSII/BUGFIX-0317-onsystpingack-warning

Orchestrator_examples:

commit 0b6d2d319aa716c4dcb875df2eb303ab14e20503 (HEAD -> development)
Merge: 6d7e7ad 3ad3ee0
Author: Graeme Bragg <[email protected]>
Date:   Thu Dec 16 03:53:04 2021 +0000

    Merge pull request #14 from POETSII/BUGFIX-0231-typenames
@m8pple
Contributor Author

m8pple commented Jul 13, 2022

It looks to me like the logic is slightly wrong, as we have the ReadyToSend in the devices of:

https://github.com/POETSII/Orchestrator_examples/blob/3a3658b38179f1389c26a499538a82d7a6683c5a/ring_test.xml#L111-L115

That will cause either the sender pin to fire or the SupervisorOutPin to fire. Both of
those handlers finish with DEVICESTATE(sendMessage) = 0;. So either a device
message is sent along the ring and one to the supervisor is lost, or vice-versa.

The supervisor has count-down logic to check that all expected messages are received, so
if any messages are lost then it will hang. However, if any messages are sent
to the supervisor then the token in the ring is lost, which will also cause it to hang.
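As a rough illustration of why a single lost message would stall the run, here is a minimal Python model of count-down completion logic. It is only a sketch: the `Supervisor` class, its method names, and the device count of 4 are hypothetical and are not taken from ring_test.xml.

```python
class Supervisor:
    """Toy model of a supervisor that counts down expected messages."""

    def __init__(self, expected: int) -> None:
        # Number of messages that must still arrive before completion.
        self.remaining = expected

    def on_receive(self) -> None:
        # Each received message decrements the counter, never below zero.
        if self.remaining > 0:
            self.remaining -= 1

    @property
    def done(self) -> bool:
        # Completion is only signalled once every expected message arrived.
        return self.remaining == 0


def run(expected: int, delivered: int) -> bool:
    """Deliver `delivered` messages and report whether the supervisor finished."""
    supervisor = Supervisor(expected)
    for _ in range(delivered):
        supervisor.on_receive()
    return supervisor.done


all_arrived = run(expected=4, delivered=4)
one_lost = run(expected=4, delivered=3)
print(all_arrived, one_lost)
```

In this model, the second case never reaches `done`, which would look like a hang until the external timeout fires.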

I might be missing something though - is this example expected to work, or is it for
documentation?

@mvousden
Contributor

mvousden commented Jul 15, 2022

I haven't dived particularly deep into this one, but it looks like you're using an out-of-date example: Orchestrator_examples at 0b6d2d3 is from seven months ago, whereas 3a3658b (development HEAD) is more recent.

> It looks to me like the logic is slightly wrong, as we have the ReadyToSend in the devices of:
>
> https://github.com/POETSII/Orchestrator_examples/blob/3a3658b38179f1389c26a499538a82d7a6683c5a/ring_test.xml#L111-L115
>
> That will cause either the sender pin to fire or the SupervisorOutPin to fire. Both of those handlers finish with DEVICESTATE(sendMessage) = 0;. So either a device message is sent along the ring and one to the supervisor is lost, or vice-versa.

This is not correct - the RTS "bits" are not reset after every invocation of the ReadyToSend logic. What happens is:

  1. ReadyToSend code is invoked, and the RTS bit for the sender pin, and the RTS bit for the supervisor output pin, are both set high.
  2. The softswitch invokes the sender output pin code first [1]. The lap field of the outbound packet is populated, and the sendMessage field of device state is set to zero.
  3. This packet is sent, as per the softswitch's sending mechanism.
  4. The softswitch invokes the SupervisorOutPin code, populating the sourceId and lap fields of the next outbound packet, and setting the sendMessage field of device state to zero again.
  5. This packet is sent, as per step 3.
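The difference between the two readings can be sketched with a small Python model. This is not the softswitch implementation: the `Device` class, the pin constants, and both `run_*` functions are hypothetical names used only to contrast RTS bits that stay set until each pin is serviced (the steps above) with RTS being recomputed after every send (the reading in the original report).

```python
from dataclasses import dataclass, field

# Hypothetical pin indices, used only within this model.
SENDER_PIN = 0
SUPERVISOR_PIN = 1


@dataclass
class Device:
    send_message: int = 1  # stands in for DEVICESTATE(sendMessage)
    lap: int = 0
    sent: list = field(default_factory=list)

    def ready_to_send(self) -> set:
        """Return the pins whose RTS bit should be raised."""
        if self.send_message:
            return {SENDER_PIN, SUPERVISOR_PIN}
        return set()

    def on_send(self, pin: int) -> None:
        """Send handler for either pin; both end by clearing send_message."""
        target = "ring" if pin == SENDER_PIN else "supervisor"
        self.sent.append((target, self.lap))
        self.send_message = 0


def run_latched(device: Device) -> list:
    """RTS bits are set once and stay set until each pin has been serviced."""
    rts = device.ready_to_send()  # both bits go high
    for pin in sorted(rts):       # servicing order is arbitrary in this model
        device.on_send(pin)       # clearing send_message does not clear rts
    return device.sent


def run_reevaluated(device: Device) -> list:
    """RTS is recomputed after every send, so the second send never happens."""
    while True:
        rts = device.ready_to_send()
        if not rts:
            break
        device.on_send(min(rts))
    return device.sent


latched = run_latched(Device())
reevaluated = run_reevaluated(Device())
print(latched)
print(reevaluated)
```

Under the latched model both the ring packet and the supervisor packet are sent, even though each handler clears `send_message`. Under the re-evaluated model only the first one goes out.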

@heliosfa thoughts?

> I might be missing something though - is this example expected to work, or is it for documentation?

Yes, it is supposed to work ;p

[1]: Strictly speaking, the order is undefined according to the Softswitch documentation, but the order doesn't matter in this case.
