Handle on waiting #1483

Timple · 2024-04-16T09:16:03Z

That was easier than anticipated. Fixes #1482 and #1200.

Timple · 2024-04-16T09:23:14Z

Note: I didn't touch the other except KeyboardInterrupt: statement, as it does work when controllers are successfully loaded.

christophfroehlich · 2024-04-16T09:39:51Z

Does it make sense to add this to the unspawner script as well?

Timple · 2024-04-16T12:07:52Z

Good question! That doesn't have the same check in it.

Come to think of it, asking for permission instead of forgiveness is an antipattern (and actually breaks stuff: #1200).

I would therefore opt for modifying this PR to align spawner and unspawner so that both retry-and-log instead of doing the more expensive check-and-try.
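To illustrate the difference between check-and-try and retry-and-log, here is a minimal Python sketch. It is not the spawner's actual code: `ServiceUnavailable`, `call_with_retry`, and the `call` argument are hypothetical names chosen for illustration, and the real scripts talk to ROS 2 services rather than plain callables.

```python
import logging
import time

logger = logging.getLogger("spawner_sketch")


class ServiceUnavailable(Exception):
    """Hypothetical error raised when the service cannot be reached."""


def call_with_retry(call, retries=3, delay=0.0):
    """Attempt the call directly; on failure, log a warning and retry.

    No separate "is the service there?" check is made beforehand,
    so there is no window between checking and calling.
    """
    for attempt in range(1, retries + 1):
        try:
            return call()
        except ServiceUnavailable as exc:
            logger.warning("Attempt %d/%d failed: %s", attempt, retries, exc)
            time.sleep(delay)
    raise ServiceUnavailable(f"gave up after {retries} attempts")
```

The point of the pattern is that the failure path of the real call is the only place where unavailability is handled, so spawner and unspawner could share the same helper.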

christophfroehlich

Thanks! LGTM and tested with/without running CM.

Would you mind also adding hardware_spawner to the helper scripts section in userdoc.rst, as you have already fixed the arguments of the other scripts there?

Edit: the tests are failing now, could you also have a look there please? (See the RHEL/Debian workflows.)

controller_manager/doc/userdoc.rst

Timple · 2024-04-19T08:35:31Z

I need some assistance here. CI is failing on:

    Could not find a package configuration file provided by "test_msgs" with
    any of the following names:
  
      test_msgsConfig.cmake

But I didn't change anything related to test_msgs here?

christophfroehlich · 2024-04-19T20:12:03Z

> But I didn't change anything related to test_msgs here?

This was an issue with the rolling CI due to the transition to Ubuntu Noble. The failures related to your changes are:

    [ RUN      ] TestLoadController.spawner_without_manager_errors
    ...
    [WARN] [1713556932.603576757] [spawner_ctrl_1]: Could not contact service /controller_manager/list_controllers
    [WARN] [1713556942.609051609] [spawner_ctrl_1]: Could not contact service /controller_manager/list_controllers
    [WARN] [1713556952.614319281] [spawner_ctrl_1]: Could not contact service /controller_manager/list_controllers
    [WARN] [1713556962.619689181] [spawner_ctrl_1]: Could not contact service /controller_manager/list_controllers
    [WARN] [1713556972.625026280] [spawner_ctrl_1]: Could not contact service /controller_manager/list_controllers

fmauch · 2024-04-23T11:00:42Z

I didn't have a massive amount of time to look into things over the last couple of days, but I wanted to finally fix #1182 today. My result is in #1501. This PR basically solves the first problem from #1182 as well, so +1 from me. But seeing the

Could not contact service /controller_manager/list_controllers

indicates that there is still something missing.

Edit: Actually, looking at the test, it is obvious why this fails: this PR explicitly changes the spawner not to fail if it can't find the CM, while the test expects it to fail. If that's the desired behavior, that test probably simply has to be removed.

Timple · 2024-04-23T17:56:04Z

I dropped the test, as it was testing whether false inputs gave no result. It could be rewritten to check that a warning is issued, but that would mean parsing stdout or stderr in tests, which is also a bit frowned upon.

fmauch

The changes look good and I manually tested them in my setup.

If the removal of the controller_manager_timeout parameter is acceptable, I think this is actually a good solution. However, this is a breaking change, so now would be the best time to merge this.

Porting this back to Iron & Humble isn't that straightforward. We could use my implementation from #1501 for that, where I also removed the wait for controller_manager functionality but added the controller_manager_timeout for the first service call. Or we do indeed remove the functionality and print a deprecation warning on those if the timeout is not the default value.

fmauch · 2024-04-24T18:11:43Z

After discussing this in the working group today, we agreed on the following things:

1. The spawners should allow setting a definite timeout in order to make launchfiles actually fail if there's something wrong. Regarding the initial service wait, we could use my proposal from e25ea4f. With that, users can still define a timeout of 0 in order to make it wait indefinitely. This way this PR would also be directly backportable to Iron and Humble.
2. We'll first merge this and then I'll adjust Robustify spawner #1501 to only contain the missing changes.
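The timeout semantics agreed on above (a definite timeout by default, with 0 meaning "wait indefinitely") could look roughly like the following sketch. This is an assumption-laden illustration, not the code from e25ea4f: `wait_until_ready`, `is_ready`, `poll_interval`, and the injectable `clock`/`sleep` arguments are hypothetical, and the 10 second default simply mirrors the value mentioned in this thread.

```python
import time


def wait_until_ready(is_ready, timeout=10.0, poll_interval=0.1,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or the timeout expires.

    A timeout of 0 (or less) disables the deadline, i.e. wait forever.
    Returns True if ready, False if the deadline passed first.
    """
    deadline = None if timeout <= 0 else clock() + timeout
    while not is_ready():
        if deadline is not None and clock() >= deadline:
            return False
        sleep(poll_interval)
    return True
```

With this shape, the debate about defaults reduces to the default value of `timeout`: a positive number makes launch fail eventually, while 0 keeps retrying.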

@Timple Would you like to update this PR according to the first point please?

Timple · 2024-04-24T19:09:28Z

Regarding backporting, keeping the original timeout with the option of setting it to wait indefinitely makes sense!

However, I strongly opt for this to be the default in future releases. We've had flaky simulation and hardware startups because the default timeout was close to the boot time of the nodes. Flaky behavior seems the worst of all possible scenarios.
Would this be alright?

fmauch · 2024-04-24T19:23:56Z

I think @bmagyar might have an opinion on that.

In the end, that boils down to the default value of the timeout. I agree that a default timeout larger than 10 seconds might make a lot of sense, but in the spirit of making it fail by default and not leaving launch files hanging indefinitely, it might be good to still have one.

Timple · 2024-04-24T19:47:18Z

Yes, but defaults are very important.
I fail to grasp how a node printing once and failing in the startup noise is a clearer message than an error/warning being repeatedly printed.

The only scenario I can think of is some kind of catch mechanism that restarts/reboots. But very likely the result will be exactly the same, with debugging much harder as everything keeps restarting and printing a lot.

If I'm missing a scenario here, or proper scenarios were discussed at the working group, let me know!

fmauch · 2024-04-25T10:15:45Z

One scenario is having launchfiles in an autostart job, e.g. a systemd unit where the shell output will never be seen by the user. In this case a failed unit is much more obvious to find than a unit repeating a log output.
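For the autostart case, what matters is the process exit status: by default, systemd marks a service as failed when its main process exits with a non-zero code. A minimal sketch of how a timed-out wait could be turned into such an exit code is shown below. The names `main` and `wait_for_manager` are hypothetical and do not come from the spawner script.

```python
import sys


def main(wait_for_manager, timeout=10.0):
    """Return a process exit code: 0 on success, 1 if the wait timed out.

    wait_for_manager is a hypothetical callable that takes the timeout
    in seconds and returns True once the controller manager is reachable.
    """
    if not wait_for_manager(timeout):
        print("Controller manager not available, giving up", file=sys.stderr)
        return 1
    return 0
```

A script entry point would then typically end with `sys.exit(main(...))`, so that a supervisor sees the failure without anyone having to read the logs.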

Timple · 2024-04-25T15:29:57Z

That's a valid use case, if the logs of the autostart jobs are evaluated before the ROS logs.

Still, I'd rather treat that specific case as the exception than have flaky simulation and hardware startups.


destogl · 2024-05-08T17:50:28Z

We should keep the timeout as it was and just add a new option here.

mergify · 2024-06-05T17:26:28Z

This pull request is in conflict. Could you fix it @Timple?

bmagyar · 2024-06-05T17:36:54Z

Closing in favour of #1562

github-actions bot requested review from bijoua29, bmagyar, destogl, erickisos, fmauch, jaron-l and VX792 April 16, 2024 09:16

christophfroehlich requested changes Apr 16, 2024


christophfroehlich reviewed Apr 16, 2024

controller_manager/doc/userdoc.rst (outdated)

saikishor mentioned this pull request Apr 19, 2024

Handling of exclusive command interfaces #1487

Open

Timple force-pushed the fix/spawner-interrupt branch from 121a8a5 to 1941427 Compare April 22, 2024 12:54

fmauch mentioned this pull request Apr 23, 2024

Robustify spawner #1501

Merged

fmauch approved these changes Apr 23, 2024


Timple added 4 commits April 29, 2024 08:30

Handle on waiting

f1f86ef

fixup! Handle on waiting

e20e1e7

Ask for forgiveness, not permission

db9a1cd

Aligns spawner and unspawner logic

fixup! Ask for forgiveness, not permission Aligns spawner and unspawn…

ea1028d

…er logic

Timple and others added 4 commits April 29, 2024 08:30

Update docs on hardware_spawner

b59dd46

Update controller_manager/doc/userdoc.rst

be27c5a

Co-authored-by: Christoph Fröhlich <[email protected]>

Drop test which relied on timeout

18d1652

Bring back --controller-manager-timeout option

7c2cdcc

Timple force-pushed the fix/spawner-interrupt branch from 9368265 to 7c2cdcc Compare April 29, 2024 07:12

fixup! Bring back --controller-manager-timeout option

afa4101

VinDp mentioned this pull request Jun 5, 2024

UR16e bringup in rviz2 laying down UniversalRobots/Universal_Robots_ROS2_Driver#1015

Open

1 task

bmagyar approved these changes Jun 5, 2024


christophfroehlich approved these changes Jun 5, 2024


bmagyar mentioned this pull request Jun 5, 2024

Handle on waiting #1562

Merged

bmagyar closed this Jun 5, 2024

This was referenced Aug 14, 2024

Handle on waiting (backport #1562) #1680

Merged

Handle on waiting (backport #1562) #1681

Merged

Robustify spawner (backport #1501) #1686

Merged

Robustify spawner (backport #1501) #1687

Merged
