Avoid copies when combining intra-process communication and shared memory transports #2203

Open
alsora opened this issue Jun 2, 2023 · 4 comments

@alsora
Collaborator

alsora commented Jun 2, 2023

Feature request

ROS 2 supports multiple communication modes.
Consider the following scenario, with two processes:

  • a process contains publisher A and subscriber X
  • a process contains subscriber Y

Intra-process communication allows the publisher to send the message to subscriber X without copies.
Similarly, if zero-copy shared memory transports are enabled, the publisher should be able to send the message to subscriber Y without performing any copy.

However, the mechanism breaks when both modes are used at the same time: inter-process and intra-process delivery currently need separate copies of the message.

Note that disabling intra-process communication is a suboptimal solution.
Intra-process communication is faster and uses less CPU than shared memory transport, even when no copies are involved.
Depending on the size of the message, the performance penalty from not using intra-process communication may be larger than the overhead of the copy that is needed when combining the two transport modes.
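
The size dependence can be made concrete with a back-of-the-envelope model. The sketch below assumes that the copy cost grows linearly with the message size and that each transport adds a fixed per-message overhead. `CostModel`, the function names, and every numeric value fed into them are hypothetical placeholders, not measurements from rclcpp or any RMW implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cost model; every field is an assumption, not a measured value.
struct CostModel {
  double intra_overhead_us;  // fixed per-message cost of intra-process delivery
  double shm_overhead_us;    // fixed per-message cost of shared-memory delivery
  double copy_us_per_byte;   // cost of copying one byte of payload
};

// A -> X through intra-process while A -> Y uses shared memory:
// the two modes need separate copies, so one payload copy is paid.
double cost_intra_plus_copy(const CostModel & m, std::size_t bytes) {
  return m.intra_overhead_us + m.copy_us_per_byte * static_cast<double>(bytes);
}

// A -> X through shared memory as well: no copy, but a higher fixed overhead.
double cost_shm_only(const CostModel & m, std::size_t /*bytes*/) {
  return m.shm_overhead_us;
}

// Message size above which disabling intra-process becomes the cheaper choice.
double break_even_bytes(const CostModel & m) {
  assert(m.copy_us_per_byte > 0.0);
  return (m.shm_overhead_us - m.intra_overhead_us) / m.copy_us_per_byte;
}
```

Under these assumptions, messages smaller than `break_even_bytes` are cheaper to deliver with intra-process plus one copy, and larger ones are cheaper with shared memory only. Real numbers would have to come from benchmarks on the target system.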

However, it should be possible to get the best of both worlds when all subscribers are only interested in read-only access to the message.

This would essentially require using the same loaned message for both inter-process and intra-process deliveries.
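
To illustrate the idea of one read-only loan serving both paths, here is a minimal toy model in plain C++. It does not use rclcpp or any RMW API: `LoanPool`, `freeze`, and `ToyPublisher` are hypothetical names, and the "inter-process" side is simulated with an in-memory queue. It only shows that read-only consumers can share a single buffer through `std::shared_ptr<const T>` and that the loan can be returned once the last reader is done.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy payload that counts how often it is copied.
struct Message {
  static int & copies() { static int n = 0; return n; }
  std::vector<unsigned char> data;
  Message() = default;
  Message(const Message & other) : data(other.data) { ++copies(); }
  Message & operator=(const Message &) = delete;
};

using ConstMessagePtr = std::shared_ptr<const Message>;

// Hypothetical stand-in for middleware-owned (e.g. shared-memory) storage.
// The pool must outlive every handle it has frozen.
class LoanPool {
public:
  // Hand out a writable message and track it as an outstanding loan.
  std::unique_ptr<Message> borrow() {
    ++outstanding_;
    return std::make_unique<Message>();
  }
  // Turn a filled loan into a read-only handle; the loan is returned to the
  // pool when the last reader drops its reference.
  ConstMessagePtr freeze(std::unique_ptr<Message> loan) {
    return ConstMessagePtr(loan.release(), [this](const Message * m) {
      delete m;
      --outstanding_;
    });
  }
  int outstanding() const { return outstanding_; }
private:
  int outstanding_ = 0;
};

// Delivers one frozen loan to every reader without copying the payload.
struct ToyPublisher {
  std::vector<std::function<void(ConstMessagePtr)>> intra_process_subs;  // subscriber X
  std::vector<ConstMessagePtr> inter_process_queue;  // stand-in for the transport to Y

  void publish(ConstMessagePtr msg) {
    inter_process_queue.push_back(msg);                  // keeps a reference, not a copy
    for (auto & cb : intra_process_subs) { cb(msg); }    // same buffer, read-only
  }
};
```

The `const` in `ConstMessagePtr` is what makes the sharing safe in this sketch: no reader can mutate the buffer that the other path is still using. A subscriber that needs write access would still require its own copy.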

@fujitatomoya
Collaborator

@alsora thanks for the summary.

I think this would also eliminate double buffering of the message; it would be really nice to have this enhancement.

> This would require to essentially use the same loaned message for both inter and intra process deliveries.

IMO, there are a few options here:

  • Rely on the RMW implementation for both intra-process and inter-process message delivery. Compared to the current intra-process communication, this adds overhead and more software layers. In exchange, we could take advantage of full functionality such as QoS, and the design would be clearer and cheaper to maintain.
  • Keep intra-process communication in rclcpp and use LoanedMessage. I am not sure if this is doable. Alternatively, allow the user application (rclcpp) to loan the message to the RMW implementation? To be honest, I am not sure how this should be implemented at this moment, but I am happy to discuss.

@alsora
Collaborator Author

alsora commented Jun 2, 2023

When dealing with communication between two entities in the same process, we have multiple communication modes, ordered here from the most efficient to the least efficient:

  • rclcpp intra-process
  • rmw intra-process
  • rmw shared memory

My understanding is that, currently, the only way to avoid the copy of the message is to use rmw shared memory between all pubs and subs. The rmw intra-process would have the same limitation as the rclcpp intra-process in this context. (This is different from the issue described in #2202, which would be solved by using rmw intra-process.)

@fujitatomoya
Collaborator

> My understanding is that currently the only way to avoid the copy of the message is to use rmw shared memory between all pubs and subs

AFAIK, Fast-DDS can already support this using LoanedMessage without the ROS 2 intra-process option? In this case, I think it can support QoS as well. What do you mean by "the same limitation" here?

CC: @MiguelCompany

@alsora
Collaborator Author

alsora commented Jun 3, 2023

The only problem is that using loaned messages between pubs and subs in the same process is less efficient than using intra-process optimization mechanisms.

  • a process contains publisher A and subscriber X
  • a process contains subscriber Y

Assuming that the communication A -> Y is done through shared memory transport, we have the following options:

  1. communication A -> X is done through rclcpp intra-process
  2. communication A -> X is done through rmw intra-process
  3. communication A -> X is done through shared memory transport

Options 1) and 2) result in an extra copy of the message, while option 3) has additional overhead due to the use of the loaned message APIs.
As shown in #1642 (comment), loaned messages have approximately 4 times the latency of rclcpp intra-process.
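
As a rough illustration of where the extra copy in options 1) and 2) comes from, the toy model below hands a heap-allocated message to an in-process queue and duplicates it into a separate buffer for the other process. This is a simplified reading of the problem, not the actual rclcpp or RMW code path: `SharedSegment`, `IntraProcessQueue`, and `publish_both` are hypothetical names, and the shared-memory segment is simulated with an ordinary vector.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Message {
  std::vector<unsigned char> data;
};

// Hypothetical stand-in for a shared-memory segment readable by other processes.
struct SharedSegment {
  std::vector<Message> slots;
  int copies_in = 0;
  // The payload lives in process-private memory, so it is duplicated here.
  void write(const Message & msg) {
    slots.push_back(msg);
    ++copies_in;
  }
};

// Hypothetical intra-process queue: takes ownership of the heap message.
struct IntraProcessQueue {
  std::vector<std::unique_ptr<Message>> items;
  void push(std::unique_ptr<Message> msg) { items.push_back(std::move(msg)); }
};

// Subscriber X receives the original heap buffer, subscriber Y receives a copy.
void publish_both(std::unique_ptr<Message> msg, IntraProcessQueue & x, SharedSegment & y) {
  y.write(*msg);            // extra copy for the inter-process path
  x.push(std::move(msg));   // zero-copy hand-off for the intra-process path
}
```

In this model the copy is unavoidable because the two paths expect the payload in different memory: one in the process heap, the other in the segment.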

IMO all three options above are currently suboptimal.

To improve the situation we can either:

  1. allow rclcpp intra-process and shared memory to coexist without extra copies
  2. improve the performance of the shared memory transport when used within a single process.


2 participants