Reused shared memory with PREALLOCATED_WITH_REALLOC
#783
These failures can be explained by the documentation in eProsima's datasharing-delivery-constraints:
In the failure My question is, should the subscription/reader log some warning in this case? Like some QoS mismatch event: The goal would be for the user to somehow know that, if they want shared memory to be handled safely (i.e. stored messages won't be silently overwritten):
Maybe this makes sense to just be mentioned on the ROS2 documentation about shared memory & loaned messages. |
Currently the user can still ask for loaned messages even if all the shared memory has been exhausted, for example when previously loaned messages haven't been returned (either because the user holds copies, or because subscriptions hold them in their buffers). In this case, the user is given memory that is still in use, which silently overwrites messages stored by the user or the subscription. I think the right behavior would be for the publisher to throw when the user requests a loan and memory has been exhausted. What do you think? |
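The "throw when exhausted" behavior proposed above can be sketched with a toy pool. This is only an illustration under stated assumptions: `LoanPool`, `borrow` and `give_back` are hypothetical names, and the pool merely stands in for a preallocated shared-memory history; it is not rmw_fastrtps or Fast DDS code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical fixed-size pool standing in for a preallocated payload history.
class LoanPool {
public:
  explicit LoanPool(std::size_t slots) : in_use_(slots, false) {}

  // Returns the index of a free slot, or throws if every slot is still on loan.
  std::size_t borrow() {
    for (std::size_t i = 0; i < in_use_.size(); ++i) {
      if (!in_use_[i]) {
        in_use_[i] = true;
        return i;
      }
    }
    throw std::runtime_error("loan pool exhausted: all payloads still in use");
  }

  // Marks a slot as free again; unknown or already free slots are rejected.
  void give_back(std::size_t slot) {
    if (slot >= in_use_.size() || !in_use_[slot]) {
      throw std::invalid_argument("slot is not currently on loan");
    }
    in_use_[slot] = false;
  }

private:
  std::vector<bool> in_use_;
};
```

With a pool of two slots, a third `borrow()` throws instead of handing out memory that is still in use, and the next `borrow()` after a `give_back()` succeeds again.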
IMO, sounds reasonable if they are doing data-sharing. and this is probably request for Fast-DDS, and we (ROS 2) can catch the QoS incompatibility event from rmw implementation?
IMO this depends on the application requirement.
|
@mauropasse Currently, the only way to make the publisher block until the shared payload is no longer in use is to use RELIABLE and KEEP_ALL. You can also increase One thing that could be done is adding a warning when a sample is marked for reuse without having been acknowledged. There is this callback in the DataWriterListener that could be of use.
The problem with this is that HistoryQos is not transmitted in discovery. |
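To make the "warn when a sample is reused without being acknowledged" idea concrete, here is a minimal sketch of a KEEP_LAST style history that reports evicted, unacknowledged samples through a callback. `ToyWriterHistory` and its members are hypothetical names for illustration only; the real Fast DDS history and listener work differently in detail.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Toy KEEP_LAST writer history; an illustration, not Fast DDS code.
class ToyWriterHistory {
public:
  using Callback = std::function<void(std::uint64_t)>;

  // A depth of 0 is treated as "unbounded" in this sketch.
  ToyWriterHistory(std::size_t depth, Callback on_unacked_removed)
  : depth_(depth), on_unacked_removed_(std::move(on_unacked_removed)) {}

  // Adds a sample; when the history is full, the oldest sample is evicted first
  // and the callback fires if that sample was never acknowledged.
  void write(std::uint64_t sequence_number) {
    if (depth_ > 0 && samples_.size() == depth_) {
      const Sample oldest = samples_.front();
      samples_.pop_front();
      if (!oldest.acked && on_unacked_removed_) {
        on_unacked_removed_(oldest.sequence_number);
      }
    }
    samples_.push_back(Sample{sequence_number, false});
  }

  // Marks a sample as acknowledged, if it is still in the history.
  void acknowledge(std::uint64_t sequence_number) {
    for (Sample & s : samples_) {
      if (s.sequence_number == sequence_number) {
        s.acked = true;
      }
    }
  }

private:
  struct Sample {
    std::uint64_t sequence_number;
    bool acked;
  };

  std::size_t depth_;
  Callback on_unacked_removed_;
  std::deque<Sample> samples_;
};
```

Note that in this model the callback can only fire at eviction time, so any warning derived from it arrives when the payload is already about to be reused.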
Great information @MiguelCompany. I'll explore the options! |
@MiguelCompany I tested these scenarios based on your suggestions:
If the subscription is not spinning but it receives the loaned messages, I guess the ACK is still sent in DDS? So the |
No dynamic allocations here. The only possible issue is that, since the publisher is blocked until the oldest sample is acknowledged (or
That is awkward. Was the reader also |
Yes. So this is the scenario just to clarify:
When is the ACK usually sent by the subscription? When it is executed, or when the message is received in the DataReader? The only case in which I managed to see this callback in action (also with the logs you added in the commit) is setting:
|
Since my test is single process, the send_datasharing_ack() is not called, but I'm unsure if is the same ACK sender that causes the callback |
I expanded the test to check multi-process behavior (pub/sub in different processes):
So, in most cases, the A different approach to verify if memory is still in use could be:
Could logic be implemented to track this? What do you think @MiguelCompany |
@mauropasse thank you for checking this, for us this is really interesting topic!
this is because that ACK is sent to the corresponding publisher when the message is taken on the subscription? right?
for me, this looks like okay by design of i think we need the Just FYI, with previous comment on #783 (comment)
I do not think this is gonna communicate at all...
DDS detected the incompatible QoS, and no messages are delivered to this concerned subscription. because DDS cannot guarantee RELIABLE durability on this subscription. |
@mauropasse @fujitatomoya This is in fact getting interesting...
@fujitatomoya you nailed it! It seems Fast DDS is acknowledging the sample when taking it. So the following table is a summary on when samples are acknowledged
I think it would be better to extend the current mechanism and make it possible to acknowledge the sample when the loan is returned. This way we could use Either that or expose
|
I thought of another possibility. The user could be warned when an item is added to the LoanManager, and the address of the sample already exists there: rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_take.cpp Lines 534 to 539 in 9d2150f
|
as Fast-DDS, i think this sounds more robust and reliable? is there any side effect for this?
IMO, either way (above or below) we take,
this sounds reasonable in ROS 2 RMW implementation. |
I've been testing the approaches using
So we have then,
In both cases, the warnings happen after the messages have been overwritten.
--- a/rmw_fastrtps_shared_cpp/include/rmw_fastrtps_shared_cpp/custom_publisher_info.hpp
+++ b/rmw_fastrtps_shared_cpp/include/rmw_fastrtps_shared_cpp/custom_publisher_info.hpp
+ void on_unacknowledged_sample_removed(
+ eprosima::fastdds::dds::DataWriter* datawriter,
+ const eprosima::fastdds::dds::InstanceHandle_t& handle) override
+ {
+ RCUTILS_LOG_WARN_NAMED(
+ "rmw_fastrtps_shared_cpp",
+ "A shared message held by a subscription has been overridden.");
+ }
+
private:
RMWPublisherEvent * publisher_event_;
};
--- a/rmw_fastrtps_shared_cpp/src/rmw_take.cpp
+++ b/rmw_fastrtps_shared_cpp/src/rmw_take.cpp
void add_item(std::unique_ptr<Item> item)
{
std::lock_guard<std::mutex> guard(mtx);
+ // Check if the new item already exists in the list
+ for (const auto& existing_item : items) {
+ if (existing_item->data_seq.buffer()[0] == item->data_seq.buffer()[0]) {
+ RCUTILS_LOG_WARN_NAMED(
+ "rmw_fastrtps_shared_cpp",
+ "Subscription received a message still held by the user (which was overridden)");
+ }
+ }
|
@mauropasse no major objections, i think this is better for user application. one concern is that when we are using the loaned messages, usually what we care most is performance. adding the extra checking to crawl through the |
@fujitatomoya I think it wouldn't hurt performance much. Only in an (uncommon?) situation might there be some CPU usage: when a non-spinning subscription (with many stored messages) starts spinning, but the user doesn't discard the messages. For each The overhead then comes from iterating over the messages that the user has kept outside of the subscription callback when a new message is processed. So I think in most situations it would be iterating over an empty list? On another topic, what worries me is that we still don't have a warning for the single-process case where a subscription keeps duplicated and overwritten messages. |
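If the linear scan ever became a concern, one possible alternative is to track the loaned buffer addresses in a hash set, so the duplicate check is a single lookup on average. The sketch below is an assumption, not part of the proposed patch: `LoanedAddressTracker` is a hypothetical helper, and wiring it into the LoanManager is left out.

```cpp
#include <mutex>
#include <unordered_set>

// Hypothetical tracker of buffer addresses currently loaned to the user.
class LoanedAddressTracker {
public:
  // Returns false if the address is already tracked, i.e. the same buffer
  // has been handed out again while an earlier loan is still outstanding.
  bool add(const void * address) {
    std::lock_guard<std::mutex> guard(mtx_);
    return addresses_.insert(address).second;
  }

  // Stops tracking an address when its loan is returned;
  // returns whether the address was being tracked.
  bool remove(const void * address) {
    std::lock_guard<std::mutex> guard(mtx_);
    return addresses_.erase(address) > 0;
  }

private:
  std::mutex mtx_;
  std::unordered_set<const void *> addresses_;
};
```

A caller would log the warning whenever `add()` returns false. The trade-off is a second container that has to stay in sync with the list of loaned items.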
@mauropasse yeah probably the overhead can be ignored mostly when the application returns the memory to the middleware.
is this really configurable with Fast-DDS? i mean History Object is coupled between DataWriter and DataReader, that means i think subscription depth is also managed with |
On the PR ros2/rclcpp#2624 I provide a fix to allow the user to make copies of loaned messages, keeping them beyond the subscription callback scope. With the fix, the loaned message is returned to the DDS when the user's copy of the message goes out of scope.
In the PR I add a unit test to verify that memory is not reused until all entities have returned the loan.
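The idea of returning the loan only when the last user copy goes out of scope can be illustrated with a `std::shared_ptr` and a custom deleter. This is a generic sketch, not the code in the PR: `Message`, `wrap_loan` and the `return_loan` callable are hypothetical names.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical message type; stands in for a loaned ROS message.
struct Message {
  int value;
};

// Wraps a loaned buffer so that return_loan runs exactly once,
// when the last shared_ptr copy is destroyed or reset.
// return_loan must be a non-empty callable.
inline std::shared_ptr<const Message> wrap_loan(
  const Message * loaned, std::function<void(const Message *)> return_loan)
{
  return std::shared_ptr<const Message>(loaned, std::move(return_loan));
}
```

Copies made inside a subscription callback share ownership, so the middleware would only get the buffer back after every copy has been dropped.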
The tests fully pass when the history memory policy is set to DYNAMIC_REUSABLE, but fail when it is set to PREALLOCATED_WITH_REALLOC:
The failures mean that a user is provided with loaned memory which has been previously provided and is still in use:
While a temporary solution would be to only use DYNAMIC_REUSABLE on shared memory, this mode has the issue that an undesired copy is performed on a multi-process system, as I describe in #782
So in short:
Do you think this is a bug, or does the use of PREALLOCATED_WITH_REALLOC inherently break proper shared memory management?