Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

break Node into several separate interfaces #277

Merged
merged 38 commits into from
Dec 10, 2016
Merged

break Node into several separate interfaces #277

merged 38 commits into from
Dec 10, 2016

Conversation

wjwwood
Copy link
Member

@wjwwood wjwwood commented Nov 17, 2016

This is a work in progress, I've only converted some of the most basic functionality to what I'm calling the "NodeBaseInterface". This is a preview of what I'm doing for the other parts, for which I've sketched out "NodeTopicInterface", "NodeServiceInterface", "NodeGraphInterface", and "NodeParameterInterface".

I've run into a few issues, but specifically I have some work to do to resolve the ownership issues. This is demonstrated by this:

https://github.com/ros2/rclcpp/compare/node_interfaces?expand=1#diff-14f85fa16e78532d255620f1d3529debR52

Where the comment reads "These are used to keep the scope on the containing items" and I had to comment out the Node pointer. Had we gone with inheritance, the NodeBase pointer this class would have access to would have been enough, but given the composition nature, I cannot just hold that part of the interface and maintain the same guarantee. That's ok though, I never liked how we have so much "ownership spaghetti" (for lack of a better phrase) anyways. I just have to resolve how to handle this now. I've got the options and arguments for one way versus another partially written out, I'll try to post those soon. @Karsten1987 recently ran into this in a related situation and thinks (if I can speak for him a bit) that moving away from scoped objects would help a little bit because it could ensure the lifetime of the entities created by a node do not outlive it (using a proxy or handle pattern instead).

@wjwwood wjwwood added the in progress Actively being worked on (Kanban column) label Nov 17, 2016
@gbiggs
Copy link
Member

gbiggs commented Nov 29, 2016

I think I could do with a diagram. :bowtie:

@wjwwood
Copy link
Member Author

wjwwood commented Dec 1, 2016

I justed pushed the publisher and subscription parts of the refactor, rclcpp is building for me locally, but the examples are broken because I'll have to add some convenience functions for things like rclcpp::spin() which take Node * and extract the NodeBaseInterface * from it.

@gbiggs I did some design work upfront and as part of it made this:

node_interfaces

I'm not sure it will end up looking exactly like that, but it will be similar.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 1, 2016

Things to point out in the pub/sub refactor:

  • Node still has code in the create_publisher<>() like methods, but it's mostly variable checking.
  • As much of the code that depends on the Message type or the Allocator type was put into a non-templated factory class, i.e. PublisherFactory and SubscriptionFactory.
  • The non-template argument dependent code is in the NodeTopics::create_* functions which are also not templated and take a factory class as an argument.

The Node::create_*<MessageT, ...>() methods basically do this:

  • Check arguments are valid.
  • Create a *Factory class by calling the matching create_*_factory<MessageT, ...>() function.
  • Pass the factory class to the NodeTopics::create_*(..., factory, ...) method, and get a PublisherBase or SubscriptionBase shared pointer back.
  • Pass the *Base class to the corresponding NodeTopics::add_*() method.

We could consider putting the code that lives in NodeTopics::create_*() back into the templated Node::create_*<MessageT, ...>() methods, but by pushing the code into the NodeTopics class we minimize the code that is in the Node::create_*<MessageT, ...>() methods and by doing so gain:

  • shorter compile times
  • reduced amount of code that has to be duplicated if someone want to copy-paste-modify the Node class

Similarly, we could put the code that now lives in the factory classes (mostly code that depends on template variables in one way or another) back into the Node::create_*<MessageT, ...>() methods, but by putting it into a separate factory class it again improves the reusability of this code and would help whenever someone might want to copy-paste-modify the Node class. It especially reduces the amount of non-trivial, or non-boilerplate, code that needs to be copied when creating a new Node class.


It was not possible to have template functions on either the NodeTopicsInterface class or the NodeTopics class without also templating the Node class, which I wanted to avoid. That is why I didn't just move the templated code from the Node class into the NodeTopics class.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 1, 2016

@gbiggs just some follow up points to go with that diagram:

We wanted to accomplish a few things:

  • Make it easier to mock up parts of the Node interface to more easily test the components that use them without instantiating a full Node implementation to do so.
  • More clearly separate the various functionality that is aggregated by the Node class for the convenience of the user.
  • Enable the creation of various other kinds of Node classes, e.g. LifecylceNode or NodeWithoutParameters.

We landed on a few things:

  • Separate the current state into these discrete parts:
    • Base: core functionality and the rcl_node_t handle.
    • Topics: publisher and subscription related functionality
    • Services: service client and server related functionality
    • Parameters: parameter related functionality
    • Graph: ROS graph related functionality, i.e. wait_for_service, list_topics, etc.
  • Use pure virtual "interface" classes for each of the parts, in order to make mocking them easier without worrying about side effects of the constructor.
  • Use composition, rather than inheritance, to aggregate the functionality of each of parts into the Node class, i.e. Node has pointers to NodeBaseInterface, NodeTopicsInterface, etc. and provides forwarding methods (similar to the PIMPL pattern) rather than inheriting from them.

We discussed several of these points a lot internally and I have notes, but they're not organized at all, but I plan to put them somewhere as memoranda when I get a chance.

#include "rclcpp/macros.hpp"
#include "rclcpp/node.hpp"
// #include "rclcpp/node.hpp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be removed?

@@ -44,7 +49,7 @@ struct AnyExecutable
rclcpp::client::ClientBase::SharedPtr client;
// These are used to keep the scope on the containing items
rclcpp::callback_group::CallbackGroup::SharedPtr callback_group;
rclcpp::node::Node::SharedPtr node;
// rclcpp::node::Node::SharedPtr node;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be removed?

RCLCPP_PUBLIC
virtual
rclcpp::context::Context::SharedPtr
get_context() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the method is const shouldn't it return a const shared pointer? Otherwise the caller can modify the returned value. Maybe we need both: one const method with const return value and one non-const method with a non-const return value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I didn't make this const originally (this method was just moved not modified or added) because the shared ptr that is returned is a copy. From what I understand it is not required and not appropriate to return a const of a copy, e.g. bool get_x() const and not const bool get_x() const. Instead you return a const when you return a reference, e.g. const MyClass & get_my_class() const rather than MyClass & get_my_class() const.

So if you consider the shared pointer to be a copy, then it doesn't need to be const, i.e. if someone changes the shared ptr that was returned, or calls reset of on it, it would not change the class. However, since it is a pointer, you could modify the thing to which it points and so maybe it should be a ::ConstSharedPtr as we call it (aka std::shared_ptr<const Context>). Getting rid of the shared pointer and just using a raw pointer, it is currently Context * get_context() const and maybe it should be Context const * get_context() const (reads "the const method get_context returns a pointer to a const Context").

I'm not sure there is a needs to be a const and non const version. I think a non const version is fine, and if the method could be const then that would be better, but not necessary.

So the question is whether or not we should leave it as is, or if we should have it return a SharedPtr but not be a const method, or if we should have it return a ConstSharedPtrand be a const method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point of view if you get a const Node then you are not supposed to do anything non-const on the node. But if the class has const methods which expose non-const internals you can call that method and then manipulate the internal state. Therefore const methods should always expose internal state as const.

@@ -52,7 +52,7 @@ class AllocatorMemoryStrategy : public memory_strategy::MemoryStrategy
using VoidAllocTraits = typename allocator::AllocRebind<void *, Alloc>;
using VoidAlloc = typename VoidAllocTraits::allocator_type;

using WeakNodeVector = std::vector<std::weak_ptr<node::Node>>;
// using WeakNodeVector = std::vector<std::weak_ptr<node::Node>>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be removed?

@@ -266,7 +266,7 @@ class AllocatorMemoryStrategy : public memory_strategy::MemoryStrategy
any_exec->subscription = subscription;
}
any_exec->callback_group = group;
any_exec->node = get_node_by_group(group, weak_nodes);
// any_exec->node = get_node_by_group(group, weak_nodes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be removed?

RCLCPP_PUBLIC
virtual
rcl_node_t *
get_rcl_node_handle() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing applies here, despite the fact that it isn't a smart pointer.

RCLCPP_PUBLIC
virtual
std::shared_ptr<rcl_node_t>
get_shared_rcl_node_handle() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

RCLCPP_PUBLIC
virtual
std::unique_lock<std::recursive_mutex>
acquire_notify_guard_condition_lock() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the ownership being changed this shouldn't be const?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The returned lock is a temporary and the mutex is mutable, so I'd say no.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did't' know it is a temporary. That makes sense then.

callback_group(nullptr),
node(nullptr)
callback_group(nullptr)
// node(nullptr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be removed?

// Finalize the interrupt guard condition.
finalize_notify_guard_condition();

throw std::runtime_error("failed to interpret ROS_DOMAIN_ID as integral number");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this code block has only been moved this line leaks the memory of ros_domain_id on Windows.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 2, 2016

I just pushed the graph API refactor and rebased since @Karsten1987's pr was merged.

I'll address the commented out things and other comments once it is in review.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 3, 2016

I pushed the NodeServices API portion of the refactor and fixed some things added by @Karsten1987's pr which broke due to my changes. Only the NodeParameters part is left to go and to decide where to put the "create_timer" functions, either in a NodeTimers API part, or in the NodeBase part. I'm leaning towards giving timers their own part, mostly because it doesn't seem to fit in the NodeBase API very well (most things that use NodeBaseInterface don't need timers).

@Karsten1987
Copy link
Contributor

+1 for giving timers their own part

@gbiggs
Copy link
Member

gbiggs commented Dec 4, 2016

I wonder if creating timers, etc. should be done via factory functions that are friends of the base node interface (or node implementation? I forget which is correct for friends). I think that this should allow any object that provides the necessary friend-accessible interface to allow the creation of timers, and would also keep the size of the node object APIs minimal. One of the popular C++ book authors wrote several articles and book chapters about keeping class APIs minimal by making good use of friend and non-friend functions when possible.

I haven't analysed this idea in any way, but I thought it's worth consideration.

@gbiggs
Copy link
Member

gbiggs commented Dec 4, 2016

Thanks for the diagram and explanations, @wjwwood. It's all stuff I've been hoping for, so I'm really happy to see it happening.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 6, 2016

I just pushed Parameters and Timers API's. Now I just need to do some testing and address some existing comments.

@wjwwood wjwwood added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Dec 7, 2016
@wjwwood
Copy link
Member Author

wjwwood commented Dec 7, 2016

Ok, I've addressed or continued discussion on the existing comments. I've also started some CI which will run tonight and I'll check on tomorrow. One thing I'm thinking of doing still is to remove the publisher and subscription factories I created since they don't make as much sense anymore since I've made the templated free functions create_publisher and create_subscription. It's not strictly necessary, but would be easier to read I think, as the factory breaks the flow of the code up considerably.

I put this in review.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 7, 2016

Ok, I think I've addressed the constness issues, now there is a tlsf test failing which may be something I broke in the allocator support that I need to look into. Still looking for more reviews in the meantime.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 9, 2016

Passing CI:

  • Linux:
    • Build Status
  • OS X:
    • Build Status
  • Windows:
    • Build Status


auto pub = node_topics->create_publisher(
topic_name,
rclcpp::create_publisher_factory<MessageT, AllocatorT, PublisherT>(allocator),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this factory when we pass the allocator as a template argument?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The factory captures all template code into non template functions so they can be interleaved with non template code in the cpp file. As I mentioned before, I think we could remove the factories now in favor of more linear code that lives in the template functions. The trade off being that there is more code in template functions which increase compile time.

I'd rather do that refactor later though so that we can merge this and let other work build on top in the mean time.

node_topics->add_publisher(
pub,
// this is the callback group, not currently used or exposed to the user for publishers
nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could that become a default argument for create_publisher function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it needs to because there's no reason to imply that passing it to create publisher has any effect. If anything I'd remove it from the add publisher function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the argument in c18b6ab, I'll run CI on Linux only (due to the nature of the change that should be enough).

const rmw_qos_profile_t & qos_profile,
rclcpp::callback_group::CallbackGroup::SharedPtr group,
bool ignore_local_publications,
bool use_intra_process_comms,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the two booleans go into a dedicated struct. Similar to the qos_profile_t maybe something like intra_process_profile_t? Also to lower the number of arguments here. Or expose the subscription_options ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need an options structure in c++ for these classes, but I didn't want to tackle that in this pr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm creating an issue for this right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RCLCPP_PUBLIC
virtual
void
add_service(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not separating clients and services?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean why is add_client in NodeServices and not in NodeServiceClients ? Or, if mixed, where is the constructor for ServiceClient ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clients and Services (Service servers) are two halves of the same part: "services". It is not separate for the same reason that publisher and subscription related functions are together on the NodeTopics interface: because they are related. I don't see any need to break up the NodeServices interface into NodeServiceClients and NodeServiceServers, but if you do, please make an argument for that case.

where is the constructor for ServiceClient?

I don't understand the question, there is no class called ServiceClient.

class NodeTimers : public NodeTimersInterface
{
public:
RCLCPP_SMART_PTR_ALIASES_ONLY(NodeTimers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this macro? I do believe that I ran into problems when inheriting from a class with these macros.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this before, and while I agree that the thing you ran into was a bad user experience, I don't think it should be addressed in this pr and should perhaps only be changed for classes from which people commonly inherit. Please make a new issue to discuss this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make an issue for this, so it is tracked?

EventNotRegisteredError()
: std::runtime_error("event already registered") {}
};

/// Node is the single point of entry for creating publishers and subscribers.
class Node : public std::enable_shared_from_this<Node>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the new architecture allow to create Services inside the node constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I haven't tested it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a test for this real quick.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a test in ros2/system_tests#182 and fixed it in f0afcab 😛, good thing you got me to test it.

using IDTopicMap = std::map<std::string, AllocSet,
std::less<std::string>, RebindAlloc<std::pair<const std::string, AllocSet>>>;

struct strcmp_wrapper : public std::binary_function<const char *, const char *, bool>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we exclude this struct into an external header file? MIght be helpful for other pieces of code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel it needs it's own file, it's a pretty trivial class, I would have made it a lambda if I could have. I'd prefer not to separate it, but you can push here with that change if you like.

@@ -117,6 +99,14 @@ class Node : public std::enable_shared_from_this<Node>
rclcpp::callback_group::CallbackGroup::SharedPtr
create_callback_group(rclcpp::callback_group::CallbackGroupType group_type);

using CallbackGroup = rclcpp::callback_group::CallbackGroup;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we shift that line before the first usage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just removed it and the following line, not sure why I added that there. Might have been a copy-paste issue, thanks for pointing it out. 28a2cfc

}

template<typename MessageT, typename CallbackT, typename Alloc>
template<typename MessageT, typename CallbackT, typename Alloc, typename SubscriptionT>
typename rclcpp::subscription::Subscription<MessageT, Alloc>::SharedPtr
Node::create_subscription(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of consistency, I would also bring that forward function into the create_subscription.cpp file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is (a) a template function which cannot be put in a cpp file and (b) is a method of the Node class, so it would go in the node.cpp file if it could go in a cpp file.

template<typename CallbackType>
typename rclcpp::timer::WallTimer<CallbackType>::SharedPtr
template<typename DurationT, typename CallbackT>
typename rclcpp::timer::WallTimer<CallbackT>::SharedPtr
Node::create_wall_timer(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, just to be consistent, I would also put the create_wall_timer into its own file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

private:
RCLCPP_DISABLE_COPY(NodeBase)

std::string name_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this? Does it not make sense to hit directly to rcl for getting the name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I updated it in 83b610e. Thanks!

RCLCPP_PUBLIC
virtual
void
add_service(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean why is add_client in NodeServices and not in NodeServiceClients ? Or, if mixed, where is the constructor for ServiceClient ?


#include <string>

#include "rclcpp/node_interfaces/node_timers.hpp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't that include be on first?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but I think cpplint was not happy about it for some reason. I'll try again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it here and in other places in 23a7728.

}

rclcpp::subscription::SubscriptionBase::SharedPtr
NodeTopics::create_subscription(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this not in create_subscription.cpp ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

{}

rclcpp::publisher::PublisherBase::SharedPtr
NodeTopics::create_publisher(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this not in create_publisher.cpp ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is a member of NodeTopics?

@Karsten1987
Copy link
Contributor

This PR looks good to me. I had a few comments/questions about file ordering and consistency. But no show-stoppers.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 9, 2016

Ok, @Karsten1987 I think I addressed your comments, can you check to make sure I did? Also, anyone else that wants to review, please do so now.

I'm starting new CI since I made some significant changes.

@wjwwood
Copy link
Member Author

wjwwood commented Dec 10, 2016

The warnings are being addressed by @codebot and I cannot reproduce the OS X test failures on my machine, and I am debugging the Windows test timeouts and I will fix that up after merging this if this pr turns out to have caused it.

@wjwwood wjwwood merged commit 734ac27 into master Dec 10, 2016
@wjwwood wjwwood deleted the node_interfaces branch December 10, 2016 01:09
@wjwwood wjwwood removed the in review Waiting for review (Kanban column) label Dec 10, 2016
nnmm pushed a commit to ApexAI/rclcpp that referenced this pull request Jul 9, 2022
* initialize memory before sending a message

* add todo referencing ticket

* wrap todo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants