Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for inclusion in Nav2 Stack #1

Closed
13 of 15 tasks
artofnothingness opened this issue Feb 10, 2022 · 113 comments
Closed
13 of 15 tasks

Proposal for inclusion in Nav2 Stack #1

artofnothingness opened this issue Feb 10, 2022 · 113 comments
Assignees
Labels
enhancement New feature or request

Comments

@artofnothingness
Copy link
Owner

artofnothingness commented Feb 10, 2022

There are some suggested features/fixes/updates to be done to consider this controller as a part of Nav2 Stack.

  • Footprint collision checking
  • Publishers use std::move
  • Support Y for omni robots (still requires some tests, but seems to work)
  • Pluginize Critics
  • Pluginize Model Not required.
  • Parameters handle lifecycle transition multiple times
  • A readme with the run-time rate / other information / parameters / link to relevant paper
  • Use of ament / colcon /package.xml for building so its easy for users
  • 20 Hz for 2 iterations // improve runtime performance for stability and quality behavior
  • Tuning out of the box parameters to give a decent baseline (Differential done - omni needed)
  • hardware testing / feedback
  • Smoothness of controls (critic? more iterations? faster cycling than 20hz?)
  • Support ackermann minimum turning radius constraints Ackermann constraints #42
  • Coverage for PathAngleCritic and ApproxReferenceTrajectoryCritic
  • Dynamic parameters (almost done)
@SteveMacenski
Copy link
Collaborator

Hi! Continuing from the other thread. I should start being able to look into this more directly start the week of March 2nd after a bunch of my current deadlines are over. My focus is going to be on controllers (MPPI, MPC, DWB) so this will be one of the major areas of interest.

All trajectories come from integrating random controls by n time steps. One way is to add critiс to already integrated trajectories to penalize based on the curvature constraints, other is to add some constraints on control actions, and may be change averaging policy. Don't know how to implement second approach, the first one seems doable.

I don't think we need to penalize curvature, I think that actually how this controller works already kind of does that naturally, but it wouldn't be hard to test and see what we think.

I think what is absolutely necessary though is being able to set a maximum curvature parameter (e.g. minimum turning radius) for which if a primitive exceeds, we throw it out entirely and not evaluate it. If we want to keep a consistent N primitives, we'd need to replace the ones we threw out with more. This could be accomplished also when randomly sampling to set the space of valid generation to include curvature (or check there and pass any that don't meet it so that the output is still the same size N).

It requires a little more work coz the main data structures are tensors, and not structs/classes. I'm on the way abstracting away from these nasty tensors in some parts, to make OMNI integration more smooth.

Yeah, I noticed that. One of the benefits of changing to some variables for the idx so that you don't need to update every single index number, just add the additional dimension and a new idx constant for the Y direction.

Normally I'd say that Omni is a "nice to have" but actually I'm thinking now that this work might fully replace DWB in Nav2 if things work out well, so its a "must have" so that we continue to have Omni support. Ackermann continues to be a "nice to have" but its easier to add it now than later and from the discussion above, it's not exactly rocket since to have it supported either.

@artofnothingness
Copy link
Owner Author

artofnothingness commented Feb 12, 2022

I think what is absolutely necessary though is being able to set a maximum curvature parameter (e.g. minimum turning radius) for which if a primitive exceeds, we throw it out entirely and not evaluate it. If we want to keep a consistent N primitives, we'd need to replace the ones we threw out with more. This could be accomplished also when randomly sampling to set the space of valid generation to include curvature (or check there and pass any that don't meet it so that the output is still the same size N).

Currently linear/angular velocity constraints are implemented. I'll think about additional constraints that you mentioned.

Yeah, I noticed that. One of the benefits of changing to some variables for the idx so that you don't need to update every single index number, just add the additional dimension and a new idx constant for the Y direction.

I had thoughts to keep data contiguous in the state/batch. Diffdrive/carlike state looks like [ VX, WZ, controlVX, controlWZ, dt (time intervals) ], and sometimes its efficient to have contiguous slice of current velocities from whole batch, e.g. [VX, WZ], so with Y it would be [ VX, VY, WZ ]. Consequently that requires index shifting. I've just implemented runtime version of such abstraction. https://github.com/artofnothingness/mppic/blob/develop/include/mppic/State.hpp

Now adding Y sampling/propagation

@SteveMacenski
Copy link
Collaborator

Why not always have the Y fields in place? Does it really add that much compute to solve the optimization problem to have 0-ed out tensor dimensions? That would make the code a bit easier to follow / less error prone with changes later w.r.t. dimensions.

This is awesome, I see you've made a bunch of changes here already!

@artofnothingness
Copy link
Owner Author

artofnothingness commented Feb 15, 2022

If someone will use diffdrive/carlike and e.g. neural network as "Model" to predict velocities, and tensor wouldn't be contiguous (zeros Y, zeros controlY) it would require additional data manipulation to get right batch. Slices would require additional attention too. I think having holes in tensor would be more error prune. xtensor already has dynamic size of each dimension, all is needed to adjust size of last dimension to state vector size, and use idxes that match chosen vector. I've been thinking about this the last few days. Some implemented abstractions already works quite well

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Feb 15, 2022

OK, I can get behind that

@artofnothingness
Copy link
Owner Author

artofnothingness commented Feb 21, 2022

Y seems to work. Tested with https://github.com/neobotix/neo_simulation2. Also pluginized critics and fix some bugs

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Feb 22, 2022

Oh wow, incredible. Can you show a gif? How do you feel about the behavior? What's the runtime you can get this to go at on your CPU?

I'm still working towards some deadlines that come up on March 1st (1 week). I have been tracking your changes briefly when I had time and had some comments, but I wanted to wait until it was all put together to share. Doing a deep dive over your changes is one of my first tasks (after I take a day or two off to relax).

One of the easier things to pull back in is back into 1 package. I know DWB spreads things out into a bunch of individual packages. That's mostly done because they're picking and choosing packages for internal vs external use. For this work, there is no "internal" use so this can all be shipped together as 1 unit.

Something you may consider doing is using the Costmap2D's collision checker object (https://github.com/ros-planning/navigation2/blob/main/nav2_costmap_2d/src/footprint_collision_checker.cpp) since you already have the dependency on Costmap2D. This will handle the footprint line iterator stuff for you for the full footprint. You can see an example of how I use it in Smac Planner to wrap it into an API for a more specific need (https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/include/nav2_smac_planner/collision_checker.hpp) but I think you could probably get away with the raw API. I did that because I needed to precompute some footprint orientations.

I'd also recommend keeping the option to use single point-costs vs the full footprint in the situations that people actually are using a circular robot. Collision checking is definitely not cheap. A parameter would do, or you could implement another critic that only does point-costs (like DWB does) and that would be fine as well.

@artofnothingness
Copy link
Owner Author

artofnothingness commented Feb 22, 2022

Oh wow, incredible. Can you show a gif? How do you feel about the behavior? What's the runtime you can get this to go at on your CPU?

Feels quite nice, but i've only tested it in gazebo and in a simple environment yet. Main observation - omni needs a larger batch_size
https://github.com/artofnothingness/mppic/blob/develop/.resources/demo-omni.gif

Tested it on intel i5-4200H CPU
For params:
iteration_count 10
batch_size 200
time_steps 10
Points in path 30
For omni its about 22 ms and diff drive almost the same - ~21 ms
I think it worth doing benchmarks with separate critics. Most of cpu time is on ReferenceTrajectoryCritic, which is really heavy now.

One of the easier things to pull back in is back into 1 package.

Yeah, that's easy change, and i feel the same, that it is redundant separation.

Something you may consider doing is using the Costmap2D's collision checker object (https://github.com/ros-planning/navigation2/blob/main/nav2_costmap_2d/src/footprint_collision_checker.cpp) since you already have the dependency on Costmap2D. This will handle the footprint line iterator stuff for you for the full footprint. You can see an example of how I use it in Smac Planner to wrap it into an API for a more specific need (https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/include/nav2_smac_planner/collision_checker.hpp) but I think you could probably get away with the raw API. I did that because I needed to precompute some footprint orientations.

Thanks, I'll take a look at it.

I'd also recommend keeping the option to use single point-costs vs the full footprint in the situations that people actually are using a circular robot. Collision checking is definitely not cheap. A parameter would do, or you could implement another critic that only does point-costs (like DWB does) and that would be fine as well.

Sure, that would make sense

@artofnothingness
Copy link
Owner Author

artofnothingness commented Feb 23, 2022

Added collision checker, consider_footprint parameter, and moved everything in one package

@SteveMacenski
Copy link
Collaborator

22ms

I can't tell you how much I love that number. TEB is way slower than that and DWB its often hard to get that running at 20hz on a raspberry pi. If you're getting 45hz on a 4th gen i5, I'm really excited about that!

@artofnothingness
Copy link
Owner Author

Anyhow that number highly depends on iteration_count/batch_size/time_steps/reference path point count

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 1, 2022

This is true. We should probably cull out the reference path some distance away from the current pose so that its only looking at a window of poses in the immediate term. Is this already happening implicitly?

I see that its being sent the full global path but how do you handle the distance correlation when path points are far away?

We do this in RPP and DWB, but I think for how your https://github.com/artofnothingness/mppic/blob/develop/include/mppic/utils/geometry.hpp is setup, we don't need to be nearly as conservative and only need to rough-out the right window. Maybe just cull points more than 1.5x the longest possible primitive length away? I only took a quick glance at the closestPointsOnLinesSegment2D and distPointsToLineSegments2D functions, so perhaps you already have a way of skipping evaluating further path points once clearly too far away. That would be another potential option rather than formally culling out path points.

On the other points, we can definitely circle back to see if its a problem. But if you tuned your system and found generally good parameters for iteration_count/batch_size/time_steps that work well, and it gives you roughly that 22ms performance, I think that means its good. Certainly a user can badly tune those things. I think it would actually be useful if we found some really solid baseline numbers for iteration_count/batch_size/time_steps and made them default so that most users will have a great out of box experience for a typical mobile robot platform.

@artofnothingness
Copy link
Owner Author

artofnothingness commented Mar 2, 2022

I see that its being sent the full global path but how do you handle the distance correlation when path points are far away?

It's clipped by PathHandler before it's passed to Optimizer by some fixed distance (lookahead_distance) considering current pose. Do you mean dynamically change this distance, depending on how the closest path point is far away ?
https://github.com/artofnothingness/mppic/blob/develop/include/mppic/handlers/PathHandler.hpp#L30-L36
https://github.com/artofnothingness/mppic/blob/develop/include/mppic/ControllerImpl.hpp#L55

Maybe just cull points more than 1.5x the longest possible primitive length away

I had thoughts about it. It seemed to me that there was no need for this, since you can always manually set a static lookahead distance for global path clipping.

But if you tuned your system and found generally good parameters for iteration_count/batch_size/time_steps that work well, and it gives you roughly that 22ms performance, I think that means its good. Certainly a user can badly tune those things. I think it would actually be useful if we found some really solid baseline numbers for iteration_count/batch_size/time_steps and made them default so that most users will have a great out of box experience for a typical mobile robot platform.

I think it would be nice to add dynamic parameters to test different parameters settings. Sure targeting nice defaults is the goal.

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 7, 2022

Back from PTO. This week is going to be a bit of catch up and I still need to work on getting a hardware platform to test this work on, but this MPPI and DWB work is my main focus for the immediate future (beyond maintenance tasks within Nav2 like reviewing PRs and issues, which does take up my time daily).

I just meant somehow clip the path reasonably so that it doesn't grow too high, but it sounds like what you have now would suffice. Later when we do more profiling maybe we find we can save some time here, but I wouldn't overoptimize for anything like that until we have evidence that it is a big contributor to CPU worth the time / effort.


I did a quick look through of the codebase after your refactors and had some questions / comments. This is not a list of things I expect you to do all of, I can jump in here and help too, and will plan to once I can get started on touching the code directly (hopefully later this week). This is mostly to get everything in 1 concise place to get started with.

Note that I did not critique deeply the implementation itself, this is more superficial stuff to hammer out first before the deep dive. But I did look over that stuff at length a couple of months ago so I don't think I'll have any major concerns as long as the changes are largely just refactors.

All in all, the refactor was done very well. I kind of liked the old version of MPPI because its simplicity of being largely all in 1 file, but we did need to split out the critics to make them plugins and some other of the changes lent themselves well for being spun out. I think you did a good job with that - I don't see anything you did as unreasonable. At most, I might suggest rolling the critic scorer back into the main optimization engine, but I think what you did makes sense as well.

@artofnothingness
Copy link
Owner Author

artofnothingness commented Mar 8, 2022

Why remove smart pointers? You have shared pointers coming in

Hmm. I'm not sure if i get it right. As far as i know controller server has ownership of costmap.
https://github.com/ros-planning/navigation2/blob/main/nav2_controller/include/nav2_controller/controller_server.hpp#L217
Plugins like mppi just use it. They don't manage it's lifetime.
I'm not sure i understand the reason to use here any memory ownership mechanism at all, according to core guidelines http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rr-sharedptrparam-const
Mb i'm wrong here.
Doesn't control server guarantee to keep reference to costmap and keep it alive during controller plugins work?

Also, in Nav2, we use WeakPtrs for the node object, which is important for shutdown logistics in the server -- we cannot hold a shared or raw instance of the node pointer, it needs to be the weak pointer so its useful to get out the logger / clock / other utilities within the node in configuration so that we only lock that resource when explicitly needed for ROS API network interfaces.

As far as i know weak_ptr is needed for checking if object still alive. And here in controller plugin i assume that Node always alive while controller server plugins uses it. What you mean by "shutdown logistics in the server" ? Mb some reference to the code will explain that to me better.

I think that would deprecate https://github.com/artofnothingness/mppic/blob/develop/include/mppic/utils/common.hpp this file

Removed helper function getParam, but getParamGetter still seems useful to me. With that util i don't need to keep track of declare/get parameters in different places and also repeatedly concatenate node names with params names

Its not entirely clear to me why this function is templated https://github.com/artofnothingness/mppic/blob/develop/include/mppic/handlers/PathHandler.hpp#L50 given its only used in 1 place with 1 set of types. I think that could be de-templated and then the implementation moved to the src directory

Removed templates there.

https://github.com/artofnothingness/mppic/blob/develop/include/mppic/Controller.hpp#L11 I only see T for float and double, does it make that much of a difference its worth templating this for both?

Probably it's not

I guess I'd like to understand why its important to have the option of either. That could also move the controller and critic implementation code(s) to the src directory if its unnecessary to have both
Initially i thought that Neural Network may need different input types, so i made batches templated by internal type, but now i think that if needed it can be just cast (double -> float). If so the most things could be moved to src

This move-move-move logic seems a little much

It may be redundant somewhere :) I'll take a look on this

https://github.com/artofnothingness/mppic/blob/develop/include/mppic/optimization/tensor_wrappers/State.hpp#L52 what is the c in these? I think the v is velocity, but I'm not sure what c is.

c is for control. It was done in kinda haste. It's requires better documentation/naming.

https://github.com/artofnothingness/mppic/blob/develop/include/mppic/optimization/OptimizerImpl.hpp#L61 I'm not entirely sure what temperature means in this context. I usually think of that as a variable name as a placeholder weight when its hard to describe the function its implementing. Is that from the MPPI literature or something specific to this implementation? From the readme, it says that 0 value uses the best control, is there a situation where we wouldn't want that? I think a little explanation is in order about what the trade off and important value of this is

It's from MPPI litrature. We have costs on each trajectory, and controls from each trajectory weighted respectively to those costs, this parameter defines how much we should consider those costs. Big value mean that we just take average controls from all trajectories and not considering costs at all. Honestly, i've experimented just a little with this. I think the main reason to this is to make result control sequence more smooth

https://github.com/artofnothingness/mppic/blob/develop/include/mppic/optimization/scoring/critics/ObstaclesCritic.hpp#L92 this is only actually true if using the circular assumption is in place. Otherwise, we need to check if its lethal inflated for full SE2 footprint checks.

i check cost from footprintCostAtPose (collision_checker) if bool variable set to true
https://github.com/artofnothingness/mppic/blob/develop/include/mppic/optimization/scoring/critics/ObstaclesCritic.hpp#L78-L90

https://github.com/artofnothingness/mppic/blob/develop/include/mppic/optimization/scoring/critics/ObstaclesCritic.hpp#L59-L67 this block is a little confusing with the three functions isFree inCollision and costAtPose. I think this can be cleaned up quite a bit.

I'll take a look on it

Thanks for your comments!

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 8, 2022

With respect to the smart pointers and WeakPtr topic - I can go into more detail, but its essentially related to the startup and shutdown sequences and making sure there aren't any circular dependencies when memory is being deallocated with smart pointers to have a clean shutdown. WeakPtrs are useful when you need access to a resource, but you don't want it to be incrementing the reference counter internally when its not being utilized. This is why we pass in Nav2 the node as a WeakPtr to the plugins so they can store and lock it when they need it, but it doesn't create a circular dependency / non-zero reference count to destroy it when we're trying to deterministically shut down the servers containing N plugins.

We encountered that issue long ago and we had some work to retool our system in Nav2 around that WeakPtr and only storing it in the plugin (never the shared pointer locked resource) so that we can get clean program exits. We used to use shared pointers in the plugins and then we found that Control-C's were not giving clean exits because a shared pointer reset was being blocked by the ownership of a reference within a plugin that was reset.

With respect to the use of smart pointers vs .get() them for raw pointers, what you're being given are shared pointers as an input and it is a slight abuse of the smart pointers API to get the underlying raw pointer to store and use. There's a real argument to be made against what we do in passing around shared pointer references to other objects, and I think that's a worthwhile discussion actually to elevate to Nav2 directly to change. I think that was more of an oversight when we were working through another issue than an intentional choice.

Also, imagine a situation where we need to swap out pointers due to dynamic parameter reconfiguration the underlying pointer, then this plugin would be the only that would not get the updated info and its ptr would be deleted from the stack. Smart pointers have very little overhead and what they give you in behavior is well worth its use. If that's not a good enough reason, its just the code styling ROS 2 that Nav2 tries to comply with to be a "golden" example 😆. ROS 2 is heavily littered with smart, weak, and unique pointers and we rarely use raw pointers anymore. I think that's generally the direction of the C++ industry so I figure we're on the vanguard of hastening its use via example.

Removed helper function getParam, but getParamGetter still seems useful to me. With that util i don't need to keep track of declare/get parameters in different places and also repeatedly concatenate node names with params names

Sure, but it seems a little convoluted with the templates and such. We do a similar thing in the costmap layers using https://github.com/ros-planning/navigation2/blob/30b405c58e6d53ba8c96381416bc4679d35a1483/nav2_costmap_2d/src/layer.cpp#L78-L120 which is much more straight forward. There's not much reason to return this lambda for use, when we can just have the utility function itself just take in the node, namespace, and parameter type to grab.

It's from MPPI litrature.

Got it, yeah a little more exploration there may be warranted, if nothing else for my own education!

i check cost from footprintCostAtPose (collision_checker) if bool variable set to true

Sure, and that works there but in the inCollision function doesn't have the same interpretation of cost if you're using 2 methods of collision detection. If using point costs, then the inflated cost is indeed going to be occupied since you're relying on the inflation layer to have that cost be applied if within known collision states. But for the full SE3 footprint, you should only check for LETHAL cost since otherwise some point at the edge of the footprint may touch an inflated part, but that doesn't mean its in collision with an actual lethal cell. The inflated values are for point-checking only https://github.com/artofnothingness/mppic/blob/develop/include/mppic/optimization/scoring/critics/ObstaclesCritic.hpp#L92-L100

e.g.

  • Inflation is applied to obstacles, so if we check a center point-cost of the robot, its an approximate if its in collision based on the inflation layer's application (who's radius is based on the footprint's minimum internal radius)
  • If we check the SE2 footprint, we need to check the grid itself for literal collisions and the inflation information doesn't mean a whole lot to us because we're using the real data. If a corner of a robot is an inflated cost, that doesn't mean its in collision, that means if that point was the center of the robot that it would be in collision, but right now, we're only checking the corner point.

The affect of applying INFLATION cost as an obstacle when checking the SE2 footprint would be that trajectories in confined settings near, but not on, occupied cells would be marked as in collision rather than only having higher cost due to proximity to collision. Does that make sense? I know its a little confusing with words but a diagram helps alot
image

Example for the smac planner: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/collision_checker.cpp#L93-L145 we have an if/else for if using the radius or SE2 footprint and we check if inflated or occupied as the threshold for collision, respectively. There are some other optimizations going on there that you don't need to do here. They are far more important for global planners that are searching hundreds of thousands of entries rather than low-thounsands.

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 9, 2022

Starting to play around this afternoon. One thing I noticed is that this is licensed under MIT. This is perfectly acceptable, but I don't know if it would be possible to relicense under Apache 2.0. Apache 2.0 is what the other work in Nav2 is licensed under (with the notable exception of DWB created separately and AMCL from Player) so it would just help in trying to make things consistent. However, its really not a big deal either way, they're both very permissible. Apache just allows you, the author, to retain more patent rights should there be patents filed under the work (which would be odd, but certainly within your rights) and ask folks that fork and make modifications to retain a changelog. Pretty small stuff.

My only interest is in trying to long term make all of Nav2 Apache 2.0 just for consistency's sake. But MIT is even more permissive so I wouldn't argue if you wanted to keep it that way. I didn't know if that license was selected carefully or more on a whim that it wouldn't be objectionable to change slightly.

@SteveMacenski
Copy link
Collaborator

I opened a PR with today's work, let me know what you think. It's almost all just refactoring for simplicity (removing templates, moving files to src, etc) and I also updated the Controller API to meet the current ROS 2 Rolling API so I can build on my development system. I assume that's what you want from the devel branch, but if you opened a ros2 branch (eq. to master for ROS 2) you or I can retarget there and use that as our ROS 2 rolling development branch.

@artofnothingness
Copy link
Owner Author

artofnothingness commented Mar 9, 2022

Smart pointers are awesome without any doubt ) Approach where you lock weak ptr only when you use it definitely guarantees that while you use resource, it wouldn't be deleted. I think you better know what's more appropriate in nav2 in such case, i just tried to understand intention in case of plugins) However i think raw pointers still useful in some cases if it is guaranteed that object to which it refers remains alive. It's like you have Costmap2DROS that have getCostmap method which returns raw pointer, and this pointer is valid while LayeredCostmap inside Costmap2DROS is alive.

Sure, and that works there but in the inCollision function doesn't have the same interpretation of cost if you're using 2 methods of collision detection. If using point costs, then the inflated cost is indeed going to be occupied since you're relying on the inflation layer to have that cost be applied if within known collision states. But for the full SE3 footprint, you should only check for LETHAL cost since otherwise some point at the edge of the footprint may touch an inflated part, but that doesn't mean its in collision with an actual lethal cell. The inflated values are for point-checking only

Got it.

My only interest is in trying to long term make all of Nav2 Apache 2.0 just for consistency's sake. But MIT is even more permissive so I wouldn't argue if you wanted to keep it that way. I didn't know if that license was selected carefully or more on a whim that it wouldn't be objectionable to change slightly.

It was chosen while my work at FastSense team. Now i just fork it and develop it independently. I dunno if it possible to change it without asking folks from FS. If so, i don't really care.

I opened a PR with today's work, let me know what you think. It's almost all just refactoring for simplicity (removing templates, moving files to src, etc) and I also updated the Controller API to meet the current ROS 2 Rolling API so I can build on my development system. I assume that's what you want from the devel branch, but if you opened a ros2 branch (eq. to master for ROS 2) you or I can retarget there and use that as our ROS 2 rolling development branch.

I took a look! That's awesome! Sure, dev branch is nice place to rolling stuffs

@SteveMacenski
Copy link
Collaborator

It's like you have Costmap2DROS that have getCostmap method which returns raw pointer, and this pointer is valid while LayeredCostmap inside Costmap2DROS is alive.

Admittedly, I hate that we do that, but the API calls for costmap_ros_->getCostmap()->getResolution() (or other methods) just makes the code less readable and often will make simple calls drag onto multiple lines. Think of that as an exception for readability more than a model.

I'll take care of reverting the smart pointer use today / tomorrow. Its kind of necessary anyway to deal with the Node input properly and grab utilities like loggers to use from the Node vs instantiating them for each component independently.

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 10, 2022

During my refactor it wasn't immediately clear to me how to categorize or think about the ControlSequence and State. They are clearly different, but from a quick look while refactoring it wasn't immediately clear to me the role of each. That might be something good to add to the readme file or in doxygen comments in their respective struct definitions. I'm sure if I sat down and looked at it, its clear in use, but some context would be good to add so its more immediately understandable

It also seems to me like the StateModel and the MotionModel are in fact containing related content. Would it be possible to refactor these together? I'm thinking maybe of a MotionModel base class with implementations for Diff/Omni/Ackermann that contains metadata like if its holonomic and a function that can be called to project forward in time (e.g. the NaiveModel for now). I think the base classes project forward function could be the NaiveModel implementation by default that is then overridable by individual motion models if they would like (for instance: ackermann).

Would that be something you could work on?


Otherwise, I finished adding smart pointers and a light refactor. I would like to have those 2 models combined and have that output file added to include/mppic but otherwise the rest of this looks good now from a directory standpoint to be as flat as reasonably possible.

Tomorrow, I'll work on updating the namespaces of the files for the refactor, working on propagating loggers from the high level controller node, and adding proper headers to each of the files with copyright / ifdefs. That'll essentially be the end to the surface level refactoring and then its onto more detailed look at the API (which overall seems reasonable, only minor tweaks for copies I suspect) and adding more logging.

@artofnothingness
Copy link
Owner Author

artofnothingness commented Mar 10, 2022

Admittedly, I hate that we do that, but the API calls for costmap_ros_->getCostmap()->getResolution() (or other methods) just makes the code less readable and often will make simple calls drag onto multiple lines. Think of that as an exception for readability more than a model.

In fact, I feel the same here ) Point was about memory management technique (costmap2d not passed as a smart pointer)

During my refactor it wasn't immediately clear to me how to categorize or think about the ControlSequence and State. They are clearly different, but from a quick look while refactoring it wasn't immediately clear to me the role of each. That might be something good to add to the readme file or in doxygen comments in their respective struct definitions. I'm sure if I sat down and looked at it, its clear in use, but some context would be good to add so its more immediately understandable

Sure. I'll have more time closer to the weekends to start helping with refactoring stuffs.

It also seems to me like the StateModel and the MotionModel are in fact containing related content. Would it be possible to refactor these together? I'm thinking maybe of a MotionModel base class with implementations for Diff/Omni/Ackermann that contains metadata like if its holonomic and a function that can be called to project forward in time (e.g. the NaiveModel for now). I think the base classes project forward function could be the NaiveModel implementation by default that is then overridable by individual motion models if they would like (for instance: ackermann).

Would that be something you could work on?

I had the same thoughts about it. Yeah, sure, i could try to rethink these stuffs

@artofnothingness
Copy link
Owner Author

artofnothingness commented Mar 10, 2022

Created PR for critics untemplating. Have problem with costmap2d passing it in Optimizer initialize method in optimizer_test.cpp. Dunno why but it fails in shared_ptr copy constructor.
Sent you collaborators invite, mb it helps to add you to reviewers

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 10, 2022

Yeah, I'm periodically running into build errors about things I long ago changed that are only showing up now. I haven't used conan before but perhaps its not very good at breaking the cache when required? Ran into that linking problem with the trajectory visualizer and a renamed header in the tests that only now picked up as problematic

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Mar 10, 2022

After we merge in the existing PRs to completely eliminate the templates, I'm going to typedef xt::xtensor<double, 1> xt::xtensor<double, 2> and xt::xtensor<double, 3> across the project so that we can work with these as named-types of interest.

Can you tell me for each of those 3 what they are used to hold? That way, I can use that to name them appropriately and distribute to all files. greping for xt::xtensor< shows the uses of it across the files. If there isn't consistent data being represented in each type, I can also just typedef them as Tensor1D Tensor2D and Tensor3D, but I suspect each has their own specific usecase that we can leverage. (e.g. 3D = controls, 2D = paths, 1D = costs, or something).

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Apr 28, 2022

Replace xtensor basically mean rewrite everything xD. But i think it could brings essential advantages

Agreed, after the parameters were updated, I was seeing that this wasn't quite able to run at 20hz so if moving to another library could get us some real improvement, that would be significant. I think its worthwhile. You were reporting 50hz on a 4th gen intel CPU before tuning and after tuning I couldn't quite get 20hz reliably on a 8th gen i7 (8565U).

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented May 2, 2022

I commented on #42 with a solution to that problem

At this point, I'm a little at a loss of how I can help to continue to push this to completion. Let me know if there's something I can do to help or how we can get this over the last bit.

by the way, there's a nav2 contributor that's playing with using flashlight for xtensor replacement here: https://github.com/mkolodziejczyk-piap/mppic/tree/libtorch. That wasn't on your comparison document for replacements, so maybe that's a good option. But I still agree from a purely technical perspective, I prefer Eigen. Its stable, available in binary form everywhere, and very performant. Just wanted to let you know in case something pans out of that work. He's doing this so he can use MPPI with a GPU

@artofnothingness
Copy link
Owner Author

artofnothingness commented May 3, 2022

That's nice, I didn't have time to watch the whole thing recently, i'll get to this soon.

At this point, I'm a little at a loss of how I can help to continue to push this to completion. Let me know if there's something I can do to help or how we can get this over the last bit.

Ok

by the way, there's a nav2 contributor that's playing with using flashlight for xtensor replacement here: https://github.com/mkolodziejczyk-piap/mppic/tree/libtorch. That wasn't on your comparison document for replacements, so maybe that's a good option. But I still agree from a purely technical perspective, I prefer Eigen. Its stable, available in binary form everywhere, and very performant. Just wanted to let you know in case something pans out of that work. He's doing this so he can use MPPI with a GPU

I had thoughts to use flashlight as an tensor replacement to use gpu and trained model. Quick look on the repo, i notice branch with torch. As far as i know flashlight uses arrayfire as tensor lib. Since I'm just getting started of xtensor replacement work i could consider different alternatives (including arrayfire/torch) . But torch definitly wouldn't be much performant with CPU

@mkolodziejczyk-piap
Copy link

Hi,
Yes, I'm working on two ports (1) libtorch (2) arrayfire/flashlight. Focusing on GPU acceleration. For (1) I'll be investinating how smoothly torch JIT can be integrated in C++ code, and also is it possible to use functorch. For (2) I'm starting with pure arrayfire. Seem like arrayfire JIT is not sth "controllable", so it's little bit trial&error to check what's optimal, same goes with vectorization. BTW there's an arrayfire plugin for costmap2d (https://github.com/lsi-uc3m/hypergrid), although I'm more into writing GPU version of costmap_converter from teb to have geometric primitives. I'll try to optimize mainly those for.. for.. for.. loops like in reference trajectory cost.

@artofnothingness
Copy link
Owner Author

@mkolodziejczyk-piap
Hi! Sounds interesting! Reference critic is main performance anchor here, surely..

@SteveMacenski
Copy link
Collaborator

I think my hope from the benchmark of tensor libraries is that eigen would be fast enough GPUs would largely not be required + Eigen is an easy to install thing and version control.

If it makes most sense to use on of these other libraries, I can look into how we can manage that dependency for binary release, but im not sure its possible since I’m not sure how the build farm would build code for all the GPU vendors to be able to release the package to apt

@artofnothingness
Copy link
Owner Author

I also strongly lean towards Eigen for now.
I would suggest focusing on solid behavior on CPU first (more safe obstacle critic/more lightweight reference critic, more smooth controls, use eigen)

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented May 4, 2022

By the way, another user is going to beta test on their omni platform, so hopefully they can give us some tuned omni parameters based on the differential ones tested on hardware to take that off our agendas

@artofnothingness
Copy link
Owner Author

artofnothingness commented May 8, 2022

@mkolodziejczyk-piap i made optimizer part itself a plugin in art branch. probably you will be interested, and probably it will be helpful to not rewrite everything, but extend current implementation

Still in alpha (need to rename some namespaces and classes), but works for me
dependency required https://github.com/kokkos/mdspan

@artofnothingness
Copy link
Owner Author

artofnothingness commented May 8, 2022

@SteveMacenski i don't remember if we discuss that. Can we use pragmas instead of header guards in mppi. It's really annoying without auto formatters to refactor them.

I place all xtensor stuffs in optimizer/xtensor and now header guards must be two lines long

@artofnothingness
Copy link
Owner Author

i found a way to use catch2 with ament. Rewriting tests on catch2 + add catch benchmarks

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented May 9, 2022

Can we use pragmas instead of header guards in mppi. It's really annoying without auto formatters to refactor them.

We're going to have to change them back before moving to Nav2, so I'd prefer not to remove them, but if you didn't keep updating them for now that would be fine and we can deal with the linting errors before moving into Nav2.

i found a way to use catch2 with ament. Rewriting tests on catch2 + add catch benchmarks

Make sure its not just using it, but also that the dependencies are properly managed so that it can be installed with rosdep or else we cannot release this to apt. If you can build in a docker container only using rosdep for dependency management, then its OK (e.g. no source builds; no manual calls to install; etc).

@artofnothingness
Copy link
Owner Author

Make sure its not just using it, but also that the dependencies are properly managed so that it can be installed with rosdep or else we cannot release this to apt. If you can build in a docker container only using rosdep for dependency management, then its OK (e.g. no source builds; no manual calls to install; etc).

https://index.ros.org/r/ament_cmake_catch2/
I had some problems with rosdep with this package, but manually sudo apt install-ros-rolling-ament_cmake_catch2 did the job

Catch2 itself not in 20.04 as well as xtensor, but it packaged in newer releases as far as i know

@SteveMacenski
Copy link
Collaborator

What kind of rosdep issues? It should just work if you add ament_cmake_catch2 to the package.xml file as a depend

@artofnothingness
Copy link
Owner Author

artofnothingness commented May 9, 2022

What kind of rosdep issues? It should just work if you add ament_cmake_catch2 to the package.xml file as a depend

rosdep can't find the package, but the package exists. I'll investigate this further

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented May 12, 2022

By the way, I'm aiming for June 30th to get MPPI in Nav2 ideally. I have quite a bit of travel after July and other work that would need to take priority (conference talks, papers, etc) until early November. I'm just working on some scheduling today so that came as a conclusion just now

@artofnothingness
Copy link
Owner Author

Seems like major parts that left - performance optimization of reference critic and obstacle critic if possible, ackermann constraints, omni params.

@SteveMacenski
Copy link
Collaborator

In the last PR we moved to 2 iterations from 1, did the optimizations for the path make that possible or just something we forgot from the parallelization work?

I think other than that, the last bits are (1) smoother controls and (2) hardware testing / evaluation (maybe tuning defaults)

@artofnothingness
Copy link
Owner Author

for my machine it sometimes exceed 2 iterations so i would prefer to stay on 1 iteration till some parallelization work done

@artofnothingness
Copy link
Owner Author

artofnothingness commented Jun 10, 2022

Also more then one iteration make it possible to choose velocity multiple to max_v max_w. We limit velocities by sampling, so sampling twice make it possible to exceed those limits. I think if we need better trajectory at some point we could increase batch size and not iteration count

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Jun 10, 2022

OK makes sense to go back to 1 iteration for now. I think though it is crucial for this approach to use multiple iterations so we can converge to a refined solution. It very well might be that because we don't that is why the controls aren't smooth or could be wobbly when called at its 20hz (or whatever rate). If we evaluate only a single batch of any arbitrary size small or large, we basically are just finding the best among a random sampling. Repeatedly iterating over the previous best one adds the smoothing characteristics so that we converge to the best solution rather than just being in generally the right neighborhood of solutions.

IMO from reading MPC and MPPI literature in the last couple of weeks, multiple iterations are pretty important to handle the random sampling and non-linearity of the system. To go from the 'general solution area' to 'the solution' needs iterations of sampling in that localized space to improve the result. The output would then probably be smoother and definitely more refined.

Arguably for iterations after the first, it might even be wise to reduce the noise sampling std so that we focus in more and more on the area of interest. The first iteration gets us the neighborhood, future iterations really hone in. I don't think I've read about that being done anywhere in MPPI papers, but seems like an obvious thing to try. Kind of analog to reducing the learning rate.

we limit velocities by sampling, so sampling twice make it possible to exceed those limits

generateNoisedTrajectories() is iterated over iteration_count times, which itself contains the control generation and constraint application. So I think for each iteration, everything should be following the limits. It looks to me in this formulation looping N times for iterations does the same thing as calling the function N times as its being executed at the run rate.

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Jun 17, 2022

I ran the controller with valgrind to find the hotspots and came with the following result
callgrind.out.zip

If you open it with kcachegrind (e.g. kcachegrind callgrind.out.4511) and go to "call graph" you can see where the compute time is spent visually. Interestingly, the obstacle critic is where alot of time is spent, doing footprint collision checks, about 39% of the total compute time, that is when consider_footprint is true.

When consider_footprint is false, it drops to only 3%. I think we should make this default false. It also might be worth considering this for multi-threading optimizations given how significant the compute overhead is.

Generating the noised controls and integrating the state velocities also took a higher chunk of time than I suspected (6% and 11%, respectively). When the obstacle critic was set to use the circular assumption, the entire critic trajectory scoring took about 31% of the time and generateNoisedTrajectories() (and its inner functions) took 41% of the time, which did very much surprise me that the trajectory creation could possibly exceed critic evaluation. See below callgrind file for that. Hopefully there's some improvement we can do there?

callgrind.out2.zip

When the obstacle critic is low from using the point-cost, now the path alignment critic is the single largest hog of CPU time at 21%.

With circular footprint checking, MPPI can run at 20hz with 5 iterations, smaller model_dt, and a larger number of timesteps (to account for smaller dt's). We can tune with this to get the best behavior, but it would be good to find a way to reduce the cost of those collision checks. One example is https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/collision_checker.cpp#L103-L108 where I check if the point cost at the center of the robot is greater than possibly inscribed and only do the full footprint checking if there's a possibility that some part of it is in collision. Else, just use the center cost since we can calculate the cost the center would be if any single part of the footprint was in collision. That should help greatly

Unsurprisingly, when I ran with valgrind it was very slow, but that was illustrative actually. The MPPI controller was very wobbly / drunken which reminds me that maybe part of the wobbly issue we have is that perhaps we're running too slow? (another hypothesis to test along with critic smoother + iteration count increase)

If you're curious how I did this with nav2, I wrote a tutorial about it: ros-navigation/docs.nav2.org#328 since I know someone will ask me about this someday

tl;dr

  • We need to optimize footprint collision checking. In parallelization if possible, but checking center costs to see if its necessary to check the full footprint would help alot when it is enabled.
  • Change default to false for consider_footprint
  • Find a way, if possible, to better optimize the generateNoisedTrajectories process (e.g. integrateStateVelocities at 21%, generateNoisedControls at 13%)
  • Find a way, if possible to better optimize path align critic (21%)
  • If we set consider_footprint false, we may be able to increase iterations or control rate or model_dt/future time steps. @padhupradheep Its worth testing each independently on hardware (modify 1 at a time) to see what helps with stability / jerky behavior (if any). If none, then its a good reason to consider a smoothing critic or see if there's something going wrong here we didn't pick up on before.
  • Feel free to analyze those valgrind outputs to see if you pick up on any other interesting / concerning trends

@SteveMacenski
Copy link
Collaborator

SteveMacenski commented Jun 18, 2022

Separately without -g compiler flags, I did a timer on the controller server side to measure the total runtime on MPPI, I agree I'm seeing between 5-9ms on a typical run. It jumps to 15-20ms when a new path comes in, which still seems totally fine.

I see if I push it with dt of 0.05 and 30 time steps (so still 1.5s ahead) with the same 300 batch size I could get 3 iterations within 20hz. If we dropped the batch size obviously we can play with the iteration count. So I think its worth doing a bit of optimization on the biggest offenders, but we're on the 'right side' of the minimum required performance. If we can improve the path align, full footprint critic, generating noised controls, and integrating state velocities, then I think that's the most optimization that we'd ever had to do.

The footprint critic I've outlined a clear way to help that (will submit PR next week when I get a chance to implement it), with that much optimization I don't think any more will be required (but CPU will get higher when in closer proximity to obstacles since it'll need to do full footprint checks for more trajectories). The other three there aren't as clear algorithmic shortcuts we can make, but hoping you might have some ideas 😄

@artofnothingness
Copy link
Owner Author

artofnothingness commented Jun 18, 2022

You did a great job, Thanks!

What environment do you use for testing ?
Is it humble - jammy ?
I have errors building nav2 main branch on foxy. And on jammy there is some problems with fastdds that you mentioned

@SteveMacenski
Copy link
Collaborator

I tested on Rolling, the issues from Jammy exist there too, but I added the static_layer to the local costmap so I could collect some data since the actual data processing isn't super important if we localize well to start off so we don't straight up run into anything.

@SteveMacenski SteveMacenski mentioned this issue Jul 5, 2022
@SteveMacenski
Copy link
Collaborator

I think the existing open tickets cover the remaining items for a Nav2 release sufficiently now. Closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants