Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constrained smoother failing with duplicate path pose #4755

Closed
LinusTxtonomy opened this issue Nov 19, 2024 · 5 comments
Closed

Constrained smoother failing with duplicate path pose #4755

LinusTxtonomy opened this issue Nov 19, 2024 · 5 comments

Comments

@LinusTxtonomy
Copy link

LinusTxtonomy commented Nov 19, 2024

Bug report

Required Info:

  • Operating System:
    • Debian 12
  • ROS2 Version:
    • humble
  • Version or commit hash:

Steps to reproduce issue

Any goal with a heading that triggers this issue:
#4596
and then using the constrained smoother on the path

Expected behavior

A normal smoothed path

Actual behavior

[smoother_server-8] [INFO] [1731936082.959181435] [smoother_server]: Received a path to smooth.
[smoother_server-8] WARNING: Logging before InitGoogleLogging() is written to STDERR
[smoother_server-8] W20241118 14:21:22.984320 497746 residual_block.cc:129]
[smoother_server-8]
[smoother_server-8] Error in evaluating the ResidualBlock.
[smoother_server-8]
[smoother_server-8] There are two possible reasons. Either the CostFunction did not evaluate and fill all
[smoother_server-8] residual and jacobians that were requested or there was a non-finite value (nan/infinite)
[smoother_server-8] generated during the or jacobian computation.
[smoother_server-8]
[smoother_server-8] Residual Block size: 3 parameter blocks x 4 residuals
[smoother_server-8]
[smoother_server-8] For each parameter block, the value of the parameters are printed in the first column
[smoother_server-8] and the value of the jacobian under the corresponding residual. If a ParameterBlock was
[smoother_server-8] held constant then the corresponding jacobian is printed as 'Not Computed'. If an entry
[smoother_server-8] of the Jacobian/residual array was requested but was not written to by user code, it is
[smoother_server-8] indicated by 'Uninitialized'. This is an error. Residuals or Jacobian values evaluating
[smoother_server-8] to Inf or NaN is also an error.
[smoother_server-8]
[smoother_server-8] Residuals:             -nan            0            0      667.778
[smoother_server-8]
[smoother_server-8] Parameter Block 0, size: 2
[smoother_server-8]
[smoother_server-8]    0.0925396 | Not Computed  Not Computed  Not Computed  Not Computed
[smoother_server-8]     -208.843 | Not Computed  Not Computed  Not Computed  Not Computed
[smoother_server-8]
[smoother_server-8] Parameter Block 1, size: 2
[smoother_server-8]
[smoother_server-8]    0.0925396 | Not Computed  Not Computed  Not Computed  Not Computed
[smoother_server-8]     -208.843 | Not Computed  Not Computed  Not Computed  Not Computed
[smoother_server-8]
[smoother_server-8] Parameter Block 2, size: 2
[smoother_server-8]
[smoother_server-8]     -1.30546 |         -nan            0            0            0
[smoother_server-8]     -207.879 |         -nan            0            0            0
[smoother_server-8]
[smoother_server-8]
[smoother_server-8] E20241118 14:21:22.984406 497746 trust_region_minimizer.cc:71] Terminating: Residual and Jacobian evaluation failed.
[smoother_server-8] [WARN] [1731936082.984575316] [smoother_server]: constrained_smoother: failed to smooth plan, Ceres could not find a usable solution to optimize.
[bt_navigator-14] [ERROR] [1731936110.176972226] [bt_navigator_navigate_through_poses_rclcpp_node]: Failed to get result for smooth_path in node halt!

Additional information


I already debugged this and the issue is that the smoother crashes on duplicate poses because the distance between two poses is used in the smoothing cost as a division. Here the current segment length is used in a division while it can be 0 for duplicate poses. This happens in all branches, not just humble as far as I am aware. The duplicate poses should not be there to begin with so I looked at the smac_hybrid code and found the indexing to be the issue. Here the analytic expansion gets the index for the node that the analytic expansion goes through. This index in some cases fails to retrieve the goal pose because of the angle inaccuracy. When that happens the index next to it is wrongly set and this causes the last node of the expansion to be set as the parent of the goal pose even though their poses are identical. This seems to be fixed when backporting #4636 to the humble branch but I am a bit sceptical if this is a full fix or just covers it better than before.

Feature request

Feature description

This should be fixed in both the smoother and the smac planner. The smoother should check for duplicate poses and throw an error as the error message for this case is currently very cryptic compared to how simple the error is. On the other hand the problem of having the last pose duplicated by the smac planner should be fixed and probably will be when this change is backported to the humble branch.

Implementation considerations

The smoother just needs a check for the distance between poses and throw a meaningful error message. The smac planner needs the linked PR backported to the humble branch. I am not sure if this fully fixes the issue though as I am not 100% sure what is exactly happening and if this might have just been an accidental fix that doesnt fully cover the problem. I think another cause for this might be that there are a lot of implicit conversions. Most worrying for me are the ones between unsigned int and float/double. Fixing those is a fairly big task as compiling with -Wconversion -Werror throws a lot of errors but this might be part of what causes this error.

@SteveMacenski
Copy link
Member

This seems to be fixed when backporting #4636 to the humble branch but I am a bit sceptical if this is a full fix or just covers it better than before.

Can you manually backport that to humble and submit a PR? Regardless, I'm happy to merge that in as a fix which moves the ball forward.

The way I read this is that this is seemingly a full fix. The end-point of the path was incorrectly indexed due to the binning that PR resolves -- so we add in the extra point to have the exact goal criteria met. If the goal binned into the search space is correct, then we wouldn't have the extra point. If you see it fixing this problem and the problem is related to the end point only, I'm convinced this is a full solution.

The smoother just needs a check for the distance between poses and throw a meaningful error message

We do catch points that are overlapping using the is_cusp in the smoother in the general path, but we have a cut out for the very last point which has it skipped

This is more evidence to me that the backport fix is a full fix :-)

Most worrying for me are the ones between unsigned int and float/double

Where do you see implicit unsigned int and float/double conversions? We're reasonably careful about those globally in Smac.

@LinusTxtonomy
Copy link
Author

Where do you see implicit unsigned int and float/double conversions? We're reasonably careful about those globally in Smac.

There are a few that throw errors but the one that is a bit worrying to me is this for example:

from[2] = node->motion_table.getAngleFromBin(node->pose.theta);

node->pose->theta is float but getAngleFromBin takes unsigned int.
Also here it is a float but used as an index:
delta_xs[i][node_heading] + node->pose.x,

I can create a PR for this. If I backport more features to the humble branch is there a template for creating backport PRs?

@SteveMacenski
Copy link
Member

There's no specific template, just open them with the eq. main PR and we can merge the pair. I think there's enough deviation between humble and main that even small fixes like that would cause issues with our auto-backporting tools, so I'm perfectly OK with the manual backport. I will be able to auto-backport on jazzy easily however :-)

@LinusTxtonomy
Copy link
Author

Opened backport pull request #4760

@SteveMacenski
Copy link
Member

Merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants