__ccdGJK in narrowphase has bugs #366

hongkai-dai · 2019-01-30T23:57:32Z

Problem

When computing the signed distance in the narrow phase convexity based algorithm, FCL first calls __ccdGJK function to determine if the object A and B are separated. This code has several bugs

There are 3 possible outcomes from this function.
- A separating plane between A and B are found. Hence A and B are not intersecting.
- The origin is in the Minkowski difference A ⊖ B. Hence A and B are intersecting.
- We reached the iteration limit without determining whether A and B are intersecting or not.
  On the other hand, there are only two return values of this function.
- When A and B intersects, return 0
- Otherwise, return -1.
  Hence the FCL code doesn't differentiate between separating and iterations limit. We should improve the return values to make that difference.

__ccdGJK lies when it returns "no intersection" in the following lines

fcl/include/fcl/narrowphase/detail/convexity_based_algorithm/gjk_libccd-inl.h

Lines 1396 to 1398 in 9082fd2

    
           if (ccdIsZero(ccdVec3Len2(&dir))){ 
        
             return -1; // intersection not found 
        
           }

Here when |dir| = 0, since dir = -v (where v is the point that is closest to the origin in the simplex), this means that v = 0, and hence the origin is within the simplex, hence the origin is within A ⊖ B. This means A and B are intersecting. FCL returns the wrong code.

When calling doSimplex2 to compute the closest point within the new simplex (line segment AB) to the origin O, FCL also checks if the origin is on the line segment AB. The check it does is

fcl/include/fcl/narrowphase/detail/convexity_based_algorithm/gjk_libccd-inl.h

Lines 245 to 252 in 9082fd2

    
           // dot product AB . AO 
        
           dot = ccdVec3Dot(&AB, &AO); 
        
           // check if origin doesn't lie on AB segment 
        
           ccdVec3Cross(&tmp, &AB, &AO); 
        
           if (ccdIsZero(ccdVec3Len2(&tmp)) && dot > CCD_ZERO){ 
        
             return 1; 
        
           }

Namely if AB x AO = 0 and AB.dot(AO) > 0, then the code thinks that O is on the line segment AB. I think this check is wrong. It could be that O is on the ray shooting from A to B, but is beyond B. Hence this O satisfies the check in FCL, but it is not on the line segment AB.

There are many places that FCL calls ccdIsZero(dist), where dist is actually the squared distance. What ccdIsZero(dist) checks is abs(dist) < eps. But this means that the distance < 1E-8 will be regarded as 0. This tolerance is too loose. We should compare the squared distance with the squared machine epsilon.

Proposed fix

Rewrite __ccdGJK to fix problem 1 and 2. I have a dev branch trying to tackle the problem. I renamed the function __ccdGJK to separatingAxisGJK. Also I created an enum class as the return value
```
enum class SeparatingAxisGJKStatus {
  kSeparated,
  kIntersecting,
  kIterationsLimit,
};
```
Shall we agree that the return value has the following meaning
1. Separated: A ∩ B = ∅. Namely the signed distance between A and B is strictly positive.
2. Intersecting: A ∩ B ≠ ∅. Namely the intersection can be either on the boundary (touching contact), or in the strict interior (penetrating).
Fix problem 3. Fix the bug in doSimplex2, with proper unit tests (@SeanCurtis-TRI) (Reworking gjk_libccd doSimplex2() #367)
- This will be used to do a few things:
  - Locally rephrase the tests in Eigen
  - Add unit tests
  - Fix the bug (obviously)
  - NOTE: it wasn't a "bug" per se. The "fix" was a) document the problem clearly so that the apparent "bug" was resolved and b) resolve some numerical issues.
I haven't looked into doSimplex3 and doSimplex4 yet. I suspect there can be bugs. At least we should add unit tests for these functions.

cc @SeanCurtis-TRI @sherm1 @DamrongGuoy

The text was updated successfully, but these errors were encountered:

hongkai-dai · 2019-01-31T00:14:30Z

For the roadmap, I would propose that we verify the correctness of doSimplex2, doSimplex3 and doSimplex4 first, before we merge any change to __ccdGJK. Since __ccdGJK calls doSimplex code, I would prefer to fix the code from bottom up. We should add proper unit tests on doSimplex.

For better readability, can we change the return code of doSimplex2/3/4 to an enum class instead of integers? I would propose

enum class DoSimplexStatus {
  kIntersecting,  // Origin is in the simplex, hence A and B intersects
  kSeparated,    // When building the simplex, we find a plane that separates the origin from the Minkowski difference A ⊖ B. Hence A and B are separated.
  kUnknown,
};

sherm1 · 2019-01-31T01:05:09Z

Thanks, @hongkai-dai ! Assigned to @DamrongGuoy and @SeanCurtis-TRI while we figure out who does what.

sherm1 · 2019-01-31T01:05:56Z

cc @avalenzu

avalenzu · 2019-01-31T13:22:15Z

Shall we agree that the return value has the following meaning

Separated: A ∩ B = ∅. Namely the signed distance between A and B is strictly positive.

Intersecting: A ∩ B ≠ ∅. Namely the intersection can be either on the boundary (touching contact), or in the strict interior (penetrating).

Just to confirm, EPA will work if called for touching contact? Or will we need to have an additional test before calling EPA?

SeanCurtis-TRI · 2019-01-31T14:32:06Z

We have much higher confidence in the current EPA algorithm. Hongkai has done sterling work there (although there are still a couple of TODOs. BUt, at the very least, it's well tested.

SeanCurtis-TRI · 2019-01-31T14:33:41Z

On another note:

This may be the opportunity to disentangle ccd from this GJK implementation. The C-style interface for all of the ccd types obfuscates the math and code. We might want to investigate the viability of removing those structs from the core math. The code becomes easier to read which increases the odds of being correct.

hongkai-dai · 2019-01-31T16:26:43Z

This may be the opportunity to disentangle ccd from this GJK implementation

Good call. That could be a very big project. Maybe we should open a separate issue? It would be good if we can create an API design doc before we start to code, so that we agree what the new types will look like.

Currently we use a lot of ccd simplex (ccd_simplex_t) and ccd polytope (ccd_pt_t) struct. They use the old C style code, and they don't store some important information (for example, the normal vector pointing outward from each face in the polytope).

DamrongGuoy · 2019-01-31T19:24:00Z

For the roadmap, I would propose that we verify the correctness of doSimplex2, doSimplex3 and doSimplex4 first, before we merge any change to __ccdGJK. Since __ccdGJK calls doSimplex code, I would prefer to fix the code from bottom up. We should add proper unit tests on doSimplex.

For better readability, can we change the return code of doSimplex2/3/4 to an enum class instead of integers? I would propose
enum class DoSimplexStatus {
  kIntersecting,  // Origin is in the simplex, hence A and B intersects
  kSeparated,    // When building the simplex, we find a plane that separates the origin from the Minkowski difference A ⊖ B. Hence A and B are separated.
  kUnknown,
};

In principle, doSimplex*() is a point-location problem. It could be useful in other applications.

Please let me give an example in tetrahedron mesh refinement. Given a new point to insert into a tetrahedron mesh, we need to check whether the new point is strictly inside an edge, or strictly inside a triangle, or strictly inside a tetrahedron. This way we avoid creating new zero-volume tetrahedrons when we connect the new point to surrounding elements. (The wrong classification will make us connect the new point to form zero-area triangles or zero-volume tetrahedrons with other points).
In such an application, the general return enum class would be:

enum class SimplexLocation {
    kInside,       // The point is strictly inside the simplex.
    kBoundary,  // The point is on the boundary of the simplex.
    kOutside      // The point is strictly outside the simplex.
}

The code will first check the point p against a volumetric tetrahedron V. If p is on the boundary of V, then the code checks p against triangular faces F of V. If p is on the boundary of F, the code checks p against the edges E of F. (If p is on the boundary of E, then p co-locates with an existing point, and we abort the point insertion.)

I'm not suggesting that we use SimplexLocation as an alternative to DoSimplexStatus. I'm thinking more about the lower-level function can return SimplexLocation and the higher-level function interprets it to DoSimplexStatus.

My idea is doSimplex* might be useful outside GJK. On the other hand, I see doSimplex* are static functions, so they might not be public soon.

DamrongGuoy · 2019-01-31T19:32:23Z

I meant C static function inside gjk_libccd-inl.h.

namespace fcl {
namespace detail {
namespace libccd_extension {
static int doSimplex*(...)
}}}

SeanCurtis-TRI · 2019-02-04T18:30:10Z

I finally feel prepared to respond to @DamrongGuoy's interesting point.

Initially, I agreed with your assessment: that these are general functions that could/should be made more generally available. However, after working with doSimplex2() and glimpsing at doSimplex3(), I no longer subscribe to that view.

These functions live in the context of the full GJK ecosystem. Each simplex has particular properties that can be exploited to simplify the work being done. (See PR #367 for details.) It would be irresponsible to not exploit these properties in the GJK context and these properties disqualify the functions from being general point-location problems.

sherm1 assigned SeanCurtis-TRI and DamrongGuoy Jan 31, 2019

sherm1 added the Priority: high label Jan 31, 2019

SeanCurtis-TRI mentioned this issue Feb 4, 2019

Reworking gjk_libccd doSimplex2() #367

Merged

hongkai-dai mentioned this issue Apr 9, 2019

When libccd reports the nearest feature is an edge, validate the result #388

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

__ccdGJK in narrowphase has bugs #366

__ccdGJK in narrowphase has bugs #366

hongkai-dai commented Jan 30, 2019 •

edited by SeanCurtis-TRI

Loading

hongkai-dai commented Jan 31, 2019

sherm1 commented Jan 31, 2019

sherm1 commented Jan 31, 2019

avalenzu commented Jan 31, 2019

SeanCurtis-TRI commented Jan 31, 2019

SeanCurtis-TRI commented Jan 31, 2019

hongkai-dai commented Jan 31, 2019 •

edited

Loading

DamrongGuoy commented Jan 31, 2019

DamrongGuoy commented Jan 31, 2019

SeanCurtis-TRI commented Feb 4, 2019

__ccdGJK in narrowphase has bugs #366

__ccdGJK in narrowphase has bugs #366

Comments

hongkai-dai commented Jan 30, 2019 • edited by SeanCurtis-TRI Loading

Problem

Proposed fix

hongkai-dai commented Jan 31, 2019

sherm1 commented Jan 31, 2019

sherm1 commented Jan 31, 2019

avalenzu commented Jan 31, 2019

SeanCurtis-TRI commented Jan 31, 2019

SeanCurtis-TRI commented Jan 31, 2019

hongkai-dai commented Jan 31, 2019 • edited Loading

DamrongGuoy commented Jan 31, 2019

DamrongGuoy commented Jan 31, 2019

SeanCurtis-TRI commented Feb 4, 2019

hongkai-dai commented Jan 30, 2019 •

edited by SeanCurtis-TRI

Loading

hongkai-dai commented Jan 31, 2019 •

edited

Loading