Tests are brittle. Use the concept of in_delta #565

martinfrances107 · 2020-12-08T19:10:33Z

When I say brittle ... I mean for example

look at

    fn test_rotate_line() {
        let line0 = Line::from([(0., 0.), (0., 2.)]);
        let line1 = Line::from([(1., 0.9999999999999999), (-1., 1.)]);
        assert_eq!(line0.rotate(90.), line1);
    }

the 0.9999999999999999 is a hard coded rounding artifact -- the expected value is 1.0.

numerically two numbers x0 and x1 are considered equal if

(x0 - x1).abs() < T::EPSILON

I am coming from a related d3 "geo-spatial" library and they have a in_delta concept.
d3-geo the in_delta() function

In short a test helper function which compares objects on a point by point basis and asserts that two objects are identical if every thing is "in delta"

Anyway I am going to work on this.

The text was updated successfully, but these errors were encountered:

michaelkirk · 2020-12-08T20:23:59Z

Most places we try to use the approx crate when comparing floats - I'd recommend basing any approach on that for uniformity.

(as a side note, I think most of the time we actually want a "relatively equals" check vs a "within an epsilon" check, but that's somewhat pedantic.)

I haven't looked to see if it's possible - but I personally think it'd be ideal to be able to say something like:

    fn test_rotate_line() {
        let line0 = Line::from([(0., 0.), (0., 2.)]);
        let line1 = Line::from([(1., 1.0), (-1., 1.)]);
        approx::assert_relative_eq!(line0.rotate(90.), line1);
    }

Maybe we can implement the approx traits for geo? e.g. https://docs.rs/approx/0.4.0/approx/trait.RelativeEq.html

edited: clarity

martinfrances107 · 2020-12-09T21:14:01Z

Thank you for shaping my thinking, the comments above were really useful -- and save a lots of time for me.

I have the start of solution pushed into a branch in my area

abs_diff_eq

-- so not ready for a PR yet.

But I am far enough alone to know that I need to discuss a major breaking change.

First a minor comment

'relative equal' is for dissimilar type like i32 and f64.

What we need is AbsDiffEq - the approximate equality check for identical types.

For Point the solution looks like

p1.abs_diff_eq(&p2)

here is a fragment of the doc test in my branch

    /// use geo_types::Point;
    /// use approx::AbsDiffEq;
    /// let p = Point::new(1.0, 1.0);
    /// let p_x = Point::new(0.999999, 1.0);
    /// let p_y = Point::new(1.0, 1.00001);
    /// assert!(p.abs_diff_eq(&p_x, 1e-2));
    /// assert!(!p.abs_diff_eq(&p_x, 1e-12));

I need to signal a major breaking change for all the other geometry types Line, LineString, Polygon etc

There is a security issue we need to be aware of .. Any implementation of a trait need to be
A) in the module defining the trait ( in this case the abs_diff_eq trait is in approx )
or
B) in the module holding the struct ... ( in this case Point, Line, LineString are in geo-types. )

for security in cannot be defined in a third module C)

as you can see from my version of Line - below what this mean is that I need a breaking change that is to move the coords_iter() trait from the geo module into the geo-types module - to meet that security concern and so I do a zip which will be used to compare an object point by poiint

impl<T> AbsDiffEq for Line<T>
where
    T: CoordinateType + Float,
{
    type Epsilon = T;

    #[inline]
    fn default_epsilon() -> Self::Epsilon {
        T::from(1e-6).unwrap()
    }

    #[inline]
    fn abs_diff_eq(&self, other: &Line<T>, epsilon: Self::Epsilon) -> bool {
        let z = (*self).coords_iter().zip(*other.coords_iter());
        let mut is_match = true;
        for a_b in z {
            let (a, b) = a_b;
            if (a.x - b.x).abs() >= epsilon || (a.y - b.y).abs() >= epsilon {
                println!("Fail {:?} {:?}", a, b);
                is_match = false;
                break;
            }
        }
        is_match
    }
}

michaelkirk · 2020-12-09T21:52:11Z

'relative equal' is for dissimilar type like i32 and f64.

My understanding is that "relative equality" aims to solve the repeated problem of "Which epsilon should I choose?".

diff < T::EPSILON makes sense for some numbers, but for very large and for very small numbers it's not a reasonable choice.

I really enjoyed this https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ which I think uses the same terminology.

Here's the relevant excerpt:

Epsilon comparisons

If comparing floats for equality is a bad idea then how about checking whether their difference is within some error bounds or epsilon value, like this:

bool isEqual = fabs(f1 – f2) <= epsilon;

With this calculation we can express the concept of two floats being close enough that we want to consider them to be equal. But what value should we use for epsilon?

Given our experimentation above we might be tempted to use the error in our sum, which was about 1.19e-7f. In fact, there’s even a define in float.h with that exact value, and it’s called FLT_EPSILON.

Clearly that’s it. The header file gods have spoken and FLT_EPSILON is the one true epsilon!

Except that that is rubbish. For numbers between 1.0 and 2.0 FLT_EPSILON represents the difference between adjacent floats. For numbers smaller than 1.0 an epsilon of FLT_EPSILON quickly becomes too large, and with small enough numbers FLT_EPSILON may be bigger than the numbers you are comparing (a variant of this led to a flaky Chromium test)!

For numbers larger than 2.0 the gap between floats grows larger and if you compare floats using FLT_EPSILON then you are just doing a more-expensive and less-obvious equality check. That is, if two floats greater than 2.0 are not the same then their difference is guaranteed to be greater than FLT_EPSILON. For numbers above 16777216 the appropriate epsilon to use for floats is actually greater than one, and a comparison using FLT_EPSILON just makes you look foolish. We don’t want that.

...

Relative epsilon comparisons

The idea of a relative epsilon comparison is to find the difference between the two numbers, and see how big it is compared to their magnitudes. In order to get consistent results you should always compare the difference to the larger of the two numbers. In English:

To compare f1 and f2 calculate diff = fabs(f1-f2). If diff is smaller than n% of max(abs(f1),abs(f2)) then f1 and f2 can be considered equal.

so tldr; the epsilon is chosen relative to the size of the values you're comparing.

I need to signal a major breaking change for all the other geometry types Line, LineString, Polygon etc
... to move the coords_iter() trait from the geo module into the geo-types module

Hmm... that might be possible. But how heinous would it be to instead implement the equality traits without using coords-iter?

Also, I wonder if it's worth putting the approx-equality impls behind a feature flag since we try to keep geo-types super svelte by default.

martinfrances107 · 2020-12-10T09:33:57Z

Hmm... that might be possible. But how heinous would it be to instead implement the equality traits without using coords-iter?

I have found a less "bull in a china shop approach" .. which will eventually look like succinct library grade code..

This is the point where I shut up and implement it.

martinfrances107 · 2020-12-11T16:17:04Z

Thanks for correcting me and providing a common set of definitions.. That really help michaelkirk++

I am about to upload and submit a PR.

This is more inline with library grade code. In short I started over.

To balance effort with usefulness I have only implemented 'relative_eq' for a few of the 10 geometry types.
Point, MultiPoint, Line, LineString. As more are need I am happy to add more.

In terms of review .. I mostly copied the algorithm for Points::relative_eq() from approx's default reference implementation. I used the euclidean distance .. but I considered using the simpler Manhattan distance.

An overview of the PR

The reference implementation of relative_eq() makes use of the method abs_diff_eq() that is why it is has to be included.
It handles points with coordinates of f64:INFINITY.
When comparing data structure on a point-by-point basis tests are included to ensure the rejection data structures with dissimilar lengths. Given the way zipping works .. they would otherwise be accepted!

martinfrances107 · 2020-12-11T16:19:05Z

Ok here is the PR

#567

michaelkirk · 2021-01-08T16:13:24Z

Done in #567, using approx and relative_eq

martinfrances107 changed the title ~~Tests are brittle. use the concept of in_delta~~ Tests are brittle. Use the concept of in_delta Dec 8, 2020

martinfrances107 mentioned this issue Dec 8, 2020

#412 MultiPolygon::rotate should use centroid as pivot #562

Closed

michaelkirk closed this as completed Jan 8, 2021

This was referenced Oct 13, 2021

Clippy: Remove .clone() where possible. #665

Merged

Incorrectly comparing floats causes random test failings #666

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tests are brittle. Use the concept of in_delta #565

Tests are brittle. Use the concept of in_delta #565

martinfrances107 commented Dec 8, 2020 •

edited

Loading

michaelkirk commented Dec 8, 2020 •

edited

Loading

martinfrances107 commented Dec 9, 2020

michaelkirk commented Dec 9, 2020 •

edited

Loading

Epsilon comparisons

Relative epsilon comparisons

martinfrances107 commented Dec 10, 2020

martinfrances107 commented Dec 11, 2020

martinfrances107 commented Dec 11, 2020

michaelkirk commented Jan 8, 2021

Tests are brittle. Use the concept of in_delta #565

Tests are brittle. Use the concept of in_delta #565

Comments

martinfrances107 commented Dec 8, 2020 • edited Loading

michaelkirk commented Dec 8, 2020 • edited Loading

martinfrances107 commented Dec 9, 2020

michaelkirk commented Dec 9, 2020 • edited Loading

Epsilon comparisons

Relative epsilon comparisons

martinfrances107 commented Dec 10, 2020

martinfrances107 commented Dec 11, 2020

martinfrances107 commented Dec 11, 2020

michaelkirk commented Jan 8, 2021

martinfrances107 commented Dec 8, 2020 •

edited

Loading

michaelkirk commented Dec 8, 2020 •

edited

Loading

michaelkirk commented Dec 9, 2020 •

edited

Loading