
fp.rem is producing incorrect result #2381

Closed
LeventErkok opened this issue Jul 4, 2019 · 27 comments · Fixed by #5550
@LeventErkok

@wintersteiger

This is most likely fallout from #2369

For this benchmark:

(set-option :produce-models true)
(set-logic QF_FP)
(define-fun s2 () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 244681.0 4194304.0)))
(define-fun s4 () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 1725289.0 16777216.0)))
(declare-fun s0 () (_ FloatingPoint  8 24))
(declare-fun s1 () (_ FloatingPoint  8 24))
(define-fun s3 () Bool (fp.eq s0 s2))
(define-fun s5 () (_ FloatingPoint  8 24) (fp.rem s0 s4))
(define-fun s6 () Bool (fp.eq s1 s5))
(define-fun s7 () Bool (and s3 s6))
(assert s7)
(check-sat)
(get-value (s1))

Z3 (compiled about 2 weeks ago) used to produce:

$ z3_Jun14_2019 bug.smt2
sat
((s1 (fp #b1 #x7a #b01101100100010001010000)))

Now it says:

$ z3 bug.smt2
sat
((s1 (fp #b0 #x7a #b11011101111001001000000)))

The benchmark is asking for the result of:

    fp.rem 5.8336496e-2 0.10283524

where the arguments are single-precision floats. Z3 was previously computing the correct answer:

      -4.449874e-2

But now it's simply returning the first argument.
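For reference, the expected value can be cross-checked with Python's `math.remainder`, which implements the same IEEE-754 remainder operation. It works in double precision, but both benchmark constants have power-of-two denominators and are exactly representable in single and double precision, so the check is meaningful:

```python
import math

# The two single-precision constants from the benchmark; both fractions
# have power-of-two denominators, so they are exact binary floats.
x = 244681.0 / 4194304.0    # s2 ≈ 5.8336496e-2
y = 1725289.0 / 16777216.0  # s4 ≈ 0.10283524

# IEEE-754 remainder: x - n*y, where n = x/y rounded to nearest integer
# (ties to even). Here x/y ≈ 0.567, so n = 1 and the result is x - y.
r = math.remainder(x, y)
print(r)  # ≈ -4.449874e-2, matching the answer Z3 used to produce
```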

@wintersteiger wintersteiger self-assigned this Jul 4, 2019
@LeventErkok

@wintersteiger

Hi Christoph,

Hate to bug you about this; any chance this might be addressed soon?

@wintersteiger

Unlikely; I'm super busy with unrelated projects and I'm not 100% sure this is a bug yet. Is there an actual application that depends on this or are there just randomly generated testcases you'd like to fix?

@LeventErkok

LeventErkok commented Aug 12, 2019

Hi Christoph. It's part of an extended test-suite; it's hard to tie it to any particular application, but the goal is to make sure basic floating-point operations work correctly.

Regarding whether it's a real bug: I'm pretty sure it is. In fact, z3 itself acknowledges that the generated model is invalid:

$ z3 model_validate=true a.smt2
sat
(error "line 12 column 10: an invalid model was generated")
((s1 (fp #b0 #x7a #b11011101111001001000000)))

Perhaps the issue is somewhere other than fp.rem. I was trying to simplify it. Consider this version:

(set-option :produce-models true)
(set-logic QF_FP)
(define-fun x () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 244681.0 4194304.0)))
(define-fun y () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 1725289.0 16777216.0)))
(declare-fun k () (_ FloatingPoint  8 24))
(declare-fun z () (_ FloatingPoint  8 24))
(assert (fp.eq k x))
(assert (fp.eq z (fp.rem k y)))
(check-sat)

This one also produces an invalid model. But if you directly substitute x for k in the last assert:

(set-option :produce-models true)
(set-logic QF_FP)
(define-fun x () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 244681.0 4194304.0)))
(define-fun y () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 1725289.0 16777216.0)))
(declare-fun k () (_ FloatingPoint  8 24))
(declare-fun z () (_ FloatingPoint  8 24))
(assert (fp.eq k x))
; SUBSTITUTE x for k directly:
(assert (fp.eq z (fp.rem x y)))
(check-sat)

it then produces a valid model. So perhaps the issue lies somewhere other than fp.rem, and might end up causing problems elsewhere too.

@wintersteiger

The simplifier (concrete) and the bit-blaster (symbolic) have very different implementations. If it's a bug, it's definitely in fp.rem and likely in the bit-blaster. During model validation, the simplifier is implicitly compared against the model from the bit-blaster; that's how Z3 determines that the model is invalid. I have yet to find a meaningful application of this operator though... so motivation is low.

That said, I don't get any model validation errors, neither on the original benchmark nor on the other ones, with model_validate=true, so something's wrong.

@LeventErkok

I'm running with z3 compiled from GitHub sources on August 10, which does produce the model-validation errors. (I also have a higher-level model-validator which agrees with it.) Is your version of z3 perhaps not the latest from master?

@wintersteiger

It was the latest from my-fork master, but not from public-fork master.

@wintersteiger

@LeventErkok I see very strange behavior on your initial test case when I print all the variables:

sat
((s0 (fp #b0 #x7a #b11011101111001001000000))
(s1 (fp #b0 #x7a #b11011101111001001000000))
(s2 (fp #b0 #x7a #b11011101111001001000000))
(s4 (fp #b0 #x7b #b10100101001101101001000))
(s5 (fp #b1 #x7a #b01101100100010001010000))
(s6 false)
(s7 false))

It looks like s5 has the correct value, but s6 and s7 are not satisfied. Do you see this behavior too or is something wrong with my setup?

@LeventErkok

@wintersteiger Yes; I see exactly the same. In particular s7 being false is a big red flag!

@wintersteiger

Yeah, something's seriously wrong here!

@wintersteiger

Fix pushed, thanks for the report!

@LeventErkok

@wintersteiger

Hi Christoph. Thanks for looking into this! I'm afraid the fix didn't quite handle the problem. Here's another case where it is still failing:

(set-option :produce-models true)
(set-logic QF_FP)
(define-fun s2 () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 1.0 2.0)))
(define-fun s4 () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 11459215.0 16777216.0)))
(declare-fun s0 () (_ FloatingPoint  8 24))
(declare-fun s1 () (_ FloatingPoint  8 24))
(define-fun s3 () Bool (fp.eq s0 s2))
(define-fun s5 () (_ FloatingPoint  8 24) (fp.rem s0 s4))
(define-fun s6 () Bool (fp.eq s1 s5))
(define-fun s7 () Bool (and s3 s6))
(assert s7)
(check-sat)
(get-value (s7))

I get:

$ z3 model_validate=true bug.smt2
sat
(error "line 12 column 10: an invalid model was generated")
((s7 false))

The benchmark is pretty much the same as before, just with different values. The value of s7 (the final assert) is still false, so perhaps this has hit another corner case?

@wintersteiger

Doesn't look like a corner-case to me as s5 has the right value I think, but for some reason (fp.eq s1 s5) is not enforced. I don't get any assertion violations or similar in debug mode, so this might take a while to fix.

@LeventErkok

Yes; I can confirm that s5 has the correct value:

$ crackNum --sp 1 0111 1100 01110110110101000111100
                  3  2          1         0
                  1 09876543 21098765432109876543210
                  S ---E8--- ----------F23----------
          Binary: 1 01111100 01110110110101000111100
             Hex: BE3B 6A3C
       Precision: SP
            Sign: Negative
        Exponent: -3 (Stored: 124, Bias: 127)
       Hex-float: -0x1.76d478p-3
           Value: -0.18302244 (NORMAL)

This is indeed the correct result that s1 should have at the end.

@wintersteiger

No it isn't - it's masked by all the define-funs. None of these variables is actually created (only s0, s1 are). The rest are just macros that are inlined eagerly, i.e., during model evaluation, (get-value (s5)) re-evaluates the macro name s5, meaning that fp.rem is now re-evaluated using the simplifier instead of the bit-blaster, so it looks like it has the right value even though it doesn't. The actual problem that is solved is this:

(goal
  (fp.eq s0 ((_ to_fp 8 24) roundNearestTiesToEven (/ 1.0 2.0)))
  (fp.eq s1
         (fp.rem s0
                 ((_ to_fp 8 24)
                   roundNearestTiesToEven
                   (/ 11459215.0 16777216.0)))))

@LeventErkok

Ah, I see. This must make discrepancies between simplification and actual solving hard to debug. I appreciate you taking the time to look into this.

wintersteiger pushed a commit to wintersteiger/z3 that referenced this issue Aug 19, 2019
@wintersteiger

Another corner-case. If you have any more regressions, can you please send all of them to me in one go? Thanks!

@LeventErkok

LeventErkok commented Aug 19, 2019

@wintersteiger

Here's another one that's still failing in exactly the same way.

(set-option :produce-models true)
(set-logic QF_FP)
(define-fun s2 () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (/ 1.0 2.0)))
(define-fun s4 () (_ FloatingPoint  8 24) ((_ to_fp 8 24) roundNearestTiesToEven (- (/ 11459215.0 16777216.0))))
(declare-fun s0 () (_ FloatingPoint  8 24))
(declare-fun s1 () (_ FloatingPoint  8 24))
(define-fun s3 () Bool (fp.eq s1 s2))
(define-fun s5 () (_ FloatingPoint  8 24) (fp.rem s1 s4))
(define-fun s6 () Bool (fp.eq s0 s5))
(define-fun s7 () Bool (and s3 s6))
(assert s7)
(check-sat)

The difference from the previous example here is that s4 is negated.

I'll try to collect these all and send in one batch later today.
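For what it's worth, the mathematical IEEE-754 remainder is insensitive to the sign of the divisor, so the expected result here is the same as in the previous benchmark. A quick check with Python's `math.remainder` (double precision; both values are exact binary floats):

```python
import math

x = 0.5
y = 11459215.0 / 16777216.0  # exact binary float, ≈ 0.68302244

# remainder(x, -y) == remainder(x, y): negating y also negates the
# rounded quotient n, and x - (-n)*(-y) == x - n*y.
pos = math.remainder(x, y)
neg = math.remainder(x, -y)
print(pos, neg)  # both ≈ -0.18302244
```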

@LeventErkok

@wintersteiger

It also appears the handling of NaN isn't quite right with fp.rem:

(set-option :produce-models true)
(set-logic QF_FP)
(define-fun s2 () (_ FloatingPoint  8 24) (_ NaN 8 24))
(declare-fun s0 () (_ FloatingPoint  8 24))
(declare-fun s1 () (_ FloatingPoint  8 24))
(define-fun s3 () Bool (fp.eq s1 s2))
(define-fun s4 () (_ FloatingPoint  8 24) (fp.rem s1 s2))
(define-fun s5 () Bool (fp.eq s0 s4))
(define-fun s6 () Bool (and s3 s5))
(assert s6)
(check-sat)

z3 says unsat for this query. But it should be sat with NaN as the model values. Hopefully this one should be easy to fix!

@wintersteiger

wintersteiger commented Aug 19, 2019

Isn't (fp.eq s1 s2) unsat when s2 is a NaN?

@LeventErkok

Ah you're right. It always trips me up.

But the negative-value example is still problematic, I think. I'm trying to collect all test cases into one bundle, but it is proving to be surprisingly tricky to do so.
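Indeed, IEEE-754 equality is never true when either operand is NaN, which is the semantics fp.eq follows, so the NaN benchmark is unsat by construction. A quick illustration in Python, whose float comparisons and `math.remainder` follow the same IEEE-754 rules:

```python
import math

nan = float("nan")

# NaN compares unequal to everything, including itself...
print(nan == nan)  # False

# ...and the IEEE remainder propagates NaN, as fp.rem does.
print(math.isnan(math.remainder(nan, 2.0)))  # True
```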

@wintersteiger

Yeah, the negation is likely the core of the problem. fp.rem needs a "bonus" subtraction of y in some cases, and I removed the converse "bonus addition" step because I thought it was unnecessary, but maybe it isn't.
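For reference, here is a sketch of where such a correction step comes from. This is the concrete, real-valued recipe for the IEEE remainder, not Z3's bit-blasted circuit: take the truncating fmod, then apply a "bonus" subtraction when the partial remainder exceeds half of |y| (ties broken toward an even quotient):

```python
import math

def ieee_rem(x, y):
    # Sketch only: assumes finite, nonzero x and y.
    r = math.fmod(x, y)   # truncating remainder: sign of x, |r| < |y|
    ay = abs(y)
    # Implied truncated quotient; exact for the simple cases used below.
    n = (x - r) / y
    # "Bonus" correction: move r into [-|y|/2, |y|/2], which rounds the
    # quotient to nearest, ties to even.
    if 2 * abs(r) > ay or (2 * abs(r) == ay and n % 2 == 1):
        r -= math.copysign(ay, r)
    return r

# Agrees with math.remainder (IEEE-754 remainder) on these inputs,
# including the negated divisor from the benchmark above:
for x, y in [(0.5, 11459215.0 / 16777216.0),
             (0.5, -11459215.0 / 16777216.0),
             (0.75, 0.5), (-0.75, 0.5), (1.25, 0.5)]:
    assert ieee_rem(x, y) == math.remainder(x, y)
```

Note how the divisor's sign never affects the result (only `abs(y)` enters the correction), which is consistent with the negated-`s4` benchmark expecting the same answer as the positive one.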

@LeventErkok

@wintersteiger

Attached is my entire test-suite for fpRem. It has 458 files. Every single one of them must result in unsat for correct execution.

Note that some of these benchmarks run really slow; so you probably want to run them with some sort of a time-out to make sure they don't hog your machine. z3 successfully produces unsat for a bunch of them, but finds bogus models for others.

Let me know if you'd like me to prepare another bundle that only produces sat, say after a certain time-out.

fpRemBenchmarks.zip

@LeventErkok

@wintersteiger Should this ticket be re-opened? I understand a fix may not be immediate (or anytime soon!), but at least an open ticket can help keep track.

@wintersteiger

Oh yes, absolutely, thanks for reminding me!

@wintersteiger wintersteiger reopened this Aug 30, 2019
@zhendongsu
zhendongsu commented May 2, 2020

Here are two additional instances that may be caused by the same underlying issue. As suggested by Nikolaj, I'm appending them to this thread rather than opening new issues (and hope they are of use):

[506] % z3debug model_validate=true bug1.smt2 
sat
[507] % z3release model_validate=true bug1.smt2 
sat
(error "line 7 column 10: an invalid model was generated")
[508] % 
[508] % cat bug1.smt2 
(declare-fun a () Float16)
(declare-fun b () Float16)
(declare-fun c () Float16)
(declare-fun d () Float16)
(assert (distinct a (fp #b0 #b00000 #b0000001110)))
(assert (= c (fp.rem a b) d))
(check-sat)
[509] % 
[509] % z3debug model_validate=true bug2.smt2 
sat
(error "line 5 column 20: an invalid model was generated")
[510] % z3release model_validate=true bug2.smt2 
sat
(error "line 5 column 20: an invalid model was generated")
[511] % 
[511] % cat bug2.smt2 
(declare-fun a () Float16)
(declare-fun b () Float16)
(declare-fun c () Float16)
(assert (= c (fp.rem a b) (fp.rem (fp #b1 #b00000 #b1110111111) (fp #b0 #b00000 #b0000001110))))
(check-sat-using smt)
[512] % 

z3debug: debug build
z3release: release build
Tested on Ubuntu 18.04 with the latest commit (2695221)

@wintersteiger

Progress has been made!

@LeventErkok I think yours are all fine now. I didn't wait for it to finish on about 20 of your set, but all the others are unsat. The problem was indeed with that bonus subtraction that I suspected initially.

@zhendongsu Your last one has a completely different root cause, likely not related to the bit-blasting of fp.rem. This will take a separate debugging session.

@LeventErkok

@wintersteiger Yes! I can confirm my test-suite for fpRem is now clean. Thanks Christoph.
