-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential bug in IndexedDISI90#SPARSE->advanceExactWithinBlock #12465
Conversation
Thanks @gautamworah96 for reporting this ! I was adding the assert to guarantee the consistency between
BTW, |
Hmm. To me it feels like we are changing the current behavior of the (admittedly undefined) API for very little benefit. I agree that we should not have For example, what if we assigned Changing it this way would fix the behavior (of the inconsistency between disi.advance returning false and disi.exists being true) and would also not break the behavior of the API? What is happening with this assert is that clients who relied on earlier calls that requested backwards docids were earlier getting false as the answer and were working on that assumption. But with this assert, tests fail in a somewhat obstruse way..
IMO, it would be better to try to maintain the behavior of the API unless we see a strong compelling benefit (like much better code structure or user experience benefit) in changing it?
I definitely want to add some javadoc here. It took me quite some time to understand what the API did.. :) |
Sorry but I do not think we should do this change indeed :
|
True. Let me think about this a little bit more. I've changed the PR to be a draft for now. |
Okay. I don't think there is much to do here. The check we have introduced here is indeed helpful.. even though the check right now is just at I will open a separate PR for the documentation fixes I had in mind. Thanks @gf2121 ! |
Just to add some more context here, we were in fact doing something wrong with our IndexedDISI iterators. What ended up happening was that, in most cases, all fields would always advance to the same doc, but in certain rare cases, we ended up in situations where one field (that shared the iterator) was trying to start from scratch, ie, 0 when other fields had already advanced the iterators to a forward position. Up until Lucene 9.7, we never hit an error because the code was lenient and always returned false if the iterator was trying to access a position that was backwards. However, with PR #12324 this logic was changed and that broke some of our tests. We have now found the bug and have fixed it! Yay! |
It's awesome that Lucene changed to be more picky (removing leniency) on catching invalid usage of the API! Such improvements are great at ferreting out bugs (instead of masking them) in the application layer (Amazon Search in this case) using Lucene! |
While upgrading our Lucene to 9.7 from 9.6 we noticed that a few tests were failing.
On digging deeper into the tests we realized that they were failing on a particular assert. This one:
This changes the behavior of this function as compared to the behavior in Lucene 9.6 (and before). It could be possible that we intentionally decided to change the behavior of this function, but to my eye, it feels like that was not the case?
I attached a test that shows the assert failing. The test tries to call
advanceExact
which underneath it callsadvanceExactWithinBlock
on a document that is behind the current doc position. In the previous version of Lucene this would return afalse
and would exit. But with the upgrade, it fails mysteriously on an assert.I created this PR because IIUC the assert checks for something that is completely unrelated to the current method call? Why would we want to assert on the existence of a previous document when calling for
advanceExactWithinBlock
on a new document.. My preference here would be to remove just the assert statement and to revert to the previous behavior (of returning false when calling for a document that is behind the current cursor position).I could also be wrong though. Would appreciate a second look from some folks in the community.. Thanks!
Related PR that added the assert: #12324