Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve efficiency of BoundedBreakIteratorScanner fragmentation algorithm (the 2nd) #75306

Closed

Conversation

thelink2012
Copy link
Contributor

See #73569 for the issue this tries to solve. See PR #73785 for an explanation of the algorithm.

The previous PR had to be reverted due to it causing tests to fail in Kibana (see elastic/kibana#104466).

One of such failures was caused by targetEndOffset being a int but having a possibility of storing something bigger than that (more precisely for the Kibana tests, if (targetEndOffset + 1 > textEndIndex) overflowed). Commit e2635198d023fa0ef4b4c7132909d4da19e45574 in this PR fixes that.

There were still some failing tests after that. Needs investigation.

@elasticsearchmachine elasticsearchmachine added v8.0.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jul 13, 2021
@gwbrown gwbrown added the :Search Relevance/Highlighting How a query matched a document label Jul 13, 2021
@jimczi
Copy link
Contributor

jimczi commented Jul 26, 2021

@elasticmachine ok to test

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the overflow and adding the test @thelink2012.
We should be more strict with the range of passage size that we accept but that's outside of the scope of this PR. I'll merge soon.

@thelink2012
Copy link
Contributor Author

thelink2012 commented Jul 26, 2021 via email

@jimczi
Copy link
Contributor

jimczi commented Jul 26, 2021

Ok, thanks. Although Kibana shouldn't rely on the passage size to highlight the text entirely. I'll investigate.

@elasticsearchmachine
Copy link
Collaborator

@thelink2012 please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.

@thelink2012 thelink2012 force-pushed the denilson/improved-highlight-2 branch from e263519 to cafdece Compare July 11, 2022 23:36
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jul 18, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@thelink2012
Copy link
Contributor Author

thelink2012 commented Jul 18, 2022

As explained here it seems that a mistake happened when testing this patch in Kibana. We were testing the old version (that was reverted), not the fixed one (this PR). I think this PR can be merged. I also ran the Kibana functional tests on master and it passes.

@thelink2012
Copy link
Contributor Author

thelink2012 commented Jul 18, 2022

I see there's this new Check changelog but I haven't found any documentation on how to fill these YML fields. I've seen examples in other PRs, but would like to make sure I have to fill area with Search/Highlighting and type with enhancement.

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:10
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
@thelink2012
Copy link
Contributor Author

Hi @mark-vieira. Hope you are doing well! Since you switched version labels, i wonder whether you could help me move this PR forward? Perhaps the PR needs an assignee (which was @jimczi but I see he no longer works at Elastic). As far as I can see (since @jimczi did previously approve, it was me that pushed the merge back for further testing), this PR is pretty much ready for merging.

@mark-vieira
Copy link
Contributor

Could someone from @elastic/es-search determine if this pull request is still relevant?

@romseygeek romseygeek self-assigned this Aug 2, 2022
@romseygeek
Copy link
Contributor

Hey @thelink2012, thanks for pushing this. I'm trying to get up to speed on what this is hoping to fix, and I notice that the first of the tests included in the PR, testTextThatEndsBeforeMaxLen, actually passes on the current main branch. Could you explain a bit more what you're trying to do here?

@romseygeek
Copy link
Contributor

@elasticmachine generate changelog

@romseygeek
Copy link
Contributor

Ah, never mind, I've read the attached issue and see what we're trying to do here. All looks good, let's get it up to date and merged.

@romseygeek
Copy link
Contributor

@elasticmachine update branch

@romseygeek
Copy link
Contributor

Can you enable the option "Allow edits and access to secrets by maintainers" for me? Our @elasticmachine bot can automate all sorts of stuff for us, but it needs edit access.

For more information, see the documentation.

@thelink2012
Copy link
Contributor Author

thelink2012 commented Aug 2, 2022

Hi @romseygeek. Thanks for looking at this!! Unfortunately Allow edits and access to secrets by maintainers doesn't work in organization repositories. I have opened this PR again at #89041 in my personal account such that the option is enabled. Please check if that one works :-)

References:

@thelink2012 thelink2012 closed this Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team :Search Relevance/Highlighting How a query matched a document stalled Team:Search Meta label for search team v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.