fix infinite loop in replace with AI collations #2867

tanscorpio7
Contributor

Description

Cherry-picked from: #2849

ICU's usearch_next() goes into an infinite loop when the pattern to search for starts with a surrogate pair.
To work around this, we check whether the output of usearch_next() is stuck (it keeps returning the same position instead of moving forward)
and, if so, set the offset for the next search ourselves.
The next offset is simply the character that follows the current character in the source string.
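
To make "the next character after the current char" concrete, here is a minimal sketch of stepping over one character in a UTF-16 buffer. ICU search offsets are counted in UTF-16 code units, so a character encoded as a surrogate pair takes two units and the step must be 2 rather than 1. The helper names below are hypothetical and not taken from the patch; ICU ships the `U16_FWD_1` macro for the same purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if unit is a UTF-16 lead (high) surrogate, U+D800..U+DBFF. */
int is_lead_surrogate(uint16_t unit)
{
    return (unit & 0xFC00) == 0xD800;
}

/* Nonzero if unit is a UTF-16 trail (low) surrogate, U+DC00..U+DFFF. */
int is_trail_surrogate(uint16_t unit)
{
    return (unit & 0xFC00) == 0xDC00;
}

/*
 * Hypothetical helper: given the code-unit offset i of the start of a
 * character, return the offset of the character that follows it.
 * A well-formed surrogate pair advances by 2 units, anything else by 1.
 */
int32_t next_char_offset(const uint16_t *s, int32_t len, int32_t i)
{
    assert(i >= 0 && i < len);
    if (is_lead_surrogate(s[i]) && i + 1 < len && is_trail_surrogate(s[i + 1]))
        return i + 2;
    return i + 1;
}
```

For example, in the UTF-16 buffer for 'abc🙂d' the emoji (U+1F642) is stored as the units 0xD83D 0xDE42 at offsets 3 and 4, so `next_char_offset(s, 6, 3)` yields 5, the code-unit offset of 'd'.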

Example:
Source string: 'abc🙂defghi🙂🙂'
Pattern to find: '🙂def'

usearch_next() gets stuck on "🙂" (idx = 3) and repeatedly returns this index.
We intervene and set the offset to "d" (idx = 4), so that usearch_next() only starts looking from this character.
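
The guard described above can be sketched without ICU as follows. This is an illustration under stated assumptions, not the code from collation.c: `MockSearch` and `mock_next()` are hypothetical stand-ins for a `UStringSearch` and `usearch_next()`, and `mock_next()` deliberately imitates the reported symptom by returning the same match index on every call. In code written against ICU, the offset would presumably be moved with `usearch_setOffset()`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEARCH_DONE (-1) /* stand-in for ICU's USEARCH_DONE */

/* Hypothetical search state; real code would hold a UStringSearch. */
typedef struct {
    const uint16_t *text;
    int32_t text_len;
    const uint16_t *pat;
    int32_t pat_len;
    int32_t offset; /* code-unit offset where the next search starts */
} MockSearch;

/*
 * Stand-in for usearch_next(). It imitates the reported symptom: after a
 * match the offset is left on the match itself, so the next call returns
 * the same index again.
 */
int32_t mock_next(MockSearch *ms)
{
    for (int32_t i = ms->offset; i + ms->pat_len <= ms->text_len; i++) {
        if (memcmp(ms->text + i, ms->pat,
                   (size_t) ms->pat_len * sizeof(uint16_t)) == 0) {
            ms->offset = i; /* simulated bug: no forward progress */
            return i;
        }
    }
    return SEARCH_DONE;
}

/* Offset of the character after the one that starts at offset i. */
int32_t offset_after(const uint16_t *s, int32_t len, int32_t i)
{
    if (i + 1 < len &&
        (s[i] & 0xFC00) == 0xD800 &&   /* lead surrogate */
        (s[i + 1] & 0xFC00) == 0xDC00) /* trail surrogate */
        return i + 2;
    return i + 1;
}

/* Count matches, stepping past a position that is returned twice in a row. */
int32_t count_matches_guarded(const uint16_t *text, int32_t text_len,
                              const uint16_t *pat, int32_t pat_len)
{
    MockSearch ms = { text, text_len, pat, pat_len, 0 };
    int32_t prev = SEARCH_DONE;
    int32_t count = 0;
    int32_t pos;

    assert(text != NULL && pat != NULL && pat_len > 0);

    while ((pos = mock_next(&ms)) != SEARCH_DONE) {
        if (pos == prev) {
            /* Stuck: move the offset to the next character ourselves. */
            ms.offset = offset_after(text, text_len, pos);
            continue;
        }
        count++;
        prev = pos;
    }
    return count;
}
```

Because the guard always moves the offset strictly past the repeated position, every second call makes progress and the loop terminates. Note that this sketch counts offsets in UTF-16 code units, where the emoji occupies two units, so the character after the match start at offset 3 begins at code-unit offset 5.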

Issues Resolved

[BABEL-5169]

Sign Off

Signed-off-by: Tanzeel Khan [email protected]

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…esql#2849)

Task: BABEL-5167
Signed-off-by: Tanzeel Khan <[email protected]>
@coveralls
Collaborator

Pull Request Test Coverage Report for Build 10470060276

Details

  • 20 of 22 (90.91%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.005%) to 73.684%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
contrib/babelfishpg_tsql/src/collation.c | 20 | 22 | 90.91%

Totals
Change from base Build 10383172619: 0.005%
Covered Lines: 44198
Relevant Lines: 59983

💛 - Coveralls

Contributor

@jsudrik jsudrik left a comment

Cherry pick for the same approved changes on the other branch. Approved.

@jsudrik jsudrik merged commit 8607687 into babelfish-for-postgresql:BABEL_4_3_STABLE Aug 20, 2024
43 checks passed
@tanscorpio7 tanscorpio7 deleted the BABEL_5169_STABLE_BRANCH branch October 11, 2024 14:57