PS-9048 fix 5.7: Fixed problem with percent character in n-grams #5235

percona-ysorokin

https://perconadev.atlassian.net/browse/PS-9048

Fixed a problem where the 'fts_index_fetch_nodes()' function could not properly handle a 'word' parameter containing the special characters used by 'LIKE' clauses ('%' and '_'). Introduced an additional boolean parameter, 'exact_match', that instructs this function to use either a 'WHERE word = :word' or a 'WHERE word LIKE :word' clause when selecting records from internal FTS tables. We call 'fts_index_fetch_nodes()' with 'exact_match' set to 'true' only from 'fts_optimize_words()' (when performing 'OPTIMIZE TABLE' with 'innodb_optimize_fulltext_only' enabled). In every other place, where we do need pattern matching, we call this function with 'exact_match' set to 'false' (instructing it to use the 'LIKE' clause):

  • fts_query_difference()
  • fts_query_intersect()
  • fts_query_union()
  • fts_query_phrase_search()
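
The difference between the two clauses can be illustrated outside of InnoDB. The following is a minimal sketch only, using SQLite from Python rather than InnoDB's internal SQL; the table name 'fts_aux', the helper 'fetch_nodes()' and the sample words are hypothetical and are not taken from the patch. It shows how a word such as 'a%' selects unrelated rows under 'LIKE' but only itself under '='.

```python
import sqlite3


def fetch_nodes(conn, word, exact_match):
    """Hypothetical stand-in for the lookup; not InnoDB's actual code.

    exact_match=True  -> WHERE word = :word    (wildcards are literal)
    exact_match=False -> WHERE word LIKE :word ('%' and '_' are wildcards)
    """
    op = "=" if exact_match else "LIKE"
    rows = conn.execute(
        f"SELECT word FROM fts_aux WHERE word {op} :word ORDER BY word",
        {"word": word},
    )
    return [row[0] for row in rows]


# In-memory table standing in for an internal FTS index table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fts_aux (word TEXT)")
conn.executemany(
    "INSERT INTO fts_aux VALUES (?)",
    [("a%",), ("ab",), ("a_",), ("abc",)],
)

# Exact match: only the literal two-character word 'a%' is returned.
exact = fetch_nodes(conn, "a%", exact_match=True)

# Pattern match: '%' acts as a wildcard, so every word starting with 'a' matches.
pattern = fetch_nodes(conn, "a%", exact_match=False)

print(exact)
print(pattern)
```

In this sketch the exact lookup returns a single row while the pattern lookup returns all four, which is the kind of over-selection the 'exact_match' parameter is meant to rule out for the optimize path.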

Added six new MTR test cases that reproduce the original crash scenario using the default / ngram / mecab parsers, with 'ft_query_extra_word_chars' set to both 'ON' and 'OFF':

  • 'innodb_fts.percona_ft_special_chars_default_ewc_on'
  • 'innodb_fts.percona_ft_special_chars_default_ewc_off'
  • 'innodb_fts.percona_ft_special_chars_ngram_ewc_on'
  • 'innodb_fts.percona_ft_special_chars_ngram_ewc_off'
  • 'innodb_fts.percona_ft_special_chars_mecab_ewc_on'
  • 'innodb_fts.percona_ft_special_chars_mecab_ewc_off'

These test cases also check parsing / querying of strings containing various special characters. Please note that they have "result mismatch" status strings recorded in the '.result' files for certain combinations of special characters; this is expected behavior.

Introduced two new MTR helper '.inc' files that help install / uninstall the mecab plugin and create / remove its settings file:

  • percona_install_mecab_plugin.inc
  • percona_uninstall_mecab_plugin.inc

Reworked the 'innodb_fts.percona_mecab_null_character' MTR test case to use these two include files.

@percona-ysorokin percona-ysorokin merged commit 6f16603 into percona:release-5.7.44-49 Feb 16, 2024
9 of 23 checks passed