ICU-21710 Remove BOYER_MOORE dead code from usearch.cpp
jefgen committed Aug 18, 2021
1 parent b03b8be commit 6f61c31
Showing 3 changed files with 55 additions and 2,440 deletions.
4 changes: 3 additions & 1 deletion docs/userguide/icu4j/faq.md
@@ -195,7 +195,9 @@ determine whether case and accents are ignored during a search.

#### What algorithm are you using to perform the search?

-StringSearch uses a version of the Boyer-Moore search algorithm that has been
+As of ICU 53, StringSearch uses a simple linear search algorithm which
+locates a match by shifting a cursor in the target text one by one. Previous
+versions of ICU used a version of the Boyer-Moore search algorithm which was
modified for use with Unicode. Rather than using raw Unicode character values in
its comparisons and shift tables, the algorithm uses collation elements that
have been "hashed" down to a smaller range to make the tables a reasonable size.
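
The two strategies contrasted in this FAQ entry can be sketched as follows. This is only an illustration, not the ICU implementation: `CEList`, `hashCE`, `kTableSize`, `linearSearch`, and `horspoolSearch` are hypothetical names, plain integers stand in for collation elements, and the table-driven variant is the simpler Horspool form of Boyer-Moore rather than the modified algorithm ICU previously used.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sequence of collation elements.
using CEList = std::vector<std::uint32_t>;

// Linear search: move the cursor through the target one position at a
// time and compare the whole pattern at each position.
// Returns the index of the first match, or -1 if there is none.
std::ptrdiff_t linearSearch(const CEList& target, const CEList& pattern) {
    const std::size_t m = pattern.size(), n = target.size();
    if (m == 0 || m > n) return -1;
    for (std::size_t pos = 0; pos + m <= n; ++pos) {
        std::size_t i = 0;
        while (i < m && target[pos + i] == pattern[i]) ++i;
        if (i == m) return static_cast<std::ptrdiff_t>(pos);
    }
    return -1;
}

// Assumed table size; elements are "hashed" into this many buckets so the
// shift table stays small regardless of the element value range.
constexpr std::size_t kTableSize = 257;

inline std::size_t hashCE(std::uint32_t ce) { return ce % kTableSize; }

// Horspool-style search with a hashed shift table.
// When two elements share a bucket, the smaller shift is kept, so a
// collision can only make the search slower, never skip a match.
std::ptrdiff_t horspoolSearch(const CEList& target, const CEList& pattern) {
    const std::size_t m = pattern.size(), n = target.size();
    if (m == 0 || m > n) return -1;

    // Default shift is the full pattern length.
    std::vector<std::size_t> shift(kTableSize, m);
    // Later (rightmost) occurrences overwrite earlier ones with smaller shifts.
    for (std::size_t i = 0; i + 1 < m; ++i) {
        shift[hashCE(pattern[i])] = m - 1 - i;
    }

    std::size_t pos = 0;
    while (pos + m <= n) {
        // Compare right to left.
        std::size_t i = m;
        while (i > 0 && target[pos + i - 1] == pattern[i - 1]) --i;
        if (i == 0) return static_cast<std::ptrdiff_t>(pos);
        // Shift by the table entry for the element under the pattern's last slot.
        pos += shift[hashCE(target[pos + m - 1])];
    }
    return -1;
}
```

Both functions return the same index for the same input; the table-driven version only differs in how far it advances the cursor after a mismatch.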
11 changes: 6 additions & 5 deletions icu4c/source/i18n/unicode/usearch.h
@@ -35,8 +35,9 @@
* See the <a href="http://source.icu-project.org/repos/icu/icuhtml/trunk/design/collation/ICU_collation_design.htm">
* "ICU Collation Design Document"</a> for more information.
* <p>
-* The implementation may use a linear search or a modified form of the Boyer-Moore
-* search; for more information on the latter see
+* As of ICU 4.0, the implementation uses a linear search. In previous versions,
+* a modified form of the Boyer-Moore searching algorithm was used. For more information
+* on the modified Boyer-Moore algorithm see
* <a href="http://icu-project.org/docs/papers/efficient_text_searching_in_java.html">
* "Efficient Text Searching in Java"</a>, published in <i>Java Report</i>
* in February, 1999.
@@ -595,8 +596,8 @@ U_CAPI UCollator * U_EXPORT2 usearch_getCollator(
/**
* Sets the collator used for the language rules. User retains the ownership
* of this collator, thus the responsibility of deletion lies with the user.
-* This method causes internal data such as Boyer-Moore shift tables to
-* be recalculated, but the iterator's position is unchanged.
+* This method causes internal data such as the pattern collation elements
+* and shift tables to be recalculated, but the iterator's position is unchanged.
* @param strsrch search iterator data struct
* @param collator to be used
* @param status for errors if it occurs
@@ -608,7 +609,7 @@ U_CAPI void U_EXPORT2 usearch_setCollator( UStringSearch *strsrch,

/**
* Sets the pattern used for matching.
-* Internal data like the Boyer Moore table will be recalculated, but the
+* Internal data like the pattern collation elements will be recalculated, but the
* iterator's position is unchanged.
*
* The UStringSearch retains a pointer to the pattern string. The caller must not