Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Fuzzy Query docs to clarify default behavior re max_expansions #30819

Merged
merged 2 commits into from
Jul 30, 2018
Merged

Update Fuzzy Query docs to clarify default behavior re max_expansions #30819

merged 2 commits into from
Jul 30, 2018

Conversation

wpbonelli
Copy link
Contributor

Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.

Maybe it's worthwhile to use one of those Warning boxes somewhere towards the top of the page to clarify this? I suspect many users will read "generates all possible matching terms" and immediately begin using the query, expecting it to generate all possible terms by default (having done exactly that myself).

Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.

Maybe it's worthwhile to use one of those Warning boxes somewhere towards the top of the page to clarify this? I suspect many users will read "generates all possible matching terms" and immediately begin using the query, expecting it to generate all possible terms by default (having done exactly that myself).
@colings86 colings86 added >docs General docs changes :Search/Search Search-related issues that do not fall into other categories labels May 25, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @w-bonelli for the PR, and I’m sorry this review is coming so late.

I agree that this wording is confusing and I think your edit helps. Instead of a warning box, what do you think about also updating the end of the paragraph to mention max_expansions? We could add something like ‘The final query is executed using up to max_expansions matching terms’.

Lastly, would you be able to sign the contributor license agreement? The build check is currently failing.

@wpbonelli
Copy link
Contributor Author

Thanks @jtibshirani for your reply, I've signed the contributor agreement and updated the PR per your suggestion.

Incidentally, I think a single line like this could also work:

The fuzzy query generates up to max_expansions matching terms that are within the maximum edit distance specified in fuzziness, then checks the term dictionary to find out which of those generated terms actually exist in the index.

Either seems sufficient clarification though.

@jtibshirani
Copy link
Contributor

jtibshirani commented Jul 30, 2018

It looks good to me -- I'll merge this and cherry-pick it to the right branches.

I prefer the current wording since it's more precise: we don't first generate max_expansions terms, then check if they are in the index. Rather we consider all matching terms that are in the index, and only keep max_expansions of those.

@jtibshirani jtibshirani merged commit 345a007 into elastic:6.2 Jul 30, 2018
jtibshirani pushed a commit that referenced this pull request Jul 30, 2018
…#30819)

Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.

(cherry picked from commit 345a007)
jtibshirani pushed a commit that referenced this pull request Jul 30, 2018
…#30819)

Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.

(cherry picked from commit 345a007)
jtibshirani pushed a commit that referenced this pull request Jul 30, 2018
…#30819)

Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.

(cherry picked from commit 345a007)
jtibshirani pushed a commit that referenced this pull request Jul 30, 2018
…#30819)

Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms.

(cherry picked from commit 345a007)
dnhatn added a commit that referenced this pull request Jul 31, 2018
* master:
  Logging: Make node name consistent in logger (#31588)
  Mute SSLTrustRestrictionsTests on JDK 11
  Increase max chunk size to 256Mb for repo-azure (#32101)
  Docs: Fix README upgrade mention (#32313)
  Changed ReindexRequest to use Writeable.Reader (#32401)
  Mute KerberosAuthenticationIT
  Fix AutoIntervalDateHistogram.testReduce random failures (#32301)
  fix no=>not typo (#32463)
  Mute QueryProfilerIT#testProfileMatchesRegular()
  HLRC: Add delete watch action (#32337)
  High-level client: fix clusterAlias parsing in SearchHit (#32465)
  Fix calculation of orientation of polygons (#27967)
  [Kerberos] Add missing javadocs (#32469)
  [Kerberos] Remove Kerberos bootstrap checks (#32451)
  Make get all app privs requires "*" permission (#32460)
  Switch security to new style Requests (#32290)
  Switch security spi example to new style Requests (#32341)
  Painless: Add PainlessConstructor (#32447)
  update rollover to leverage write-alias semantics (#32216)
  Update Fuzzy Query docs to clarify default behavior re max_expansions (#30819)
  INGEST: Clean up Java8 Stream Usage (#32059)
  Ensure KeyStoreWrapper decryption exceptions are handled (#32464)
dnhatn added a commit that referenced this pull request Aug 2, 2018
* 6.x:
  Fix scriptdocvalues tests with dates
  Correct minor typo in explain.asciidoc for HLRC
  Fix painless whitelist and warnings from backporting #31441
  Build: Add elastic maven to repos used by BuildPlugin (#32549)
  Scripting: Conditionally use java time api in scripting (#31441)
  [ML] Improve error when no available field exists for rule scope (#32550)
  [ML] Improve error for functions with limited rule condition support (#32548)
  [ML] Remove multiple_bucket_spans
  [ML] Fix thread leak when waiting for job flush (#32196) (#32541)
  Painless: Clean Up PainlessField (#32525)
  Add @AwaitsFix for #32554
  Remove broken @link in Javadoc
  Add AwaitsFix to failing test - see #32546
  SQL: Added support for string manipulating functions with more than one parameter (#32356)
  [DOCS] Reloadable Secure Settings (#31713)
  Fix compilation error introduced by #32339
  [Rollup] Remove builders from TermsGroupConfig (#32507)
  Use hostname instead of IP with SPNEGO test (#32514)
  Switch x-pack rolling restart to new style Requests (#32339)
  [DOCS] Small fixes in rule configuration page (#32516)
  Painless: Clean up PainlessMethod (#32476)
  SQL: Add test for handling of partial results (#32474)
  Docs: Add missing migration doc for logging change
  Build: Remove shadowing from benchmarks (#32475)
  Docs: Add all JDKs to CONTRIBUTING.md
  Logging: Make node name consistent in logger (#31588)
  High-level client: fix clusterAlias parsing in SearchHit (#32465)
  REST high-level client: parse back _ignored meta field (#32362)
  backport fix of reduceRandom fix (#32508)
  Add licensing enforcement for FIPS mode (#32437)
  INGEST: Clean up Java8 Stream Usage (#32059) (#32485)
  Improve the error message when an index is incompatible with field aliases. (#32482)
  Mute testFilterCacheStats
  Scripting: Fix painless compiler loader to know about context classes (#32385)
  [ML][DOCS] Fix typo applied_to => applies_to
  Mute SSLTrustRestrictionsTests on JDK 11
  Changed ReindexRequest to use Writeable.Reader (#32401)
  Increase max chunk size to 256Mb for repo-azure (#32101)
  Mute KerberosAuthenticationIT
  fix no=>not typo (#32463)
  HLRC: Add delete watch action (#32337)
  Fix calculation of orientation of polygons (#27967)
  [Kerberos] Add missing javadocs (#32469)
  Fix missing JavaDoc for @throws in several places in KerberosTicketValidator.
  Make get all app privs requires "*" permission (#32460)
  Ensure KeyStoreWrapper decryption exceptions are handled (#32472)
  update rollover to leverage write-alias semantics (#32216)
  [Kerberos] Remove Kerberos bootstrap checks (#32451)
  Switch security to new style Requests (#32290)
  Switch security spi example to new style Requests (#32341)
  Painless: Add PainlessConstructor (#32447)
  Update Fuzzy Query docs to clarify default behavior re max_expansions (#30819)
  Remove > from Javadoc (fatal with Java 11)
  Tests: Fix convert error tests to use fixed value (#32415)
  IndicesClusterStateService should replace an init. replica with an init. primary with the same aId (#32374)
  auto-interval date histogram - 6.x backport (#32107)
  [CI] Mute DocumentSubsetReaderTests testSearch
  [TEST] Mute failing InternalEngineTests#testSeqNoAndCheckpoints
  TEST: testDocStats should always use forceMerge (#32450)
  TEST: Avoid deletion in FlushIT
  AwaitsFix IndexShardTests#testDocStats
  Painless: Add method type to method. (#32441)
  Remove reference to non-existent store type (#32418)
  [TEST] Mute failing FlushIT test
  Fix ordering of bootstrap checks in docs (#32417)
  Wrong discovery.type for azure in breaking changes (#32432)
  Mute ConvertProcessorTests failing tests
  TESTS: Move netty leak detection to paranoid level (#32354) (#32425)
  Upgrade to Lucene-7.5.0-snapshot-608f0277b0 (#32390)
  [Kerberos] Avoid vagrant update on precommit (#32416)
  TEST: Avoid triggering merges in FlushIT
  [DOCS] Fixes formatting of scope object in job resource
  Switch x-pack/plugin to new style Requests (#32327)
  Release requests in cors handle (#32410)
  Remove BouncyCastle dependency from runtime (#32402)
  Copy missing segment attributes in getSegmentInfo (#32396)
  Rest HL client: Add put license action (#32214)
  Docs: Correcting a typo in tophits (#32359)
  Build: Stop double generating buildSrc pom (#32408)
  Switch x-pack full restart to new style Requests (#32294)
  Painless: Clean Up PainlessClass Variables (#32380)
  [ML] Consistent pattern for strict/lenient parser names (#32399)
  Add Restore Snapshot High Level REST API
  Update update-settings.asciidoc (#31378)
  Introduce index store plugins (#32375)
  Rank-Eval: Reduce scope of an unchecked supression
  Make sure _forcemerge respects `max_num_segments`. (#32291)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>docs General docs changes :Search/Search Search-related issues that do not fall into other categories v6.2.5 v6.3.3 v6.4.0 v6.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants