[DOCS] Edits frequent items aggregation #91564

Merged
merged 1 commit into from
Nov 15, 2022
@@ -24,7 +24,7 @@ might be returned if their support values are different.

The runtime of the aggregation depends on the data and the provided parameters.
It might take a significant time for the aggregation to complete. For this
-reason, it is recommended to use <<async-search, async search>> to run your
+reason, it is recommended to use <<async-search,async search>> to run your
requests asynchronously.


@@ -73,7 +73,7 @@ aggregation might require a significant amount of system resources.
The minimum set size is the minimum number of items the set needs to contain. A
value of 1 returns the frequency of single items. Only item sets that contain at
least the number of `minimum_set_size` items are returned. For example, the item
-set `orange, banana, apple` is only returned if the minimum set size is 3 or
+set `orange, banana, apple` is returned only if the minimum set size is 3 or
lower.
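
The effect of `minimum_set_size` described above can be illustrated with a small brute-force sketch. This is not how the aggregation is implemented; the function name `frequent_item_sets`, the `minimum_support` threshold argument, and the sample transactions are assumptions made up for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_item_sets(transactions, minimum_set_size=1, minimum_support=0.1):
    """Naive illustration: count every item combination of at least
    `minimum_set_size` items and keep those whose support (share of
    transactions containing the combination) reaches `minimum_support`.
    Hypothetical helper, not the Elasticsearch algorithm."""
    total = len(transactions)
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # sort so each set has one canonical tuple
        for size in range(minimum_set_size, len(unique) + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {
        combo: n / total
        for combo, n in counts.items()
        if n / total >= minimum_support
    }


# Made-up transactions for illustration.
transactions = [
    ["orange", "banana", "apple"],
    ["orange", "banana", "apple"],
    ["orange", "banana"],
    ["apple"],
]

# With a minimum set size of 3, only the three-item set can be returned.
size_three = frequent_item_sets(transactions, minimum_set_size=3, minimum_support=0.5)

# With a minimum set size of 4, the three-item set is too small to qualify.
size_four = frequent_item_sets(transactions, minimum_set_size=4, minimum_support=0.5)
```

In this sketch, lowering `minimum_set_size` to 1 also returns single items and pairs, which matches the statement that a value of 1 returns the frequency of single items.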

[discrete]
@@ -123,15 +123,15 @@ In the following examples, we use the e-commerce {kib} sample data set.


[discrete]
-==== Aggregation with two analized fields
+==== Aggregation with two analyzed fields

In the first example, the goal is to find out based on transaction data (1.)
from what product categories the customers purchase products frequently together
and (2.) from which cities they make those purchases. We are interested in sets
with three or more items, and want to see the first three frequent item sets
with the highest support.

-Note that we use the <<async-search, async search>> endpoint in this first
+Note that we use the <<async-search,async search>> endpoint in this first
example.

[source,console]
@@ -228,8 +228,8 @@ of documents containing the item set by the total number of documents.
The response shows that the categories customers purchase from most frequently
together are `Women's Clothing` and `Women's Shoes` and customers from New York
tend to buy items from these categories frequently together. In other words,
-customers who buy products labelled Women's Clothing more likely buy products
-also from the Women's Shoes category and customers from New York most likely buy
+customers who buy products labelled `Women's Clothing` more likely buy products
+also from the `Women's Shoes` category and customers from New York most likely buy
products from these categories together. The item set with the second highest
support is `Women's Clothing` and `Women's Accessories` with customers mostly
from New York. Finally, the item set with the third highest support is
@@ -269,8 +269,8 @@ POST /kibana_sample_data_ecommerce/_async_search
// TEST[skip:setup kibana sample data]

The result will only show item sets that are created from documents matching the
-filter, namely purchases in Europe. Using `filter` the calculated `support` still
-takes all purchases into acount. That's different to specifying a query at the
+filter, namely purchases in Europe. Using `filter`, the calculated `support` still
+takes all purchases into account. That's different than specifying a query at the
top-level, in which case `support` gets calculated only from purchases in Europe.
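
The difference between `filter` and a top-level query can be sketched with a small calculation. This is a simplified model under stated assumptions, not the aggregation's implementation: the `support` helper, its `doc_filter` and `query` arguments, and the sample purchases are all hypothetical.

```python
def support(docs, item_set, doc_filter=None, query=None):
    """Hypothetical model of the support calculation.

    query:      narrows the documents that are searched at all, so it
                shrinks the denominator.
    doc_filter: only restricts which documents contribute item sets;
                the denominator stays the full set of searched documents.
    """
    searched = [d for d in docs if query is None or query(d)]
    if not searched:
        return 0.0
    contributing = [d for d in searched if doc_filter is None or doc_filter(d)]
    hits = sum(1 for d in contributing if item_set <= d["items"])
    return hits / len(searched)


def in_europe(doc):
    return doc["continent"] == "Europe"


# Made-up purchases for illustration.
purchases = [
    {"continent": "Europe", "items": {"shoes", "clothing"}},
    {"continent": "Europe", "items": {"shoes", "clothing"}},
    {"continent": "Europe", "items": {"shoes"}},
    {"continent": "Asia", "items": {"shoes", "clothing"}},
]
item_set = {"shoes", "clothing"}

# Filter: 2 European purchases contain the set, divided by all 4 purchases.
support_with_filter = support(purchases, item_set, doc_filter=in_europe)

# Top-level query: the same 2 purchases, divided by the 3 European purchases.
support_with_query = support(purchases, item_set, query=in_europe)
```

In this model the same item set gets a lower support with `filter` than with a top-level query, because the filter leaves the total number of documents unchanged.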


@@ -279,7 +279,7 @@ top-level, in which case `support` gets calculated only from purchases in Europe

The frequent items aggregation enables you to bucket numeric values by using
<<runtime,runtime fields>>. The next example demonstrates how to use a script to
-add a runtime field to your documents that called `price_range` which is
+add a runtime field to your documents called `price_range`, which is
calculated from the taxful total price of the individual transactions. The
runtime field then can be used in the frequent items aggregation as a field to
analyze.