Collapsing search results new page added to documentation #7678

147 changes: 147 additions & 0 deletions _search-plugins/searching-data/collapse-search-results.md
@@ -0,0 +1,147 @@
---
layout: default
title: Collapse results
parent: Searching data
nav_order: 23
redirect_from:
- /opensearch/search/collapse/
---
# Collapsing search results in OpenSearch

Collapsing search results in OpenSearch groups documents by the value of a specified field and returns the top document for each group. This is useful when you want to avoid returning duplicate documents or need to display only one document per group.

Collapsing the results can also be helpful when dealing with large datasets where multiple documents share common field values and you want to avoid redundancy in the search results.


Collapsing search results allows you to:
- Reduce redundancy by preventing duplicate or similar documents from cluttering the search results.
Review comment (Contributor): I think this is a repeat of what has been described before.

- Improve performance by reducing the number of documents returned.
- Gain better insights by focusing the returned results on unique groups of documents.

## Examples of collapsing search results in OpenSearch Dashboards

To begin collapsing the results in OpenSearch Dashboards, follow these steps:

1. Navigate to the OpenSearch Dashboards UI.
2. In the sidebar, go to the `Management` section and select `Dev Tools`. The examples on this page use the `opensearch_dashboards_sample_data_flights` index.
3. Write your DSL query in the left pane of Dev Tools (example DSL queries are provided in the following sections).
4. Highlight the query and click the play button to run the query.
5. The response appears in the right pane of the Dev Tools window.

### Example: Collapsing search results by carrier

To collapse search results by the `Carrier` field, ensuring that only the top document for each carrier is returned, you can use the following DSL query:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "Carrier"
  }
}
```
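
The collapse field is expected to be a single-valued `keyword` or numeric field with doc values enabled. If the query above returns a field-type error, one way to check how `Carrier` is mapped in your copy of the sample index is the get field mapping request below. This is only a sketch, and the output depends on your cluster:

```json
GET opensearch_dashboards_sample_data_flights/_mapping/field/Carrier
```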

### Example: Collapsing with inner hits

To collapse search results by the `Carrier` field and also include the top 5 documents for each carrier, you can use the following DSL query:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "Carrier",
    "inner_hits": {
      "name": "top_hits",
      "size": 5
    }
  }
}
```
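
Inner hits can also be given their own sort order, independent of how the collapsed groups are ranked. The following is a minimal sketch, assuming you want the three cheapest flights for each carrier; the inner hits name `cheapest_flights` is illustrative and not part of the original examples:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "Carrier",
    "inner_hits": {
      "name": "cheapest_flights",
      "size": 3,
      "sort": [
        {
          "AvgTicketPrice": "asc"
        }
      ]
    }
  }
}
```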

### Example: Collapsing and sorting

To collapse search results by the `Carrier` field, ensuring that only the top document for each carrier is returned based on the highest `AvgTicketPrice`, you can use the following DSL query:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "Carrier"
  },
  "sort": [
    {
      "AvgTicketPrice": "desc"
    }
  ]
}
```
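
When checking which document was selected for each carrier, it can help to trim the response to the fields of interest. The following sketch repeats the query above and adds `_source` filtering. The choice of returned fields is only an example:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "_source": ["Carrier", "AvgTicketPrice"],
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "Carrier"
  },
  "sort": [
    {
      "AvgTicketPrice": "desc"
    }
  ]
}
```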

### Example: Collapsing and filtering

To collapse search results by the `Carrier` field, filter flights with an `AvgTicketPrice` between `100` and `500`, and include the top 3 documents for each carrier, you can use the following DSL query:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "range": {
      "AvgTicketPrice": {
        "gte": 100,
        "lte": 500
      }
    }
  },
  "collapse": {
    "field": "Carrier",
    "inner_hits": {
      "name": "top_hits",
      "size": 3
    }
  }
}
```

### Example: Collapsing and aggregating

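To collapse search results by the `Carrier` field and also compute the average ticket price for each carrier, you can use the following DSL query. Aggregations run over all matching documents, not only the collapsed hits, and because `size` is set to `0`, the response contains only the aggregation results: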

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "Carrier"
  },
  "aggs": {
    "avg_price_per_carrier": {
      "terms": {
        "field": "Carrier"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "AvgTicketPrice"
          }
        }
      }
    }
  }
}
```
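
The `hits.total` value in a collapsed response counts all matching documents rather than the number of groups. A `cardinality` aggregation is one way to estimate how many distinct carriers (groups) exist. The following is a minimal sketch in which the aggregation name `distinct_carriers` is illustrative. Note that `cardinality` returns an approximate count:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "distinct_carriers": {
      "cardinality": {
        "field": "Carrier"
      }
    }
  }
}
```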

Collapsing search results in OpenSearch helps you manage large datasets by grouping documents based on a specific field. This reduces redundancy in the results and makes it easier to gain insights from them.

By using the collapsing feature effectively, you can streamline your search results and focus on the most relevant information.