[Reporting] use point-in-time for paging search results #144201

tsullivan · 2022-10-29T00:28:38Z

Summary

Based on de10eae

Checklist

Unit or functional tests were updated or added to match the most common scenarios

Reviewer guide

See documentation on paging search results:

The search response includes an array of sort values for each hit. ... To retrieve the next page of results, repeat the request, take the sort values from the last hit, and insert those into the search_after array:

The open point in time request and each subsequent search request can return different id; thus always use the most recently received id for the next search request.

The search response includes an array of sort values for each hit. If you used a PIT, a tiebreaker is included as the last sort values for each hit. This tiebreaker called _shard_doc is added automatically on every search requests that use a PIT. The _shard_doc value is the combination of the shard index within the PIT and the Lucene’s internal doc ID, it is unique per document and constant within a PIT.
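
For reviewers, here is a minimal sketch of the paging loop the quoted documentation describes. It is not the code in this PR: `PitClient`, `fetchAllHits`, and `makeFakeClient` are hypothetical names invented for illustration, and the fake client only imitates the behaviour described above (a PIT id that can change on every response, and a tiebreaker as the last sort value).

```typescript
// Hypothetical shapes for illustration; not Kibana's or the Elasticsearch client's real types.
type SortValues = Array<string | number>;

interface Hit {
  _id: string;
  sort: SortValues; // with a PIT, the last entry plays the role of the _shard_doc tiebreaker
}

interface SearchPage {
  pit_id: string; // may differ from the id that was sent with the request
  hits: Hit[];
}

interface PitClient {
  openPit(index: string, keepAlive: string): Promise<string>;
  search(req: { pitId: string; size: number; searchAfter?: SortValues }): Promise<SearchPage>;
  closePit(pitId: string): Promise<void>;
}

async function fetchAllHits(client: PitClient, index: string, size: number): Promise<Hit[]> {
  let pitId = await client.openPit(index, '30s');
  let searchAfter: SortValues | undefined;
  const all: Hit[] = [];
  try {
    while (true) {
      const page = await client.search({ pitId, size, searchAfter });
      // Always carry forward the most recently received PIT id.
      pitId = page.pit_id;
      if (page.hits.length === 0) break;
      all.push(...page.hits);
      // The sort values of the last hit become the cursor for the next page.
      searchAfter = page.hits[page.hits.length - 1].sort;
    }
  } finally {
    // Close the PIT with the latest id, even if a search request failed.
    await client.closePit(pitId);
  }
  return all;
}

// In-memory stand-in (an assumption for this sketch) so the loop runs without a cluster.
function makeFakeClient(docCount: number): PitClient & { closed: string[] } {
  const docs: Hit[] = [];
  for (let i = 0; i < docCount; i++) {
    docs.push({ _id: `doc-${i}`, sort: [1000 + i, i] }); // [sort field, tiebreaker]
  }
  let generation = 0;
  const closed: string[] = [];
  return {
    closed,
    async openPit(): Promise<string> {
      return `pit-${generation}`;
    },
    async search(req: { pitId: string; size: number; searchAfter?: SortValues }): Promise<SearchPage> {
      generation += 1; // simulate the PIT id changing between requests
      const cursor = req.searchAfter?.[1];
      const start = cursor === undefined ? 0 : docs.findIndex((d) => d.sort[1] === cursor) + 1;
      return { pit_id: `pit-${generation}`, hits: docs.slice(start, start + req.size) };
    },
    async closePit(pitId: string): Promise<void> {
      closed.push(pitId);
    },
  };
}
```

Calling `fetchAllHits(makeFakeClient(5), 'logs', 2)` walks the five fake documents in pages of two and closes the PIT with the last id it received. The loop ends on the first empty page rather than on a hit count, so it does not depend on the total being known up front.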

Release Note

Fixed a bug with CSV export in Discover, where searching over hundreds of shards would result in an incomplete CSV file.

majagrubic · 2022-11-07T07:24:27Z

src/plugins/data/common/search/search_source/types.ts

+  searchAfter?: estypes.SortResults;
+  /**
+   * Allow querying to use a point-in-time ID for paging results
+   * Requires searchAfter when the page index is > 1.


This should be enforced in search source, before setting the pit.

We chatted a bit to clarify this, and agreed it's better to just get rid of this comment.

jloleysens

Great job, @tsullivan! I'm glad to see we can employ the recommended way of deeply paging through data for CSV exports 👏🏻

Tested locally on a small-ish sample - do you have any data on how this performs in Kibana with larger datasets?

kibana-ci · 2022-11-07T16:41:34Z

💚 Build Succeeded

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`data`	401.3KB	401.3KB	+43.0B

Unknown metric groups

API count

id	before	after	diff
`data`	3261	3263	+2

ESLint disabled in files

id	before	after	diff
`osquery`	1	2	+1

ESLint disabled line counts

id	before	after	diff
`enterpriseSearch`	19	21	+2
`fleet`	58	64	+6
`osquery`	108	113	+5
`securitySolution`	440	446	+6
total			+19

References to deprecated APIs

id	before	after	diff
`discover`	25	30	+5
`reporting`	5	0	-5
total			-0

Total ESLint disabled count

id	before	after	diff
`enterpriseSearch`	20	22	+2
`fleet`	66	72	+6
`osquery`	109	115	+6
`securitySolution`	517	523	+6
total			+20

History

💚 Build #85716 succeeded de821a8
💚 Build #85637 succeeded 1905042
💔 Build #85625 failed c5710b1
💚 Build #85563 succeeded a46385e
💛 Build #85318 was flaky dcccbb5

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

tsullivan force-pushed the reporting/searchsource-exporttype branch 2 times, most recently from 316043b to 31822a2 Compare November 3, 2022 02:51

tsullivan changed the title ~~[Reporrting] use point-in-time for paging search results~~ [Reporting] use point-in-time for paging search results Nov 3, 2022

tsullivan force-pushed the reporting/searchsource-exporttype branch 2 times, most recently from 6317085 to 31608ff Compare November 3, 2022 03:13

elastic deleted a comment from kibana-ci Nov 3, 2022

tsullivan added the Team:SharedUX, ci:cloud-deploy, and ci:cloud-redeploy labels Nov 3, 2022

tsullivan force-pushed the reporting/searchsource-exporttype branch 6 times, most recently from 538cfee to d90e0c4 Compare November 3, 2022 20:53

tsullivan added 2 commits November 3, 2022 14:52

[Reporting] use point-in-time for paging search results

adaae66

add new PIT tests to data plugin

dcccbb5

tsullivan force-pushed the reporting/searchsource-exporttype branch from d90e0c4 to dcccbb5 Compare November 3, 2022 21:54

Merge branch 'main' into reporting/searchsource-exporttype

a46385e

tsullivan marked this pull request as ready for review November 4, 2022 17:34

tsullivan requested review from a team as code owners November 4, 2022 17:34

tsullivan added the release_note:fix and (Deprecated) Feature:Reporting labels Nov 4, 2022

Merge branch 'main' into reporting/searchsource-exporttype

42c15b6

tsullivan marked this pull request as draft November 5, 2022 00:29

tsullivan added 4 commits November 4, 2022 17:30

fix deprecation

fec0f95

update point-in-time ID to the latest one received

6284ac2

add warning for shard failure

73a428e

fix/cleanup csv generation test

6dfecf8

tsullivan added 4 commits November 4, 2022 19:10

add requestTimeout to openPit request

1ada507

logging polishes

c5710b1

Merge remote-tracking branch 'elastic/main' into reporting/searchsour…

740a2c7

…ce-exporttype

fix test

1905042

tsullivan marked this pull request as ready for review November 5, 2022 19:13

tsullivan requested review from jloleysens and vadimkibana November 5, 2022 19:14

majagrubic reviewed Nov 7, 2022

Merge branch 'main' into reporting/searchsource-exporttype

de821a8

jloleysens approved these changes Nov 7, 2022

tsullivan added 2 commits November 7, 2022 09:02

remove confusing comment

92ce96f

Merge remote-tracking branch 'elastic/main' into reporting/searchsour…

b03f445

…ce-exporttype

tsullivan requested a review from a team as a code owner November 7, 2022 16:02

majagrubic approved these changes Nov 7, 2022

tsullivan enabled auto-merge (squash) November 7, 2022 16:56

tsullivan merged commit 455fb1d into elastic:main Nov 7, 2022

kibanamachine added the v8.6.0 and backport:skip labels Nov 7, 2022

tsullivan deleted the reporting/searchsource-exporttype branch November 7, 2022 20:26

tsullivan mentioned this pull request Nov 21, 2022

[Reporting/CSV Export] issues with high search latency in ES #129524

Closed

Dosant mentioned this pull request Mar 8, 2023

[Reporting] csv report always includes frozen indices #152884

Closed

stefnestor mentioned this pull request Apr 15, 2023

[Security] Permissions via Aliases elastic/elasticsearch#95261

Open

geekpete mentioned this pull request May 24, 2023

[DOCS] Document as breaking change in 8.6 the switch from Scroll to PIT for CSV reports that no longer work against index aliases if the permissions are not granted to underlying indices #158338

Closed
