[Discover-Next] support dataframes, not just a single dataframe #7032

Open
11 tasks
Tracked by #6957
ashwin-pc opened this issue Jun 14, 2024 · 0 comments · May be fixed by #7086

Comments


ashwin-pc commented Jun 14, 2024

Problem

The dataframe is destroyed whenever the source changes. This becomes a problem once we store the session ID in the dataframe's meta info.

If the user switches from an async query source to another source and then back to the original async query source, we shouldn't need to repeat the 30 seconds of short polling, because we should still have the same session ID.

Another problem is that the current implementation expects the source to be an index. Part of the virtual index pattern creation fetches the mappings for that source, and if the underlying source is not an index, that API call fails. We should address that as well.

⚠️ Blocker for async queries

Expected Result

Keep a cache of dataframes instead of a single dataframe. If the dataframe for a source is null, create a dataframe with a name and a null schema.

Don't destroy the dataframe when switching source.
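
To make the expected behavior concrete, here is a minimal sketch of a cache keyed by source name. `DataFrameSketch`, `DataFramesCacheSketch`, and `getOrCreate` are hypothetical names for illustration only; they are not the existing `IDataFrame` type or dataframe service API, and the real shapes may differ.

```typescript
// Hypothetical shape, assumed for this sketch; not the real IDataFrame.
interface DataFrameSketch {
  name: string;
  schema: Array<{ name: string; type: string }> | null;
  meta: { sessionId?: string };
}

// Hypothetical cache: one dataframe per source instead of a single dataframe.
class DataFramesCacheSketch {
  private readonly frames = new Map<string, DataFrameSketch>();

  // Return the cached dataframe for a source. If none exists yet, create one
  // named after the source with a null schema and cache it.
  getOrCreate(source: string): DataFrameSketch {
    let df = this.frames.get(source);
    if (df === undefined) {
      df = { name: source, schema: null, meta: {} };
      this.frames.set(source, df);
    }
    return df;
  }
}

const cache = new DataFramesCacheSketch();
cache.getOrCreate('async_source').meta.sessionId = 'session-1';
cache.getOrCreate('other_index'); // switching source destroys nothing
console.log(cache.getOrCreate('async_source').meta.sessionId);
```

Because switching back returns the same object, the session ID stored in `meta` is still available and the polling would not need to restart.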

Requirements

  • Convert the dataframe service into a dataframes service
  • Create a dataframe named after the source if no dataframe for that source exists in the cache
  • No errors for the temp index pattern creation (or generalize it to be a schema call that plugins can also replace)
  • Send dataframe to search
  • Populate the existing dataframe in interceptors instead of recreating it
  • Figure out when the cache is invalidated
    • Be explicit in design
    • Hydration strategy
    • Consider whether the schema has changed
  • Clean up code where needed
  • Tests
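
As one possible starting point for the "populate, don't recreate" and "has the schema changed" items, the sketch below fills an existing dataframe in place and reports whether the schema differs from what was cached. All names here (`CachedFrameSketch`, `sameSchema`, `populate`) are hypothetical, and what to do when the schema changes is left open, since the invalidation design is still to be decided.

```typescript
type FieldSketch = { name: string; type: string };

// Hypothetical shape, assumed for this sketch; not the real IDataFrame.
interface CachedFrameSketch {
  name: string;
  schema: FieldSketch[] | null;
  meta: Record<string, unknown>;
  rows: unknown[][];
}

// True when both schemas list the same fields with the same types, in order.
function sameSchema(a: FieldSketch[] | null, b: FieldSketch[] | null): boolean {
  if (a === null || b === null) return a === b;
  return (
    a.length === b.length &&
    a.every((f, i) => f.name === b[i].name && f.type === b[i].type)
  );
}

// Fill an existing dataframe from a response instead of building a new one,
// so meta (for example a session ID) survives. Returns true when a previously
// known schema was replaced by a different one.
function populate(
  df: CachedFrameSketch,
  schema: FieldSketch[],
  rows: unknown[][]
): boolean {
  const schemaChanged = df.schema !== null && !sameSchema(df.schema, schema);
  df.schema = schema;
  df.rows = rows;
  return schemaChanged;
}

const df: CachedFrameSketch = {
  name: 'async_source',
  schema: null,
  meta: { sessionId: 'session-1' },
  rows: [],
};
populate(df, [{ name: 'status', type: 'keyword' }], [['ok']]);
const changed = populate(df, [{ name: 'status', type: 'long' }], [[200]]);
console.log(changed, df.meta.sessionId);
```

The boolean return is only a hook: a caller could use it to decide whether dependent state, such as a temporary index pattern, needs to be rebuilt.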

Additional info

@sejli implemented a solution for the temp branch: https://github.com/sejli/OpenSearch-Dashboards/blob/0771fc900e877a79afe73b5b6d593be036d2e0f7/src/plugins/data/common/data_frames/_df_cache.ts#L37

It's great and solves the problem for async queries, but since we have time, we should take a more holistic approach.

@ashwin-pc ashwin-pc assigned ashwin-pc and abbyhu2000 and unassigned ashwin-pc Jun 14, 2024
@kavilla kavilla changed the title Convert dataframe service to a dataframes service [Discover-Next] support dataframes, not just a single dataframe Jun 14, 2024
@abbyhu2000 abbyhu2000 linked a pull request Jun 21, 2024 that will close this issue
7 tasks