[Search] SQL search strategy #127859

Merged
merged 14 commits into from
Mar 24, 2022

Conversation

Dosant
Contributor

@Dosant Dosant commented Mar 16, 2022

Summary

Closes #119280

This PR adds a "low-level" SQL search strategy that is implemented similarly to the default ESE_SEARCH_STRATEGY. The assumption is that it will be used in the unified search efforts.

It is aligned with the default ESE_SEARCH_STRATEGY:

  • The API (params) structure is similar
  • It supports abortSignal and executionContext
  • It doesn't do any query/response payload transformation, leaving that to the consumer or to higher-level search strategies for maximum flexibility
  • It uses the async search APIs to support long-running queries that run beyond the timeout, by polling for the results
  • It can support search sessions, but they are turned off for now because additional work is needed for search sessions to support SQL search: "SQL search strategy should support sessions" #127880
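
As a rough illustration of the submit-then-poll flow described in the list above, here is a minimal sketch. The AsyncSqlClient interface, the searchSql function, the AbortSignalLike type, and the poll interval are all hypothetical stand-ins for illustration; they are not the strategy's actual code or the real Elasticsearch client API.

```typescript
// Hypothetical response shape, loosely modeled on an async SQL response.
interface AsyncSqlResponse {
  id?: string;
  is_running: boolean;
  rows?: unknown[][];
}

// Hypothetical client interface (an assumption, not the Elasticsearch client).
interface AsyncSqlClient {
  submit(query: string): Promise<AsyncSqlResponse>;
  get(id: string): Promise<AsyncSqlResponse>;
  cancel(id: string): Promise<void>;
}

// Minimal stand-in for an AbortSignal so the sketch has no platform dependencies.
interface AbortSignalLike {
  readonly aborted: boolean;
}

interface PollOptions {
  signal?: AbortSignalLike;
  pollIntervalMs?: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Submit the query once, then poll by id until the query is no longer running.
// If the caller aborts while the query is still running, cancel it and throw.
async function searchSql(
  client: AsyncSqlClient,
  query: string,
  options: PollOptions = {}
): Promise<AsyncSqlResponse> {
  const { signal, pollIntervalMs = 100 } = options;
  let response = await client.submit(query);

  while (response.is_running) {
    if (response.id === undefined) {
      throw new Error('A running query must have an id to poll');
    }
    if (signal?.aborted) {
      await client.cancel(response.id);
      throw new Error('Aborted');
    }
    await sleep(pollIntervalMs);
    response = await client.get(response.id);
  }
  return response;
}
```

The response is returned untouched, which mirrors the "no payload transformation" point: shaping the rows is left to the caller.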

Compared to Canvas's existing SQL search strategy:

  • It doesn't do additional request/response transformation
  • It doesn't do server-side pagination internally, because that doesn't align with the async search and polling process

Next steps after this PR

  • Work with the [unified search] team on higher-level building blocks for their needs (replacement of SearchSource, expression, etc.)
  • Work with the [canvas] team to see if we can migrate to using this SQL strategy as a low-level primitive for their strategy: is async SQL stable enough to switch Canvas to it? Also figure out how to avoid having a second SQL expression function for this strategy
  • See what's next for search sessions before finishing this strategy's session support

@Dosant Dosant added Feature:Search Querying infrastructure in Kibana Team:AppServicesSv v8.2.0 release_note:skip Skip the PR/issue when compiling release notes labels Mar 17, 2022
@Dosant Dosant changed the title from "D/2022 03 14 sql ss" to "[Search] SQL search strategy" Mar 17, 2022
@Dosant Dosant marked this pull request as ready for review March 17, 2022 16:32
@Dosant Dosant requested a review from a team as a code owner March 17, 2022 16:32
@elasticmachine
Contributor

Pinging @elastic/kibana-app-services (Team:AppServicesSv)

@Dosant
Contributor Author

Dosant commented Mar 18, 2022

@elasticmachine merge upstream

@Dosant
Contributor Author

Dosant commented Mar 18, 2022

@elasticmachine merge upstream

@Dosant
Contributor Author

Dosant commented Mar 21, 2022

@elasticmachine merge upstream

Member

@lukasolson lukasolson left a comment

Added some minor feedback below, overall looking very good!

One thing I noticed was that the error handling varied depending on the error I got:

If I include an index that doesn't exist, I get this:

[Screenshot: error shown for a nonexistent index]

For incorrect syntax (such as SELECT * FROM logs-*, without quoting "logs-"):

[Screenshot: error shown for the syntax error]

In the latter scenario, it looks like the response does in fact contain useful information that should be bubbled up:

"parsing_exception: [parsing_exception] Reason: line 1:22: mismatched input '-' expecting {<EOF>, ',', 'ANALYZE', 'ANALYZED', 'AS', 'CATALOGS', 'COLUMNS', 'CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP', 'DAY', 'DEBUG', 'EXECUTABLE', 'EXPLAIN', 'FIRST', 'FORMAT', 'FULL', 'FUNCTIONS', 'GRAPHVIZ', 'GROUP', 'HAVING', 'HOUR', 'INNER', 'INTERVAL', 'JOIN', 'LAST', 'LEFT', 'LIMIT', 'MAPPED', 'MINUTE', 'MONTH', 'NATURAL', 'OPTIMIZED', 'ORDER', 'PARSED', 'PHYSICAL', 'PIVOT', 'PLAN', 'RIGHT', 'RLIKE', 'QUERY', 'SCHEMAS', 'SECOND', 'SHOW', 'SYS', 'TABLES', 'TEXT', 'TOP', 'TYPE', 'TYPES', 'VERIFY', 'WHERE', 'YEAR', LIMIT_ESC, IDENTIFIER, DIGIT_IDENTIFIER, QUOTED_IDENTIFIER, BACKQUOTED_IDENTIFIER}"
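
To illustrate what "bubbling up" the reason could look like, here is a minimal sketch that prefers the Elasticsearch-provided type and reason over a generic message. The getErrorMessage helper and the assumption that the thrown error carries a parsed `body` are hypothetical; the actual fix in this PR was in the example plugin and may look different.

```typescript
// Subset of the Elasticsearch error body: { error: { type, reason }, status }.
interface EsErrorBody {
  error?: { type?: string; reason?: string };
  status?: number;
}

// Hypothetical helper: assumes the thrown error may carry a parsed `body`.
function getErrorMessage(err: unknown): string {
  if (typeof err === 'object' && err !== null) {
    const cause = (err as { body?: EsErrorBody }).body?.error;
    if (cause?.reason) {
      // Prefer the specific reason reported by Elasticsearch.
      return cause.type ? `${cause.type}: ${cause.reason}` : cause.reason;
    }
    if (err instanceof Error) {
      return err.message;
    }
  }
  // Fall back to a plain string conversion for anything else.
  return String(err);
}
```

With this shape, a syntax error and a missing-index error would both surface their reason through the same code path.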

</EuiPageContent>
</EuiPageBody>
);
};
Member

How simple would it be to add a datagrid with the table results? I think it would be a nice addition. (Or maybe being able to swap the view between a datagrid and the raw response.)

Member

Conversion is already implemented in the Canvas essql search strategy. The plan was to offer a higher-level abstraction that allows passing in a Kibana query (Filter[], TimeRange, Query[]) and some additional options, and that returns a Datatable.
We shouldn't encourage people to use this low-level strategy; preferably we should be the only consumers, IMO. What do you think about making this just our internal implementation detail?

Contributor Author

(Or maybe being able to swap the view between a datagrid and the raw response.)

I think Lukas just meant improving the UI of the example plugin here?

Member

Sorry, meant to leave a comment here but GitHub wasn't cooperating at the time. Yes, I just meant improve the UI. We can take care of this in a separate PR if we decide it's helpful.

export const SQL_SEARCH_STRATEGY = 'sql';

export type SqlRequestParams =
| Omit<SqlQueryRequest, 'wait_for_completion_timeout' | 'keep_alive' | 'keep_on_completion'>

Member

Does this mean we wouldn't allow overriding the value for wait_for_completion_timeout by consumers?

Contributor Author

Good point. I agree that allowing consumers to override wait_for_completion_timeout makes sense, since it doesn't interfere with search sessions. Fixing.
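
A minimal sketch of what that could look like: the consumer may override wait_for_completion_timeout, while keep_alive and keep_on_completion stay under the strategy's control. The types, the buildSubmitRequest helper, and the default values other than keep_alive: '1m' are assumptions for illustration, not the code in this PR.

```typescript
// Simplified, hypothetical stand-ins for the real request types.
interface SqlAsyncParams {
  wait_for_completion_timeout?: string;
  keep_alive?: string;
  keep_on_completion?: boolean;
}

interface SqlRequest extends SqlAsyncParams {
  query: string;
}

// Consumers may set wait_for_completion_timeout, but not the session-related params.
type ConsumerParams = Omit<SqlRequest, 'keep_alive' | 'keep_on_completion'>;

// Assumed defaults; only keep_alive: '1m' appears in the diff under review.
const STRATEGY_DEFAULTS: Required<SqlAsyncParams> = {
  wait_for_completion_timeout: '100ms',
  keep_alive: '1m',
  keep_on_completion: false,
};

function buildSubmitRequest(params: ConsumerParams): SqlRequest {
  return {
    ...STRATEGY_DEFAULTS,
    ...params,
    // Re-apply the strategy-controlled params so untyped callers cannot override them.
    keep_alive: STRATEGY_DEFAULTS.keep_alive,
    keep_on_completion: STRATEGY_DEFAULTS.keep_on_completion,
  };
}
```

Re-applying the two controlled params after the spread guards against callers that bypass the type, for example plain JavaScript consumers.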

keep_alive: '1m',
}),
};
}

Member

Is there enough overlap between this and the ese request utils file to warrant combining them? It seems like the parameters are almost identical. Do the different request types make this more difficult than it is worth?

Contributor Author

They are quite different. I think combining them would bring more complexity than benefit.

Member

This is honestly basically a nit, but I just worry that in the future when we make changes to how we handle keep_alive, wait_for_completion_timeout, and keep_on_completion, we'll have to make them in two places.

What are your thoughts of doing something like this:

import {
  getDefaultAsyncSubmitParams as getEsAsyncSubmitParams,
  getDefaultAsyncGetParams as getEsAsyncGetParams,
} from '../../ese_search/request_utils';

// Reuse the ES async submit defaults, keeping only the params that SQL also accepts.
export function getDefaultAsyncSubmitParams(
  searchSessionsConfig: SearchSessionsConfigSchema | null,
  options: ISearchOptions
): Pick<SqlQueryRequest, 'keep_alive' | 'wait_for_completion_timeout' | 'keep_on_completion'> {
  const { wait_for_completion_timeout, keep_on_completion, keep_alive } = getEsAsyncSubmitParams(
    searchSessionsConfig,
    options
  );
  return { wait_for_completion_timeout, keep_on_completion, keep_alive };
}

// Reuse the ES async get (polling) defaults.
export function getDefaultAsyncGetParams(
  searchSessionsConfig: SearchSessionsConfigSchema | null,
  options: ISearchOptions
): Pick<SqlGetAsyncRequest, 'keep_alive' | 'wait_for_completion_timeout'> {
  return getEsAsyncGetParams(searchSessionsConfig, options);
}

Or better yet, pulling these parameters that we know that every async strategy is going to use into a higher level request_utils.ts. (I'm noticing even EQL search strategy re-uses these from ese_search/request_utils.)

Contributor Author

@Dosant Dosant Mar 23, 2022

Let's do it 🚀

I suggest we do it separately and review it carefully. I started a draft here: #128358
The idea is to extract the polling/async/session-related code into common/async_utils and then use it in every async strategy.

I suggest we merge this first, and then I'll finish the draft.

Member

🙌 Sounds great

isRunning: response.is_running,
...(warning ? { warning } : {}),
};
}

Member

Same feedback here as above... can we use the method from ese_search/response_utils here?

Contributor Author

@Dosant Dosant Mar 22, 2022

They're not very different: ese_search/response_utils just adds { total, loaded }, which SQL doesn't have. But I don't think it is worth sharing anything here.
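
For context, a minimal sketch of the kind of mapping being discussed: the SQL response is wrapped without total/loaded, and the warning is only attached when present. The type names and the toSqlSearchResponse function are hypothetical simplifications, not the code under review.

```typescript
// Assumed subset of an async SQL response.
interface SqlRawResponse {
  id?: string;
  is_running?: boolean;
  is_partial?: boolean;
}

// Hypothetical simplified wrapper; note there is no { total, loaded } for SQL.
interface SqlSearchResponse {
  id?: string;
  rawResponse: SqlRawResponse;
  isPartial?: boolean;
  isRunning?: boolean;
  warning?: string;
}

function toSqlSearchResponse(response: SqlRawResponse, warning?: string): SqlSearchResponse {
  return {
    id: response.id,
    rawResponse: response, // passed through without transformation
    isPartial: response.is_partial,
    isRunning: response.is_running,
    // Only add the key when there is a warning to report.
    ...(warning ? { warning } : {}),
  };
}
```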

});
});
});
}

Member

Dang, it would be nice if we could add another test here that used something similar to shard_delay for SQL requests to verify that the workflow works for submit/poll until completion.

Contributor Author

I agree. I don't think it is available now. I created an issue for Elasticsearch to help us: elastic/elasticsearch#85214

@Dosant
Contributor Author

Dosant commented Mar 22, 2022

One thing I noticed was that the error handling varied depending on the error I got:

@lukasolson, thanks! This should be fixed. It was an issue in the example itself.

@Dosant Dosant requested a review from lukasolson March 22, 2022 11:51
Member

@ppisljar ppisljar left a comment

LGTM

@Dosant
Contributor Author

Dosant commented Mar 23, 2022

@elasticmachine merge upstream

Member

@lukasolson lukasolson left a comment

LGTM!

@Dosant
Contributor Author

Dosant commented Mar 24, 2022

@elasticmachine merge upstream

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
data 533 535 +2

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
data 2840 2844 +4

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
kibana 310 313 +3

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
data 459.4KB 459.5KB +98.0B

Unknown metric groups

API count

id before after diff
data 3453 3457 +4

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@Dosant Dosant merged commit 0421f86 into elastic:main Mar 24, 2022
Labels
Feature:Search Querying infrastructure in Kibana release_note:skip Skip the PR/issue when compiling release notes v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[search] align sql search strategy with ese
6 participants