-
Notifications
You must be signed in to change notification settings - Fork 891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for s3 fields in discover #8609
Add support for s3 fields in discover #8609
Conversation
❌ Empty Changelog SectionThe Changelog section in your PR description is empty. Please add a valid changelog entry or entries. If you did add a changelog entry, check to make sure that it was not accidentally included inside the comment block in the Changelog section. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8609 +/- ##
=======================================
Coverage 60.85% 60.86%
=======================================
Files 3793 3793
Lines 90368 90442 +74
Branches 14181 14198 +17
=======================================
+ Hits 54997 55045 +48
- Misses 31893 31909 +16
- Partials 3478 3488 +10
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While most of the comments are abou code changes that dont affect users and can be fast followed, there are 2 user facing bugs that we need to address:
- There are 3 requests fired when both fields are being loaded and the query is running. I can reproduce this the easiest when ive selected an s3 datasource and refreshed the page. See screenshot
- The available fields just dint stop spinning when the page refreshed even once the requests are successful. See secind screenshot
src/plugins/data/common/index_patterns/index_patterns/index_pattern.ts
Outdated
Show resolved
Hide resolved
src/plugins/data/public/query/query_string/dataset_service/dataset_service.ts
Outdated
Show resolved
Hide resolved
src/plugins/data/public/query/query_string/dataset_service/types.ts
Outdated
Show resolved
Hide resolved
src/plugins/discover/public/application/components/sidebar/discover_sidebar.tsx
Show resolved
Hide resolved
src/plugins/data/public/query/query_string/dataset_service/dataset_service.ts
Outdated
Show resolved
Hide resolved
src/plugins/discover/public/application/view_components/utils/use_index_pattern.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't finish my review as I assume there will be changes based on ashwin's comments
src/plugins/data/common/index_patterns/index_patterns/index_pattern.ts
Outdated
Show resolved
Hide resolved
2d19e68
to
5976444
Compare
Looks like Flint is breaking on main, get this when running any s3 based SQL query:
|
Debugging the commits led me this 923cce8. this commit might be the root-cause why we're seeing this weird behavior for flint datasources |
5976444
to
72f0b6e
Compare
Current status:
Video of local test with the revert dataselector changes: flint-s3-discover.mov |
Signed-off-by: Shenoy Pratik <[email protected]>
Signed-off-by: Shenoy Pratik <[email protected]>
Signed-off-by: Shenoy Pratik <[email protected]>
72f0b6e
to
000834f
Compare
Signed-off-by: Shenoy Pratik <[email protected]>
Signed-off-by: Shenoy Pratik <[email protected]>
@@ -28,6 +28,8 @@ export interface DataSourceMeta { | |||
name?: string; | |||
/** Optional session ID for faster responses when utilizing async query sources */ | |||
sessionId?: string; | |||
/** Optional supportsTimeFilter determines if a time filter is needed */ | |||
supportsTimeFilter?: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be controlled based on whether timeField
is present in the dataset rather than additional flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today even if timeField
is present we don't have support for S3 datasources to automatic time filter injection in the query. This is surely where we want to go towards in future though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some older conversations around this: #8337 (comment), #8337 (comment)
const temporaryIndexPattern = await this.indexPatterns?.create(spec, true); | ||
|
||
// Load schema asynchronously if it's an async index pattern | ||
if (asyncType && temporaryIndexPattern) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does async index patterns mean the metadata like field schema is not stored in the index pattern saved object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Async index patterns means that the fields are lazy loaded; i.e. the temporary index pattern is created with empty fields and when the fields are loaded in the index pattern gets hydrated with the fields.
@@ -80,6 +83,11 @@ export const Configurator = ({ | |||
setTimeFields(dateFields || []); | |||
}; | |||
|
|||
if (baseDataset?.dataSource?.meta?.supportsTimeFilter === false) { | |||
setTimeFields([]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this []
creates a different object every time when called, and will trigger a re-render. might be better to add a guard
setTimeFields([]); | |
if (timeFields.length > 0) setTimeFields([]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here: 53198c4
<EuiSplitPanel.Inner | ||
className="eui-yScroll dscSideBar_fieldListContainer" | ||
paddingSize="none" | ||
> | ||
{fields.length > 0 && ( | ||
{(fields.length > 0 || selectedIndexPattern.fieldsLoading) && ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if loading and fields
list is empty, what does UI show? should it indicate that it's loading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If loading and fields list is empty -> The UI shows the sidebar in the loading state.
@@ -312,6 +313,7 @@ const FieldList = ({ | |||
size="xs" | |||
className="dscSideBar_fieldGroup" | |||
aria-label={title} | |||
isLoading={selectedIndexPattern.fieldsLoading ?? false} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isLoading={selectedIndexPattern.fieldsLoading ?? false} | |
isLoading={selectedIndexPattern.fieldsLoading} |
or coerce to boolean !!selectedIndexPattern.fieldsLoading
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coerce makes sense: 53198c4
export interface SQLQueryResponse { | ||
status: string; | ||
schema: Array<{ name: string; type: string }>; | ||
datarows: Array<Array<string | null>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be
datarows: Array<Array<string | null>>; | |
datarows: Array<Array<unknown>>; |
i think columns in SQL response is not always string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here: 53198c4
const sessionId = (dataset.dataSource?.meta as DataStructureCustomMeta).sessionId; | ||
const response = await http.fetch({ | ||
method: 'POST', | ||
path: trimEnd(`${API.DATA_SOURCE.ASYNC_JOBS}`), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit.
path: trimEnd(`${API.DATA_SOURCE.ASYNC_JOBS}`), | |
path: trimEnd(API.DATA_SOURCE.ASYNC_JOBS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here: 53198c4
fetchStatus: () => | ||
http.fetch({ | ||
method: 'GET', | ||
path: trimEnd(`${API.DATA_SOURCE.ASYNC_JOBS}`), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
path: trimEnd(`${API.DATA_SOURCE.ASYNC_JOBS}`), | |
path: trimEnd(API.DATA_SOURCE.ASYNC_JOBS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here: 53198c4
Signed-off-by: Shenoy Pratik <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't pull down to test but from code perspective lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't have time to pull this down to test, but approving assuming bugs Ashwin brought up are resolved (which appear to be the case from comments)
Other comments are not blocking
.fetchFields(dataset, services) | ||
.then((fields) => { | ||
temporaryIndexPattern.fields.replaceAll([...fields]); | ||
this.indexPatterns?.saveToCache(dataset.id, temporaryIndexPattern); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handling of this request should ideally be tested. Can we follow up on this?
path: trimEnd(API.DATA_SOURCE.ASYNC_JOBS), | ||
body: JSON.stringify({ | ||
lang: 'sql', | ||
query: `DESCRIBE TABLE ${dataset.title}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be escaped in anyway or can we trust dataset metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When i pulled it down im seeing issues with the state. Blocking so that we dont accidentally merge it in the meantime
My bad. Still had the older commits when i rviewed. When i pulled down the latest changes, it works
Screen.Recording.2024-10-22.at.6.52.21.PM.movNoticed one bug, but wont block it because it can come in a fast follow |
* add support for s3 fields in discover Signed-off-by: Shenoy Pratik <[email protected]> * Changeset file for PR #8609 created/updated * resolve comments, make fields fetch async Signed-off-by: Shenoy Pratik <[email protected]> * fix unit tests Signed-off-by: Shenoy Pratik <[email protected]> * update services to be Partial<IDataPluginServices> Signed-off-by: Shenoy Pratik <[email protected]> * fix async field fetch in cachedataset Signed-off-by: Shenoy Pratik <[email protected]> * resolve comments Signed-off-by: Shenoy Pratik <[email protected]> --------- Signed-off-by: Shenoy Pratik <[email protected]> Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com> (cherry picked from commit 12d072d) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* add support for s3 fields in discover * Changeset file for PR #8609 created/updated * resolve comments, make fields fetch async * fix unit tests * update services to be Partial<IDataPluginServices> * fix async field fetch in cachedataset * resolve comments --------- (cherry picked from commit 12d072d) Signed-off-by: Shenoy Pratik <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com>
* add support for s3 fields in discover Signed-off-by: Shenoy Pratik <[email protected]> * Changeset file for PR opensearch-project#8609 created/updated * resolve comments, make fields fetch async Signed-off-by: Shenoy Pratik <[email protected]> * fix unit tests Signed-off-by: Shenoy Pratik <[email protected]> * update services to be Partial<IDataPluginServices> Signed-off-by: Shenoy Pratik <[email protected]> * fix async field fetch in cachedataset Signed-off-by: Shenoy Pratik <[email protected]> * resolve comments Signed-off-by: Shenoy Pratik <[email protected]> --------- Signed-off-by: Shenoy Pratik <[email protected]> Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com>
Description
Issues Resolved
Screenshot
Screen.Recording.2024-10-16.at.8.42.02.AM.mov
Changelog
Check List
yarn test:jest
yarn test:jest_integration