perf: Optimize Browse page queries #1236

Merged 48 commits on Oct 31, 2023
Changes from 1 commit
Commits
6143c45
fix: Allow dataset tags to wrap
bprusinowski Oct 25, 2023
c82f421
perf: Do not query dataset counts
bprusinowski Oct 25, 2023
5621193
perf: Do not use two queries when searching for cubes
bprusinowski Oct 25, 2023
4ed5fdf
fix: Point to a correct graph to retrieve creator labels
bprusinowski Oct 25, 2023
3d23a44
perf: Use scalar for nested objects (GQL SearchCubes)
bprusinowski Oct 25, 2023
a6220e5
style: Prevent layout shift when draft tag is added
bprusinowski Oct 26, 2023
ea5501e
fix: Types
bprusinowski Oct 26, 2023
6e39852
refactor: Use scalar for SearchCube
bprusinowski Oct 26, 2023
8a11fb2
fix: Do not measure timing twice
bprusinowski Oct 26, 2023
f8fbe2f
perf: Optimize cube search query
bprusinowski Oct 26, 2023
fbba91b
fix: Order cubes by version history and by desc(iri) to select newest…
bprusinowski Oct 26, 2023
467a17a
fix: Types
bprusinowski Oct 26, 2023
15d0b76
fix: Do not filter by subthemes on the server side
bprusinowski Oct 26, 2023
e4c0278
fix: Tests
bprusinowski Oct 26, 2023
a3b4a8f
refactor: Clean up
bprusinowski Oct 26, 2023
ac5e6ed
fix: Missing text input value after refresh
bprusinowski Oct 26, 2023
3143679
fix: Type
bprusinowski Oct 26, 2023
91a298c
fix: Removing browse theme and organization filters
bprusinowski Oct 26, 2023
826ebeb
feat: Allow subthemes to be displayed when organization is second sec…
bprusinowski Oct 26, 2023
a2b4aef
fix: Selecting subthemes when both sections are filtered
bprusinowski Oct 26, 2023
bde860c
fix: Do not clear the cubes when new query is being executed
bprusinowski Oct 26, 2023
d1eff45
fix: Only update browse state if router is ready
bprusinowski Oct 26, 2023
4f65f10
docs: Update CHANGELOG
bprusinowski Oct 26, 2023
a726792
fix: Include subthemes when calculating cube search scores
bprusinowski Oct 27, 2023
95341b1
perf: Specify graph when querying themes
bprusinowski Oct 27, 2023
986e842
perf: Specify graph for subthemes
bprusinowski Oct 27, 2023
7b9c508
chore: Comment out published date datatype condition for now
bprusinowski Oct 27, 2023
4b9e50e
fix: Keep multi-lang filtering
bprusinowski Oct 27, 2023
d16b2b9
fix: Tests
bprusinowski Oct 27, 2023
6b1cedc
test: Add more E2E search tests
bprusinowski Oct 27, 2023
41353f6
fix: creativeWorkStatus has to be set
bprusinowski Oct 27, 2023
c02a7d2
style: Animate filters title
bprusinowski Oct 27, 2023
7631f7d
perf: Do not fire cubes, themes and organizations queries on dataset …
bprusinowski Oct 27, 2023
e2de8ca
fix: Do not filter publishers by language
bprusinowski Oct 27, 2023
3ed4efc
perf: Further optimize cube search query
bprusinowski Oct 30, 2023
a01b3f7
feat: Improve GQLDebugPanel
bprusinowski Oct 30, 2023
e90b2a2
fix: Search query keywords
bprusinowski Oct 30, 2023
b4e87bc
fix: Copy button placement
bprusinowski Oct 30, 2023
d8c50c2
fix: Show published and draft versions of cubes with the same version…
bprusinowski Oct 30, 2023
8d87a82
perf: Optimize search query
bprusinowski Oct 30, 2023
4652385
perf: Remove version history-related logic from SearchCubes query
bprusinowski Oct 30, 2023
e708814
perf: Optimize search cube queries by concatenating themes and sub-th…
bprusinowski Oct 30, 2023
d4117a6
fix: No results (search cubes)
bprusinowski Oct 31, 2023
67c6bdc
chore: Typo
bprusinowski Oct 31, 2023
056c640
fix: Theme filtering needs to happen in HAVING part of the query
bprusinowski Oct 31, 2023
7815ab2
perf: Do not query themes and organizations separately
bprusinowski Oct 31, 2023
0497a97
fix: Exclude topic when constructing remove URL (search filters nav i…
bprusinowski Oct 31, 2023
edb4e91
perf: Drop HAVING in favor of direct filtering
bprusinowski Oct 31, 2023
12 changes: 8 additions & 4 deletions app/rdf/query-search.ts
@@ -121,7 +121,7 @@ export const searchCubes = async ({
?.filter((x) => x.type === "DataCubeOrganization")
.map((v) => v.value) ?? [];

- const scoresQuery = SELECT`
+ let scoresQuery = SELECT`
?iri ?title ?status ?datePublished ?description ?publisher ?creatorIri ?creatorLabel
(GROUP_CONCAT(DISTINCT ?themeIri; SEPARATOR="${GROUP_SEPARATOR}") AS ?themeIris) (GROUP_CONCAT(DISTINCT ?themeLabel; SEPARATOR="${GROUP_SEPARATOR}") AS ?themeLabels)
(GROUP_CONCAT(DISTINCT ?subthemeIri; SEPARATOR="${GROUP_SEPARATOR}") AS ?subthemeIris) (GROUP_CONCAT(DISTINCT ?subthemeLabel; SEPARATOR="${GROUP_SEPARATOR}") AS ?subthemeLabels)
@@ -152,7 +152,7 @@ export const searchCubes = async ({
)}
}
}
- ${makeInFilter("creatorIri", creatorValues)}
+ ${creatorValues.length ? makeInFilter("creatorIri", creatorValues) : ""}

OPTIONAL {
?iri ${ns.dcat.theme} ?themeIri .
@@ -166,7 +166,6 @@ export const searchCubes = async ({
})}
}
Collaborator

Out of curiosity: why do we need another graph for themes?

Collaborator Author

I think it's better for performance if we narrow down the graph, so the engine doesn't need to look everywhere to find things. @Rdataflow might have a better understanding here :)

Contributor
Rdataflow, Oct 31, 2023

@bprusinowski well understood 😄 it's about narrowing down the lookup to one GRAPH (a.k.a. one partition of the SPARQL database)
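
To make the point about narrowing the lookup concrete, here is a minimal sketch of wrapping a triple pattern in a GRAPH clause. The helper name inGraph and the graph IRI are hypothetical and not taken from this PR; whether an unscoped pattern really touches every graph depends on how the store configures its default graph.

```typescript
// Hypothetical helper: wraps a SPARQL pattern in a GRAPH clause so the
// engine only looks inside one named graph.
const inGraph = (graphIri: string, pattern: string): string =>
  `GRAPH <${graphIri}> {\n  ${pattern}\n}`;

// Assumed graph IRI, for illustration only.
const THEME_GRAPH = "https://example.org/graph/themes";

// Unscoped: matched against whatever the store treats as the default graph,
// which in some setups is the union of all named graphs.
const unscoped = `?themeIri <http://schema.org/name> ?themeLabel .`;

// Scoped: the same pattern, restricted to the themes graph.
const scoped = inGraph(THEME_GRAPH, unscoped);

console.log(scoped);
```

The scoped string can be interpolated into a larger query template in the same way as the other fragments in this file.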

}
- ${makeInFilter("themeIri", themeValues)}

# Add more subtheme termsets here when they are available
${
@@ -238,11 +237,16 @@ export const searchCubes = async ({
)`
: ""
}

`.GROUP().BY`?iri`.THEN.BY`?title`.THEN.BY`?status`.THEN.BY`?datePublished`
.THEN.BY`?description`.THEN.BY`?publisher`.THEN.BY`?creatorIri`.THEN
.BY`?creatorLabel`.prologue`${pragmas}`;

+ if (themeValues.length) {
+ scoresQuery = scoresQuery.HAVING`${themeValues
Contributor

@bprusinowski what was the issue with FILTER()? Can we find a fix for that one?

Collaborator Author

Since we removed the separate organizations and themes queries, when we filter by e.g. the Administration theme we still need to retrieve all themes attached to a given cube, not only Administration, as otherwise the left filter panel would be distorted. That's why we can't use FILTER: instead we retrieve all themes and only filter on the concatenated string afterwards 👀

Collaborator Author

But it doesn't look like it affected the performance in any noticeable way
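
The difference between filtering rows before grouping and filtering on the concatenated string afterwards can be simulated in memory. This is only a sketch: the rows, the IRIs and the separator value are made up, and the grouping function merely mimics GROUP BY with GROUP_CONCAT(DISTINCT ...).

```typescript
// One row per (cube, theme) pair, as the OPTIONAL theme pattern would
// produce before grouping. All IRIs here are made up for illustration.
type Row = { iri: string; themeIri: string };

const rows: Row[] = [
  { iri: "cube/1", themeIri: "theme/administration" },
  { iri: "cube/1", themeIri: "theme/energy" },
  { iri: "cube/2", themeIri: "theme/agriculture" },
];

// Stand-in for GROUP_SEPARATOR; its real value is not shown in the diff.
const SEPARATOR = "|";

// Mimics GROUP BY ?iri with GROUP_CONCAT(DISTINCT ?themeIri; SEPARATOR=...).
const groupThemes = (input: Row[]): Record<string, string> => {
  const themesByCube: Record<string, string[]> = {};
  for (const { iri, themeIri } of input) {
    const themes = themesByCube[iri] || (themesByCube[iri] = []);
    if (themes.indexOf(themeIri) === -1) themes.push(themeIri);
  }
  const result: Record<string, string> = {};
  for (const iri of Object.keys(themesByCube)) {
    result[iri] = themesByCube[iri].join(SEPARATOR);
  }
  return result;
};

const selected = "theme/administration";

// FILTER-style: rows are dropped before grouping, so cube/1 loses "energy".
const filterFirst = groupThemes(rows.filter((r) => r.themeIri === selected));

// HAVING-style: group first, then keep cubes whose concatenated themes
// contain the selected IRI; cube/1 keeps both of its themes.
const grouped = groupThemes(rows);
const havingStyle: Record<string, string> = {};
for (const iri of Object.keys(grouped)) {
  if (grouped[iri].toLowerCase().indexOf(selected.toLowerCase()) !== -1) {
    havingStyle[iri] = grouped[iri];
  }
}

console.log(JSON.stringify(filterFirst)); // {"cube/1":"theme/administration"}
console.log(JSON.stringify(havingStyle)); // {"cube/1":"theme/administration|theme/energy"}
```

In both cases only cube/1 is returned, but only the second variant still reports its full theme list, which is what the filter panel needs.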

Contributor

OK, you don't want to lose the other themes on the selected cubes, right?
Instead of the poorly performing HAVING, you could state the theme directly as a triple pattern, e.g. ?iri dcat:theme <https://register.ld.admin.ch/opendataswiss/category/agriculture> .
That would give the desired filtering effect, wouldn't it?
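
One way this suggestion could be sketched is to constrain the cube through a separate variable, so that ?themeIri stays free to collect every theme. The helper makeThemeConstraint and the ?selectedTheme variable are hypothetical, and the use of VALUES to cover several selected themes is an assumption; the query actually committed in the PR is not shown here.

```typescript
// Full IRI of dcat:theme (DCAT namespace: http://www.w3.org/ns/dcat#).
const DCAT_THEME = "<http://www.w3.org/ns/dcat#theme>";

// Hypothetical helper, not from the PR: requires each cube to carry at least
// one of the selected themes, using a variable separate from ?themeIri.
const makeThemeConstraint = (themeIris: string[]): string => {
  if (themeIris.length === 0) return ""; // no selection, no constraint
  const values = themeIris.map((iri) => `<${iri}>`).join(" ");
  return `VALUES ?selectedTheme { ${values} }\n?iri ${DCAT_THEME} ?selectedTheme .`;
};

// ?themeIri stays unconstrained, so GROUP_CONCAT still sees every theme.
const allThemesPattern = `OPTIONAL { ?iri ${DCAT_THEME} ?themeIri . }`;

const constraint = makeThemeConstraint([
  "https://register.ld.admin.ch/opendataswiss/category/agriculture",
]);

console.log(`${constraint}\n${allThemesPattern}`);
```

Because the restriction is an ordinary triple pattern, the engine can apply it while matching instead of after aggregation, which is the performance argument made in the comment above.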

Collaborator Author

Yup, you are right – I updated the query 🎉

+ .map((iri) => `CONTAINS(LCASE(?themeIris), LCASE("${iri}"))`)
+ .join(" || ")}` as any;
+ }

const scoreResults = await scoresQuery.execute(sparqlClient.query, {
operation: "postUrlencoded",
});
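
The guard on creatorValues in the second hunk skips the filter entirely when nothing is selected. A minimal sketch of why that can matter, assuming a helper that renders FILTER(?var IN (...)): the functions below are hypothetical stand-ins, not the project's makeInFilter, whose implementation is not part of this diff.

```typescript
// Hypothetical stand-in for a helper like makeInFilter; the project's real
// implementation is not shown in this diff.
const renderInFilter = (varName: string, iris: string[]): string =>
  `FILTER(?${varName} IN (${iris.map((iri) => `<${iri}>`).join(", ")}))`;

// Guarded variant, mirroring the `values.length ? ... : ""` check in the hunk.
const renderInFilterIfAny = (varName: string, iris: string[]): string =>
  iris.length ? renderInFilter(varName, iris) : "";

// In SPARQL an empty IN list evaluates to false, so this filter would
// reject every row.
console.log(renderInFilter("creatorIri", [])); // FILTER(?creatorIri IN ())

// With the guard, no filter is emitted and the query stays unrestricted.
console.log(JSON.stringify(renderInFilterIfAny("creatorIri", []))); // ""

console.log(renderInFilterIfAny("creatorIri", ["https://example.org/org/a"]));
// FILTER(?creatorIri IN (<https://example.org/org/a>))
```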