-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Duplicated hierarchies #1260
Conversation
...we should retrieve the name of actual hierarchy, not nextInHierarchy
...as there are only ~3 legacy hierarchies on LINDAS PROD, it makes sense to remove them.
...as otherwise we also include hierachies retrieved through DESCRIBEing the cube, which are not part of the shape.
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't see anything out of the ordinary, but also don't have much experience with the cube queries.
@bprusinowski LGTM 👍 |
…into fix/hierarchies-querying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM, I think it's good to have a splitted issue with the future of ExtendedCube. Thanks !
const cube = await loaders.cube.load(iri); | ||
const rawCube = await loaders.cube.load(iri); | ||
// Currently we always default to the latest cube. | ||
const cube = await getLatestCube(rawCube); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the splitted issue, getLatestCube could be put on ExtendedCube.
import rdf from "rdf-ext"; | ||
import DatasetExt from "rdf-ext/lib/Dataset"; | ||
|
||
export class ExtendedCube extends Cube { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cna you add a docstring describing the general goal ? So that when we see a method that should be there, we can be sure that this class is the right one ?
} | ||
return []; | ||
|
||
return uniqBy(hierarchies, (hierarchy) => getName(hierarchy, locale)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the splitted issue : getDimensionHierarchies could be on ExtendedCube. This way we will hide the complexity of the shapePtr there.
Fixes #1257.
Closes #1263.
This PR:
hasHierarchy
), as there is only one such hierarchy on LINDAS,Context
It looks like there's too many hierarchies fetched in the example chart from #1257 (https://visualize.admin.ch/de/v/IYdcAI0tH0Io?dataSource=Prod).
There are two issues visible in this chart:
double.selection.mov
(Apparent) duplicated hierarchies
First issue occurs due to retrieving name from
hierarchyPointer.out(ns.cubeMeta.nextInHierarchy)
instead ofhierarchyPointer
directly. @ptbrowne do you remember the context here (it looks like this has been done to avoid duplication)? After looking at docs and investigating the data on LINDAS, I think we should base directly onhierarchyPointer
– if we do it, the name of hierarchies are unique (Schweiz - Produktionsregion - Wirtschaftsregion
andSchweiz - Produktionsregion
).Hierarchy that (most probably) shouldn't be fetched at all
Second one is more weird – generally, when we fetch a cube, two queries are fired, one for cube and the other one for cube shape. It looks like this behavior results in more hierarchies that we're actually interested in, as (I think, also based on kronmar's query), we should only fetch hierarchies that are specified in the cube's shape.
I didn't investigate enough to say for sure why these additional hierarchies are being fetched along with
cube
, but if we just fetchcube shape
, it retrieves correct hierarchies.It's a bit unfortunate, as basically we need to fetch shape twice, once initially, and afterwards again when fetching hierarchies, as it looks we can't distinguish if a given triple came fromcube
orshape
. Some optimisation might be possible, but I think it would need to be somehow connected with changes tocube-view-query
library, to be able to tell if a given triple came from one or the other query.I introduced an
ExtendedCube
class to allow us to store shape triples insideshapePtr
, thus removing a need of fetching the shape twice. It might be useful to introduce such improvement directly tocube-view-query
library, or to better understand why there is hierarchy mismatch when fetching cube and shape.cc @Rdataflow