[RFC] Internal row structure for partial filtering / sorting and lazy-loading #4851
Comments
@m4theushw @cherniavskii @DanailH @alexfauquette I think this is a major change that we need to prepare for v6 (whatever structure we decide to take, even if it's very different from what I describe here) if we want to have good data management with lazy loading and row grouping.
Great summary @flaviendelangle. I think it's good to start discussing this topic now as it will also impact the way the lazy loading will work initially.
Just to be sure I understand the problem correctly: we would need to know which row is a parent of a group in advance, correct? Because if it is about the children, can't we put fake rows inside the group, for example skeleton rows, which will have to be inserted either way?
Not sure I understand, sorry if my answer misses your point. The problem for row grouping with lazy loading is that I need to know the list of the groups without having the rows to group them. For instance, the client says "I have 3 release years: 2022, 2021 and 2020" without giving me one movie of each group. We could of course ask for the list of groups, then create fake rows that would fit into these groups, then create the Row Grouping tree, and we would have those groups. These groups could have auto-generated children (for skeleton loading for instance). My main goal is to create a real notion of group, to be able to apply logic to a group (sorting / filtering / aggregation / lazy loading / ...). It's a step forward in the differentiation between the data and the visible rows. On one side you have the data, where everything is organized by group and you can trigger logic on specific groups or on all of them (i.e. on the top-level group). On the other side, you have the visible rows, which are flat and make no difference between a real row, a footer, a grouping row, etc.
Side note: my RFC does not cover the impact on the sorting and filtering state.
filteredRows: { [groupId: string]: { [rowId: GridRowId]: boolean } }
That way we can easily re-run the sorting / filtering on some of the groups.
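To make the per-group filtering state above more concrete, here is a minimal sketch of how filtering could be re-run on a subset of groups only. The names `FilteredRowsLookup`, `MovieRow` and `refilterGroups` are hypothetical and only serve the illustration; they are not part of the grid API.

```typescript
type GridRowId = string | number;

// Hypothetical shape following the comment above: one visibility lookup per group.
// Row ids are stringified so they can be used as object keys.
type FilteredRowsLookup = { [groupId: string]: { [rowId: string]: boolean } };

interface MovieRow {
  id: GridRowId;
  title: string;
}

// Re-run the filter only for the given groups.
// Groups that are not listed keep their previous lookup object (same reference).
function refilterGroups(
  previousLookup: FilteredRowsLookup,
  rowsByGroup: Record<string, MovieRow[]>,
  groupIds: string[],
  predicate: (row: MovieRow) => boolean,
): FilteredRowsLookup {
  const nextLookup: FilteredRowsLookup = { ...previousLookup };
  for (const groupId of groupIds) {
    const groupLookup: { [rowId: string]: boolean } = {};
    for (const row of rowsByGroup[groupId] ?? []) {
      groupLookup[String(row.id)] = predicate(row);
    }
    nextLookup[groupId] = groupLookup;
  }
  return nextLookup;
}
```

Because untouched groups keep the same object reference, selectors that depend on a single group would not need to recompute.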
I have a few minor suggestions to the state structure above:
@@ -8,7 +8,7 @@ interface GridRowConfig {
id: GridRowId;
/**
- * The id of the group in which this row id
+ * The id of the group in which this row is
*/
groupId: string;
@@ -18,15 +18,17 @@ interface GridRowConfig {
depth: number;
}
+type GridRowGroupId = string;
+
/**
* Equivalent of the current GridRowTreeNodeConfig for groups
*/
interface GridRowGroup {
/**
- * The id of the group. It the group is created from a real row, it is the ID of the row.
+ * The id of the group. If the group is created from a real row, it is the ID of the row.
* If not, it is an auto-generated ID
*/
- id: string;
+ id: GridRowGroupId;
/**
* If the group is created from a real row (in the Tree Data), then we store its id.
@@ -42,7 +44,7 @@ interface GridRowGroup {
/**
* The id of the groups children of the current group
*/
- childrenGroups?: string[];
+ childrenGroups?: GridRowGroupId[];
/**
* The id of the rows children of the current group
@@ -76,5 +78,5 @@ type GridRowConfigLookup = Record<GridRowId, GridRowConfig>;
type GridRowGroupTree = { [groupId: string]: GridRowGroup };
interface GridRowsState {
- tree: GridRowTree;
+ tree: GridRowGroupTree;
}
Does this make sense?
Looking at the attributes of Have you thought about borrowing the concept of |
The wording is of course open to discussion
It would be a massive change but it's probably worth experimenting. Something like this?
interface GridTreeNode {
id: GridRowId;
depth: number;
children?: GridRowId[];
parent: GridRowId;
sortedChildren?: GridRowId[];
isPassingFilters?: boolean;
isCollapsed?: boolean // if true the row is part of a collapsed group and therefore not visible
}
interface GridRowNode extends GridTreeNode {
type: 'data-row',
}
interface GridGroupNode extends GridTreeNode {
type: 'group',
groupingKey: GridKeyValue
groupingField: string | null
// Should we also store the aggregated values here ?
}
// The footer for the aggregation, they are not filtered nor sorted
interface GridFooterNode extends GridTreeNode {
type: 'footer',
}
Unlike today, we would have a
I think we should have a listing of the groups (without the data rows) in one place
@cherniavskii yes your changes make sense, we will clearly have to clean this interface to make it as clear as possible
One drawback of storing the filtering as I show above is that resetting the filter has a higher cost.
It does not shock me that resetting filtering takes as much time as applying a new one. About having to loop on each row, we are a bit trapped by the virtualization, which requires computing the position of each row in the page with a reducer. So as soon as you modify the sorting / filtering, we will have to loop on the remaining rows in the correct order to know the y coordinate of those rows.
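The constraint described above can be sketched as a single pass over the visible rows in their sorted order. This is only an illustration under assumed names (`computeRowPositions`, `RowPositions`, `getRowHeight`), not the grid's actual virtualization code.

```typescript
type GridRowId = string | number;

interface RowPositions {
  // y coordinate of the top of each visible row, keyed by stringified row id
  positions: Record<string, number>;
  // cumulated height of all the visible rows
  totalHeight: number;
}

// Accumulate the y coordinate of every visible row, in display order.
// Any change to sorting or filtering changes the input order, so the whole pass must be redone.
function computeRowPositions(
  visibleSortedIds: GridRowId[],
  getRowHeight: (id: GridRowId) => number,
): RowPositions {
  return visibleSortedIds.reduce<RowPositions>(
    (acc, id) => {
      acc.positions[String(id)] = acc.totalHeight;
      acc.totalHeight += getRowHeight(id);
      return acc;
    },
    { positions: {}, totalHeight: 0 },
  );
}
```

Since each position depends on the heights of all the rows before it, this step stays linear in the number of visible rows even if sorting and filtering themselves become partial.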
The rows of the current page |
More or less, I would go further and not store the children as an array but as a pointer to the first child, as below:
interface GridTreeNode {
id: GridRowId;
depth: number;
child?: GridTreeNode; // the first row of a group, the second row goes into the nextNode of the first
nextNode?: GridTreeNode; // the sibling row
parent?: GridTreeNode;
isPassingFilters?: boolean;
isCollapsed?: boolean // if true the row is part of a collapsed group and therefore not visible
}
interface GridRowNode extends GridTreeNode {
type: 'data-row',
}
interface GridGroupNode extends GridTreeNode {
type: 'group',
groupingKey: GridKeyValue
groupingField: string | null
// Should we also store the aggregated values here ?
}
// The footer for the aggregation, they are not filtered nor sorted
interface GridFooterNode extends GridTreeNode {
type: 'footer',
}
Yeah, but is it important to reset the filter faster than to apply the filter? If that's a problem we can have a second tree with all nodes visible, then we only switch it. I just think that the idea of using a tree for everything is better than to intersect a few states to get which rows should be visible and in which order.
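As a rough sketch of the "tree for everything" idea, the visible rows could be obtained by walking the `child` / `nextNode` pointers. The `LinkedTreeNode` type below is a simplified, hypothetical version of the interface above, and the sketch assumes that a node that is filtered out or collapsed also hides its whole subtree.

```typescript
type GridRowId = string | number;

// Simplified node, keeping only the fields needed for the traversal.
interface LinkedTreeNode {
  id: GridRowId;
  child?: LinkedTreeNode; // first child of the node
  nextNode?: LinkedTreeNode; // next sibling of the node
  isPassingFilters?: boolean;
  isCollapsed?: boolean; // if true, the row is part of a collapsed group
}

// Depth-first walk producing the flat list of visible row ids, in display order.
function flattenVisibleRows(firstNode: LinkedTreeNode | undefined): GridRowId[] {
  const visibleIds: GridRowId[] = [];
  const visit = (node: LinkedTreeNode | undefined): void => {
    for (let current = node; current !== undefined; current = current.nextNode) {
      // Assumption: a hidden node hides its descendants as well.
      if (current.isPassingFilters === false || current.isCollapsed === true) {
        continue;
      }
      visibleIds.push(current.id);
      visit(current.child);
    }
  };
  visit(firstNode);
  return visibleIds;
}
```

With this layout, no intersection between separate sorting, filtering and expansion states is needed: the order comes from the pointers and the visibility from the flags on each node.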
That's an even more massive change
Some early feedback on #4927: I implemented a 1st version of the new structure, with partial tree update, but without the partial filtering / sorting / aggregation (it's hard to split #4927 into small PRs, but partial filtering / sorting / aggregation can probably be implemented one by one).
New structure
The JSDoc needs to be improved
Tree
export type GridTreeNode = GridLeafNode | GridGroupNode | GridFooterNode;
export type GridRowTreeConfig = Record<GridRowId, GridTreeNode>;
export interface GridTreeBasicNode {
/**
* The unique id of this node.
*/
id: GridRowId;
/**
* Depth of this node in the tree.
*/
depth: number;
}
export interface GridLeafNode extends GridTreeBasicNode {
type: 'leaf';
/**
* The row id of the group containing this node.
*/
parent: GridRowId;
/**
* The key used to group the children of this row.
* Only used in the tree data, may be renamed leafKey.
*/
groupingKey: GridKeyValue | null;
}
export interface GridGroupNode extends GridTreeBasicNode {
type: 'group';
/**
* If `true`, this node has been automatically generated by the grid.
* In the row grouping, all groups are auto-generated
* In the tree data, some groups can be passed in the rows
*/
isAutoGenerated: boolean;
/**
* The key used to group the children of this row.
*/
groupingKey: GridKeyValue | null;
/**
* The field used to group the children of this row.
* Is `null` if no field has been used to group the children of this row.
*/
groupingField: string | null;
/**
* The id of the children nodes.
*/
children: GridRowId[];
/**
* The id of the children nodes, grouped by grouping field and grouping key.
* This key is the main addition on the tree.
* To be able to do partial updates with good performance, we need to be able to easily convert the path of grouping criteria into a path of ids to know where to put the new rows.
*/
childrenFromPath: {
[groupingField: string]: {
[groupingKey: string]: GridRowId;
};
};
/**
* If `true`, the children of this group are visible.
* @default false
*/
childrenExpanded?: boolean;
/**
* The row id of the group containing this node (null for the root group).
*/
parent: GridRowId | null;
}
export interface GridFooterNode extends GridTreeBasicNode {
type: 'footer';
/**
* The row id of the group containing this node.
*/
parent: GridRowId;
}
We will add properties for sorting / filtering and maybe aggregation on those nodes later if we go for @m4theushw's approach (which I would be in favor of doing).
Row state
export interface GridRowsLookups {
dataRowIdToModelLookup: GridRowsLookup;
dataRowIdToIdLookup: Record<string, GridRowId>;
autoGeneratedRowIdToIdLookup: Record<string, GridRowId>;
}
export interface GridRowTreeCreationValue {
/**
* Name of the algorithm used to group the rows
* It is useful to decide which filtering / sorting algorithm to apply, to avoid applying tree-data filtering on a grouping-by-column dataset for instance.
*/
groupingName: string;
tree: GridRowTreeConfig;
treeDepth: number;
autoGeneratedRowIdToIdLookup: Record<string, GridRowId>;
}
export interface GridRowsState extends GridRowTreeCreationValue, GridRowsLookups {
/**
* Matches the value of the `loading` prop.
*/
loading?: boolean;
/**
* Amount of rows before applying the filtering.
* It also counts the expanded and collapsed children rows.
*/
totalRowCount: number;
/**
* Amount of rows before applying the filtering.
* It does not count the expanded children rows.
*/
totalTopLevelRowCount: number;
/**
* Tree returned by the `rowTreeCreation` strategy processor.
* It is used to re-apply the `hydrateRows` pipe processor without having to recreate the tree.
* Unclear if we will still need it
*/
groupingResponseBeforeRowHydration: GridRowTreeCreationValue;
}
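To illustrate why `childrenFromPath` helps for partial updates, here is a minimal sketch that converts a path of grouping criteria into the id of the target group, one lookup per depth level. `GroupNodeSketch`, `GroupingPathStep` and `resolveGroupIdFromPath` are hypothetical names, and the node type is reduced to the fields used here.

```typescript
type GridRowId = string | number;

// Reduced version of GridGroupNode, for illustration only.
interface GroupNodeSketch {
  id: GridRowId;
  childrenFromPath: {
    [groupingField: string]: {
      [groupingKey: string]: GridRowId;
    };
  };
}

interface GroupingPathStep {
  field: string;
  key: string;
}

// Follow the grouping criteria from the root group down to the target group.
// Returns null when one of the groups on the path does not exist yet.
function resolveGroupIdFromPath(
  tree: Record<string, GroupNodeSketch>,
  rootId: GridRowId,
  path: GroupingPathStep[],
): GridRowId | null {
  let currentId: GridRowId = rootId;
  for (const step of path) {
    const node = tree[String(currentId)];
    const childId = node?.childrenFromPath[step.field]?.[step.key];
    if (childId === undefined) {
      return null;
    }
    currentId = childId;
  }
  return currentId;
}
```

When a new row arrives through a partial update, its grouping path can be resolved this way without scanning the children of each group; a `null` result would mean that a new group has to be created.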
We don't need to store in |
True, I will just rename it to always use the |
This is a draft to start discussing this subject and prepare for v6
Introduction
We currently store the rows in a flat structure (in `state.rows.ids`) and then create a tree from it if need be.
Any update to the rows is first an update to the flat row list, followed by a full re-generation of the tree.
This has several implications:
When adding / removing / modifying a row, the grid always triggers a re-generation of the whole tree
When adding / removing / modifying a row, the grid always re-applies the sorting, filtering and aggregation on all the rows, even those in groups not impacted at all by the modification
When toggling a group expansion, the grid always re-applies the filtering on all the rows
In the tree, the groups are represented by their parent row. But for the top level rows, there is no notion of "top level group" since the top level rows do not have a parent. In [data grid] Implement Aggregation (not publicly released) #4208, I had to invent a fake row ID to store aggregated data of the top level rows.
In the tree, the groups are represented by their parent row. It is therefore impossible to do lazy loading with Row Grouping without having at least one row of each group on the initial fetch.
The auto-generated rows (grouping rows for Tree Data or Row Grouping and aggregation footer) are regular rows in the state. They can be updated / removed by `apiRef.current.updateRows`, which makes no sense.
Note: It is unclear to me exactly how far we can optimize to avoid re-computing sorting / filtering / aggregation on non-impacted groups. But I am sure that we can do non-negligible optimizations and that we need a better internal state structure for it.
In summary, our current row state is not suited for large datasets, especially when it comes to Row Grouping because any change in the dataset triggers computation on the whole dataset, even when it is a localized update.
Proposal
AG-Grid paragraph
In the presentation below, row means "real row". The auto-generated rows are no longer stored as regular rows in the state. They can't be updated / removed using the same API as the regular rows.
We do have a slight nomenclature issue because the word row can have two meanings.
This can create confusion, but AG-Grid also calls the data given by the user "rows" so it's probably not that problematic.
New state structure
Tree generation
Only fully re-generate the tree when `props.rows` changes or `apiRef.current.setRows` is called (new dataset).
When calling `apiRef.current.updateRows`, insert / remove / update the rows inside the current tree (without mutating, of course).
Each row tree generator is responsible for handling these changes (we currently have 3: Flat / Row Grouping / Tree Data).
Examples:
For the Row Grouping and Flat trees, the behavior is quite simple. For the Tree Data it becomes a lot more complicated since the groups are derived from real rows.
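For the simple cases, a non-mutating partial update could look like the sketch below. It assumes a reduced tree where each group only stores the ids of its children; `GroupSketch`, `insertRowInGroup` and `removeRowFromGroup` are hypothetical helpers, not existing grid functions.

```typescript
type GridRowId = string | number;

// Reduced group node: only the children ids matter for this illustration.
interface GroupSketch {
  id: string;
  children: GridRowId[];
}

type TreeSketch = Record<string, GroupSketch>;

// Insert a row in a group without mutating the previous tree.
// Only the modified group is copied; every other group keeps its reference.
function insertRowInGroup(tree: TreeSketch, groupId: string, rowId: GridRowId): TreeSketch {
  const group = tree[groupId] ?? { id: groupId, children: [] };
  return {
    ...tree,
    [groupId]: { ...group, children: [...group.children, rowId] },
  };
}

// Remove a row from a group without mutating the previous tree.
function removeRowFromGroup(tree: TreeSketch, groupId: string, rowId: GridRowId): TreeSketch {
  const group = tree[groupId];
  if (group === undefined) {
    return tree;
  }
  return {
    ...tree,
    [groupId]: { ...group, children: group.children.filter((id) => id !== rowId) },
  };
}
```

Comparing group references between the previous and the next tree is then enough to know which groups were impacted by the update.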
Sorting / Filtering
Sorting and filtering now listen to the `rowGroupChange` event, which takes as argument the list of the group ids that were modified during the last row update.
We may need to pass more information about this update (for instance "group X: row removed") to fine-tune the update. For instance, do not re-apply sorting on a group where we just removed rows.
This does not need to be done in the 1st version; the format must just be compatible with future improvements.
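A possible handler for such an event is sketched below. The payload shape (`GroupChange` with a `kind` field) is an assumption used to show the fine-tuning mentioned above; the RFC only specifies that the event carries the list of modified group ids.

```typescript
type GridRowId = string | number;

// Hypothetical payload: which group changed, and how.
interface GroupChange {
  groupId: string;
  kind: 'rowsAdded' | 'rowsRemoved' | 'rowsUpdated';
}

// Update the sorted children of the modified groups only.
// After a pure removal, the previous order is still valid, so the comparator is not run.
function applyGroupChangesToSorting(
  sortedChildren: Record<string, GridRowId[]>,
  currentChildren: Record<string, GridRowId[]>,
  changes: GroupChange[],
  compare: (a: GridRowId, b: GridRowId) => number,
): Record<string, GridRowId[]> {
  const nextSorted = { ...sortedChildren };
  for (const { groupId, kind } of changes) {
    const children = currentChildren[groupId] ?? [];
    if (kind === 'rowsRemoved') {
      const remaining = new Set(children);
      nextSorted[groupId] = (sortedChildren[groupId] ?? []).filter((id) => remaining.has(id));
    } else {
      nextSorted[groupId] = [...children].sort(compare);
    }
  }
  return nextSorted;
}
```

A first version could ignore `kind` and always re-sort the listed groups; keeping the field in the payload would leave room for this optimization later.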
Related issues