[RFC] Collect only the used properties in MATCH #2657
Labels
need to discuss
Solution: issue or PR without a clear conclusion on whether to handle it
type/enhancement
Type: make the code neat or more efficient
Summary
Collect only the properties that are actually used when executing a MATCH statement.
Motivation
Reduce memory consumption and the amount of data read and transformed.
Usage explanation
It should be completely transparent to users of the MATCH statement.
Design explanation
Currently we can choose which tag/edge properties to fetch when querying them from storage (GetProp, GetNeighbor). The main issue in MATCH is that vertex properties are referenced without a tag.
For this, I propose selecting the properties when generating the plan.
In detail, the validator expands the vertex properties into real tag properties using the schema of the current space, and then runs the query as before. For example, if we need the vertex properties (p1, p2), they are expanded to tag2(p1) and tag4(p1, p2), and the data query then runs as before with the specified tag properties.
In this case, we only add the tag/property lookup and expansion. The rest of the process is the same as before.
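A minimal sketch of the expansion step, written in Python for illustration only (the real validator is not shown in this issue). The schema layout, the tag names tag1 and tag3, the properties p3 to p6, and the function name `expand_vertex_props` are all hypothetical; only tag2(p1) and tag4(p1, p2) come from the example above.

```python
from typing import Dict, Iterable, List

# Hypothetical schema of the current graph space: tag name -> property names.
Schema = Dict[str, List[str]]


def expand_vertex_props(schema: Schema, used_props: Iterable[str]) -> Dict[str, List[str]]:
    """Map tag-less vertex properties to the tags that define them.

    Tags that define none of the used properties are dropped, so storage
    would only be asked for the tag properties that MATCH actually needs.
    """
    wanted = set(used_props)
    expanded: Dict[str, List[str]] = {}
    for tag, props in schema.items():
        matched = [p for p in props if p in wanted]
        if matched:
            expanded[tag] = matched
    return expanded


# Assumed example schema; only tag2 and tag4 define p1 or p2.
schema = {
    "tag1": ["p3"],
    "tag2": ["p1", "p5"],
    "tag3": ["p4"],
    "tag4": ["p1", "p2", "p6"],
}

print(expand_vertex_props(schema, ["p1", "p2"]))
# {'tag2': ['p1'], 'tag4': ['p1', 'p2']}
```

The result corresponds to tag2(p1) and tag4(p1, p2); tag1 and tag3 are not requested at all.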
Rationale and alternatives
Alternatively, when generating the plan we could select, for each property, just one tag in the current graph space that defines it. But this gives wrong results when different tags have properties with the same name. If we pick the property from one tag and scan only that tag, we ignore the same-named property in another tag, and that other tag may be the one actually attached to the vertex while the chosen one is not.
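To illustrate why the single-tag alternative can miss data, here is a small Python sketch. The vertex layout and both helper functions are hypothetical, and returning the first match in `read_all_tags` is an assumption of the sketch, not the behavior defined by this proposal.

```python
from typing import Dict, List, Optional

# Hypothetical vertex: tag name -> {property name -> value}.
Vertex = Dict[str, Dict[str, object]]


def read_single_tag(vertex: Vertex, tag: str, prop: str) -> Optional[object]:
    """Rejected alternative: read `prop` from one pre-selected tag only."""
    return vertex.get(tag, {}).get(prop)


def read_all_tags(vertex: Vertex, tags: List[str], prop: str) -> Optional[object]:
    """Proposed direction: check every tag that defines `prop`.

    Returning the first hit is an assumption made for this sketch.
    """
    for tag in tags:
        if tag in vertex and prop in vertex[tag]:
            return vertex[tag][prop]
    return None


# Both tag2 and tag4 define p1 in the schema, but this vertex only has tag4.
v: Vertex = {"tag4": {"p1": 42, "p2": "x"}}

print(read_single_tag(v, "tag2", "p1"))          # None
print(read_all_tags(v, ["tag2", "tag4"], "p1"))  # 42
```

If the planner had picked tag2 for p1, the value stored under tag4 would be silently lost for this vertex.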
Drawbacks
Prior art
Unresolved questions
The result may be wrong if one vertex has multiple properties with the same name (in different tags). However, this is a conflict between the openCypher and Nebula data models, and this proposal keeps the original behavior in this case.
For this, we could perhaps just document it and ask users to be aware of it.
Future possibilities