[RFC]collect the only used properties in match #2657

Shylock-Hg · 2021-05-26T08:56:14Z

Summary

Collect only the properties that are actually used when executing a MATCH statement.

Motivation

Reduce memory consumption and the amount of data read and transformed.

Usage explanation

The change should be completely transparent to users of the MATCH statement.

Design explanation

Currently we can choose which tag/edge properties to fetch when querying them from storage (GetProp, GetNeighbor). The main issue in MATCH is that a vertex property reference carries no tag.

To address this, I propose selecting the properties when generating the plan.

In detail, the validator expands the vertex properties into concrete tag properties using the schema of the current space, and then runs the query as before. For example, if we need the vertex properties (p1, p2), they are expanded to tag2(p1) and tag4(p1, p2), and the data query then proceeds as before with the specified tag properties.

In this case, we only add the tag/property lookup and expansion. The rest of the process is the same as before.
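
The expansion step described above could look roughly like the following Python sketch. It is only an illustration, not the validator's actual code: the `SCHEMA` dictionary, the function name `expand_vertex_props`, and the extra tags `tag1` and `tag3` are all hypothetical, chosen so that the result reproduces the tag2(p1) / tag4(p1, p2) example.

```python
from typing import Dict, List, Set

# Hypothetical schema of the current space: tag name -> property names.
# tag2 and tag4 follow the example in the text; tag1 and tag3 are made up.
SCHEMA: Dict[str, List[str]] = {
    "tag1": ["p0"],
    "tag2": ["p1", "p3"],
    "tag3": ["p4"],
    "tag4": ["p1", "p2"],
}


def expand_vertex_props(
    schema: Dict[str, List[str]], used: Set[str]
) -> Dict[str, List[str]]:
    """Expand tag-less vertex property names into per-tag property lists.

    Every tag that defines at least one of the used properties is kept,
    together with only those properties; all other tags are dropped.
    """
    expanded: Dict[str, List[str]] = {}
    for tag, props in schema.items():
        hit = [p for p in props if p in used]
        if hit:
            expanded[tag] = hit
    return expanded


if __name__ == "__main__":
    # Vertex properties (p1, p2) collected from the query.
    print(expand_vertex_props(SCHEMA, {"p1", "p2"}))
```

With this made-up schema, only tag2 and tag4 are requested from storage, and tag2 is requested with p1 alone, which is where the saving in data read would come from.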

Rationale and alternatives

Alternatively, we could pick a single matching property among all tags in the current graph space when generating the plan. But this gives wrong results when different tags have properties with the same name. In that case, if we choose the property of one tag and scan only that tag, we ignore the same-named property of the other tag, and it may be that the other tag is the one actually attached to the vertex while the tag we chose is not.
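
A small Python sketch of this failure mode, under stated assumptions: the schema, the vertex, and the "first matching tag wins" rule in `pick_single_tag` are all hypothetical and only stand in for whatever single-tag selection might be used.

```python
from typing import Any, Dict, List, Optional

# Hypothetical schema: both tags define a property named p1.
SCHEMA: Dict[str, List[str]] = {"tag2": ["p1"], "tag4": ["p1", "p2"]}

# Hypothetical vertex: only tag4 is attached to it.
VERTEX: Dict[str, Dict[str, Any]] = {"tag4": {"p1": 10, "p2": 20}}


def pick_single_tag(schema: Dict[str, List[str]], prop: str) -> Optional[str]:
    """Rejected alternative: take the first tag that defines the property."""
    for tag, props in schema.items():
        if prop in props:
            return tag
    return None


def read_prop(vertex: Dict[str, Dict[str, Any]], tags: List[str], prop: str) -> Any:
    """Read the property from the first listed tag that the vertex carries."""
    for tag in tags:
        values = vertex.get(tag)
        if values is not None and prop in values:
            return values[prop]
    return None


chosen = pick_single_tag(SCHEMA, "p1")
# Scanning only the chosen tag misses the value stored under tag4.
single = read_prop(VERTEX, [chosen], "p1")
# Scanning every tag that defines p1 finds it.
all_tags = [tag for tag, props in SCHEMA.items() if "p1" in props]
full = read_prop(VERTEX, all_tags, "p1")

if __name__ == "__main__":
    print(chosen, single, full)
```

Here the single-tag choice lands on tag2, which the vertex does not carry, so p1 comes back empty even though the vertex has a p1 value under tag4.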

Drawbacks

Prior art

Unresolved questions

The result can be wrong if one vertex has multiple properties with the same name. But that is a conflict between the openCypher and Nebula data models, and this proposal keeps the original behavior in this case.

For this, perhaps we can only document it and ask users to be aware of it.

Future possibilities

czpmango · 2021-05-27T02:39:28Z

Why not do it in the optimization phase? It sounds like an optRule called ColumnPruning?

Shylock-Hg · 2021-05-27T03:14:15Z

> Why not do it in the optimization phase? It sounds like an optRule called ColumnPruning?

The properties we need are collected from the AST.

Shylock-Hg self-assigned this May 26, 2021

CPWstatic transferred this issue from vesoft-inc/nebula-graph Aug 27, 2021

CPWstatic added the type/enhancement (Type: make the code neat or more efficient) and need to discuss (Solution: issue or PR without a clear conclusion on whether to handle it) labels Aug 27, 2021

Shylock-Hg closed this as completed Feb 18, 2022

jamieliu1023 mentioned this issue Feb 19, 2022

Weekly Report 2022-02-18 vesoft-inc/nebula-community#96

Closed

yixinglu pushed a commit to yixinglu/nebula that referenced this issue Sep 14, 2023

Update requirements.txt (vesoft-inc#2657)

fb784c9

Solidified tomli version to solve centos7 compatibility issues Co-authored-by: George <[email protected]>
