[RFC]collect the only used properties in match #2657

Closed
Shylock-Hg opened this issue May 26, 2021 · 2 comments
Labels
need to discuss Solution: issue or PR without a clear conclusion on whether to handle it type/enhancement Type: make the code neat or more efficient

Comments

@Shylock-Hg
Contributor

Summary

Collect only the properties that are actually used when executing a MATCH statement.

Motivation

Reduce memory consumption and the amount of data that is read and transformed.

Usage explanation

The change should be completely transparent to users of the MATCH statement.

Design explanation

Currently we can choose which tag/edge properties to fetch when querying them from storage (GetProp, GetNeighbor). The main issue with MATCH is that vertex properties are referenced without a tag.

To address this, I propose selecting the properties when generating the plan.

In detail, the validator expands the vertex properties into concrete tag properties using the schema of the current space, and then performs the query as before. For example, if we need the vertex properties (p1, p2), they are expanded to tag2(p1) and tag4(p1, p2), and the data query then runs as before with those tag properties specified.

In this case, we only add the tag/property lookup and expansion. The rest of the process is the same as before.
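
The expansion step described above can be sketched roughly as follows. This is only an illustration in Python, not the actual validator code: the `SCHEMA` mapping, the extra tags `tag1` and `tag3`, and the function name `expand_vertex_props` are assumptions made up for the example.

```python
from typing import Dict, Iterable, List

# Hypothetical schema of the current space: tag name -> property names.
# tag2 and tag4 follow the example in the text; tag1 and tag3 are invented.
SCHEMA: Dict[str, List[str]] = {
    "tag1": ["p3"],
    "tag2": ["p1"],
    "tag3": ["p4"],
    "tag4": ["p1", "p2"],
}


def expand_vertex_props(
    schema: Dict[str, List[str]], wanted: Iterable[str]
) -> Dict[str, List[str]]:
    """Expand tag-less vertex properties into per-tag property lists.

    Every tag that defines at least one wanted property is kept,
    together with only those wanted properties it defines.
    """
    wanted_set = set(wanted)
    expanded: Dict[str, List[str]] = {}
    for tag, props in schema.items():
        hits = [p for p in props if p in wanted_set]
        if hits:
            expanded[tag] = hits
    return expanded


print(expand_vertex_props(SCHEMA, ["p1", "p2"]))
# {'tag2': ['p1'], 'tag4': ['p1', 'p2']}
```

The result is what would then be passed to the storage query in place of "all properties of all tags".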

Rationale and alternatives

An alternative is to select only one matching property among all tags in the current graph space when generating the plan. But this is incorrect when different tags have properties with the same name. In that case, if we choose the property from one tag and scan only that tag, we ignore the property in the other tag, which may be the one actually attached to the vertex while the tag we chose is not.
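
To make the failure mode concrete, here is a small Python sketch. The schema, the vertex data, and the helper names `pick_single_tag` and `read_prop` are all hypothetical; the sketch only models the reasoning above, not real storage behavior.

```python
from typing import Dict, List, Optional

# Hypothetical schema: two tags both define a property named p1.
SCHEMA: Dict[str, List[str]] = {"tag2": ["p1"], "tag4": ["p1", "p2"]}

# Hypothetical vertex that only has tag4 attached.
VERTEX: Dict[str, Dict[str, int]] = {"tag4": {"p1": 10, "p2": 20}}


def pick_single_tag(schema: Dict[str, List[str]], prop: str) -> Optional[str]:
    """The rejected alternative: take the first tag that defines `prop`."""
    for tag, props in schema.items():
        if prop in props:
            return tag
    return None


def read_prop(
    vertex: Dict[str, Dict[str, int]], tags: List[str], prop: str
) -> Optional[int]:
    """Read `prop` from the vertex, looking only at the given tags."""
    for tag in tags:
        if tag in vertex and prop in vertex[tag]:
            return vertex[tag][prop]
    return None


single = pick_single_tag(SCHEMA, "p1")                        # 'tag2'
all_tags = [t for t, ps in SCHEMA.items() if "p1" in ps]      # ['tag2', 'tag4']

print(read_prop(VERTEX, [single], "p1"))  # None: tag2 is not on this vertex
print(read_prop(VERTEX, all_tags, "p1"))  # 10: found through tag4
```

Scanning only the single chosen tag misses the value, while expanding to every tag that defines `p1` finds it.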

Drawbacks

Prior art

Unresolved questions

Results can be wrong if one vertex carries multiple properties with the same name. But this is a conflict between the openCypher and Nebula data models, and this proposal keeps the original behavior in this case.

For this, maybe we can only document it and ask users to be aware of it.

Future possibilities

Shylock-Hg self-assigned this May 26, 2021
@czpmango
Contributor

Why not do it in the optimization phase? It sounds like an optRule called ColumnPruning?

@Shylock-Hg
Contributor Author

> Why not do it in the optimization phase? It sounds like an optRule called ColumnPruning?

The set of properties we need is collected from the AST.
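
As a rough illustration of collecting the needed properties from an AST, consider the Python sketch below. The node classes `PropRef`, `Literal`, and `BinaryOp` and the function `collect_props` are hypothetical stand-ins, not Nebula's real expression types.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass
class PropRef:
    """Hypothetical node for a property access such as v.p1."""
    alias: str
    prop: str


@dataclass
class Literal:
    """Hypothetical node for a constant value."""
    value: object


@dataclass
class BinaryOp:
    """Hypothetical node for a binary expression such as a > b."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[PropRef, Literal, BinaryOp]


def collect_props(expr: Expr, alias: str) -> Set[str]:
    """Walk the expression tree and gather property names used on `alias`."""
    if isinstance(expr, PropRef):
        return {expr.prop} if expr.alias == alias else set()
    if isinstance(expr, BinaryOp):
        return collect_props(expr.left, alias) | collect_props(expr.right, alias)
    return set()  # literals reference no properties


# Roughly: MATCH (v) WHERE v.p1 > 1 AND v.p2 == 2 ...
where = BinaryOp(
    "AND",
    BinaryOp(">", PropRef("v", "p1"), Literal(1)),
    BinaryOp("==", PropRef("v", "p2"), Literal(2)),
)

print(sorted(collect_props(where, "v")))  # ['p1', 'p2']
```

The collected names are the tag-less vertex properties that would then be expanded against the schema when the plan is generated.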

CPWstatic transferred this issue from vesoft-inc/nebula-graph Aug 27, 2021
CPWstatic added the type/enhancement and need to discuss labels Aug 27, 2021
yixinglu pushed a commit to yixinglu/nebula that referenced this issue Sep 14, 2023
Solidified tomli version to solve centos7 compatibility issues

Co-authored-by: George <[email protected]>