Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregations #218

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Aggregations #218

wants to merge 1 commit into from

Conversation

thobe
Copy link
Contributor

@thobe thobe commented Apr 13, 2017

Aims to solve #188.

CIP2017-04-13

@thobe
Copy link
Contributor Author

thobe commented Apr 13, 2017

This is still an early draft.

ToDo:

  • Handle ordering of aggregated data.
  • Handle parameterized aggregation.

@Mats-SX
Copy link
Member

Mats-SX commented Apr 18, 2017

Related CIR: #219

.Aggregation using `collect`
----
MATCH (person:Person)-[:FRIEND]-(friend)
RETURN person.email, collect OF friend {.name,.email} AS friends
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the OF syntax, but collect is the operation I would like to change if we proceed with OF. collect is a verb, but a noun fits better here, which makes me think we should rename this to collection, or even list:

MATCH (person:Person {name: $name})
RETURN list OF person.age

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having just read through, I was thinking exactly the same thing, both "nounifying" the term and switching to list. It strikes me that this also highlights the return type of the operation.

[source, cypher]
.Aggregation using `percentileCont`
----
BREAKS DOWN IN THIS SYNTAX
Copy link
Member

@Mats-SX Mats-SX Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could consider allowing arguments to the aggregator on the left-hand side of the OF?

MATCH (n)
RETURN percentileCont($percentile) OF n.property

Copy link

@petraselmer petraselmer Apr 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice ^^

How about - so we have a few options to give food for thought - a parentheses-free version:

MATCH (n)
RETURN percentileCont OF n.property GIVEN $percentile

I also played around with using WITH instead of GIVEN, but that is way too overloaded.

If we ever have aggregations with more than one argument of this type, e.g. myAgg(expression, arg1, arg2) in today's syntax, these are the options we'd have (continuing with the same examples as above):

  1. ... RETURN myAgg($arg1, @arg2) OF n.property
  2. ... RETURN myAgg OF n.property GIVEN $arg1, @arg2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I thought about similar things on my way home after having pushed this. What I thought of then was that, at least in this case, the parameter describes which particular aggregate value to get, so perhaps a subscript operator would be appropriate RETURN percentileCont[0.4] OF n.property

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the general - or multi-arg - case would be RETURN myAgg[$arg1, $arg2]? Or, is this not really 'general-izable'?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you bracket the expression on the RHS of OF? For example:

RETURN map OF (key, value)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@technige Interesting idea. Are you considering the case when both key and value are parameters that change across records, or when only one of them are?

I kind of like using OF as a separator between arguments that are read once versus arguments that are the actual substance of the aggregation.

[source, cypher]
.Aggregation using `count`
----
MATCH (nodes) RETURN count OF nodes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For count(*), are you considering count OF *?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had not thought of that, but that seems sensible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been wondering if we shouldn't just mandate always writing a kind-of list comprehension syntax for aggregation, to make more visible what's going on

collect [ n.prop | * ]
max [ n.weight | * ]

this would allow re-using aggregation function calling syntax with regular lists!

count [ n | n IN [ 1, 2, 3, 4 ] ]

or even shorter

count [ n | 1, 2, 3, 4 ]

I agree, we should just use aggrF(args) ... to pass in args to the aggregation function.

But I think it's important to have syntax that allows us to call aggregation functions for aggregations as well as over ordinary lists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That syntax seems pretty far away from what this CIP suggests. Could we adapt it to a better fit?

Just allow any expression after the OF, where only property expressions and expressions that evaluate to a list are valid:

max OF n.weight 
max OF [1, 2, 3, 4] // semantics depend on expression value

Although this is not ideal when considering list properties...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some similar ideas are discussed in the Python generator expression PEP -> https://www.python.org/dev/peps/pep-0289/

@Mats-SX
Copy link
Member

Mats-SX commented Apr 18, 2017

Also related to #220

@petraselmer
Copy link

I think more examples showing the interplay between aggregations and other functions would be good, along with showing nested operations - pretty much along the lines of the examples shown in #219

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants