-
Notifications
You must be signed in to change notification settings - Fork 147
ORDER BY Performance #239
Comments
If you run this query manually in Neo4j browser. Which of the 7 fields causes the main slowdown? Can you also share your graphql query? Would it be possible to combine the queries into one field that has multiple results (returns a map). Do you have indexes on deleted/disabled? I would also recommend to move deleted/disabled to labels as those are much faster to fetch. then you could use queries like:
If you're on 3.5 having an index on created at can help too for the ordering if you add a filter that determines the type. |
@johnymontana to enable index-backed-order-by in 3.5 we could add a sort hint, aka. |
Hey folks, I investigated the issue on our side and was able to track down the performance issue: Generated query: MATCH (`post`:`Post` {deleted:false, disabled:false}) RETURN `post` { .deleted , .disabled , .createdAt , .id , .id , .title , .contentExcerpt , .createdAt , .disabled , .deleted , .slug , .image ,author: head([(`post`)<-[:`WROTE`]-(`post_author`:`User`) | post_author { .deleted , .disabled , .createdAt , .id , .id , .avatar , .slug , .name , .disabled , .deleted ,contributionsCount: apoc.cypher.runFirstColumn("MATCH (this)-[:WROTE]->(r:Post)\nWHERE (NOT exists(r.deleted) OR r.deleted = false)\nAND (NOT exists(r.disabled) OR r.disabled = false)\nRETURN COUNT(r)", {this: post_author, cypherParams: {currentUserId: null}}, false),shoutedCount: apoc.cypher.runFirstColumn("MATCH (this)-[:SHOUTED]->(r:Post) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(DISTINCT r)", {this: post_author, cypherParams: {currentUserId: null}}, false),commentsCount: apoc.cypher.runFirstColumn("MATCH (this)-[:WROTE]->(r:Comment) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(r)", {this: post_author, cypherParams: {currentUserId: null}}, false),followedByCount: apoc.cypher.runFirstColumn("MATCH (this)<-[:FOLLOWS]-(r:User) RETURN COUNT(DISTINCT r)", {this: post_author, cypherParams: {currentUserId: null}}, false),followedByCurrentUser: apoc.cypher.runFirstColumn("MATCH (this)<-[:FOLLOWS]-(u:User {id: {currentUserId: null}.currentUserId})\nRETURN COUNT(u) >= 1", {this: post_author, cypherParams: {currentUserId: null}}, false),location: head([ post_author_location IN apoc.cypher.runFirstColumn("MATCH (this)-[:IS_IN]->(l:Location) RETURN l", {this: post_author, cypherParams: {currentUserId: null}}, true) | post_author_location { .id }]) ,badges: [(`post_author`)<-[:`REWARDED`]-(`post_author_badges`:`Badge`) | post_author_badges { .id , .id , .key , .icon }] }]) ,commentsCount: apoc.cypher.runFirstColumn("MATCH (this)<-[:COMMENTS]-(r:Comment) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(r)", {this: post, cypherParams: {currentUserId: null}}, false),categories: [(`post`)-[:`CATEGORIZED`]->(`post_categories`:`Category`) | post_categories { .id , .id , .name , .icon }] ,shoutedCount: apoc.cypher.runFirstColumn("MATCH (this)<-[:SHOUTED]-(r:User) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(DISTINCT r)", {this: post, cypherParams: {currentUserId: null}}, false)} AS `post` ORDER BY `post`.createdAt DESC SKIP 0 LIMIT 10 Query without any performance issue: MATCH (`post`:`Post` {deleted:false, disabled:false}) WITH `post` ORDER BY `post`.createdAt DESC RETURN `post` { .deleted , .disabled , .createdAt , .id , .id , .title , .contentExcerpt , .createdAt , .disabled , .deleted , .slug , .image ,author: head([(`post`)<-[:`WROTE`]-(`post_author`:`User`) | post_author { .deleted , .disabled , .createdAt , .id , .id , .avatar , .slug , .name , .disabled , .deleted ,contributionsCount: apoc.cypher.runFirstColumn("MATCH (this)-[:WROTE]->(r:Post)\nWHERE (NOT exists(r.deleted) OR r.deleted = false)\nAND (NOT exists(r.disabled) OR r.disabled = false)\nRETURN COUNT(r)", {this: post_author, cypherParams: {currentUserId: null}}, false),shoutedCount: apoc.cypher.runFirstColumn("MATCH (this)-[:SHOUTED]->(r:Post) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(DISTINCT r)", {this: post_author, cypherParams: {currentUserId: null}}, false),commentsCount: apoc.cypher.runFirstColumn("MATCH (this)-[:WROTE]->(r:Comment) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(r)", {this: post_author, cypherParams: {currentUserId: null}}, false),followedByCount: apoc.cypher.runFirstColumn("MATCH (this)<-[:FOLLOWS]-(r:User) RETURN COUNT(DISTINCT r)", {this: post_author, cypherParams: {currentUserId: null}}, false),followedByCurrentUser: apoc.cypher.runFirstColumn("MATCH (this)<-[:FOLLOWS]-(u:User {id: {currentUserId: null}.currentUserId})\nRETURN COUNT(u) >= 1", {this: post_author, cypherParams: {currentUserId: null}}, false),location: head([ post_author_location IN apoc.cypher.runFirstColumn("MATCH (this)-[:IS_IN]->(l:Location) RETURN l", {this: post_author, cypherParams: {currentUserId: null}}, true) | post_author_location { .id }]) ,badges: [(`post_author`)<-[:`REWARDED`]-(`post_author_badges`:`Badge`) | post_author_badges { .id , .id , .key , .icon }] }]) ,commentsCount: apoc.cypher.runFirstColumn("MATCH (this)<-[:COMMENTS]-(r:Comment) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(r)", {this: post, cypherParams: {currentUserId: null}}, false),categories: [(`post`)-[:`CATEGORIZED`]->(`post_categories`:`Category`) | post_categories { .id , .id , .name , .icon }] ,shoutedCount: apoc.cypher.runFirstColumn("MATCH (this)<-[:SHOUTED]-(r:User) WHERE NOT r.deleted = true AND NOT r.disabled = true RETURN COUNT(DISTINCT r)", {this: post, cypherParams: {currentUserId: null}}, false)} AS `post` SKIP 0 LIMIT 10 Here's the diff: Query plan before (terrible performance):Query plan after:Suggested FixI had a look into the code and the naive fix would be a re-ordering of this line of code. Does it have any unwanted side effects to move the I would be happy to provide a PR if it's that easy. |
I think it makes sense in general as the sort order is kept. What version are you running on? Please provide a PR that would be great! |
I think it makes sense in general as the sort order is kept. What version are you running on? Please provide a PR that would be great! We could even move the skip limit as the filtering is happening before in We should do that in the other transpilers too. |
Thanks @roschaefer and @appinteractive for digging into this! The only possible issue I see there is when we are ordering on a In general ordering on the computed fields is not a good idea anyway for performance reasons, but would be good to make sure we don't break that use case since it is supported now. |
Any idea how to make both possible? Is there a way to decide based on the field type (computed/not computed) ? |
I think we could accomplish that by adding some logic in Then in translate.js it would be something like this:
|
Also, this brings up the need of having some basic performance tests as part of the test suite so we can at least track performance regressions/improvements. I'll create a separate issue for that. |
@mattwr18 @aonomike You must never `ORDER BY` a property with a `@cypher` directive. Reason: The order by performance will be terribly poor. See my issue: neo4j-graphql/neo4j-graphql-js#239 And my PR: neo4j-graphql/neo4j-graphql-js#247
@mattwr18 @aonomike You must never `ORDER BY` a property with a `@cypher` directive. Reason: The order by performance will be terribly poor. See my issue: neo4j-graphql/neo4j-graphql-js#239 And my PR: neo4j-graphql/neo4j-graphql-js#247
Hey @johnymontana we have real issues with performace when using order by in combination with dynamic fields as they are applied to ALL results BEFORE the order by! That leads to an query execution time of 30s in our case.
Without the order by it works fine! @johnymontana Is there any way we can work around that issue without throwing two queries at the api? (one to get the id's in sorted and paginated way and a second to get the data by id's)
Here is the generated query:
The text was updated successfully, but these errors were encountered: