IllegalArgumentException in 03_Recommendations_Part2 #2

Open
laura-arditti opened this issue Jan 21, 2022 · 0 comments


The cell titled

Next, use full text search and Personalized PageRank to find interesting articles for different authors:

results in the following error:

ClientError: [Procedure.ProcedureCallFailed] Failed to invoke procedure `gds.pageRank.stream`: Caused by: java.lang.IllegalArgumentException: Source nodes do not exist in the in-memory graph: ['105328', '118756', ... ]

I believe this happens because the query below passes source nodes to Personalized PageRank that are not part of the anonymous projection's node set: sourceNodes is built from the author's articles and the articles they cite, while nodeQuery projects only the articles matched by the full-text search.

query = """
MATCH (a:Author {name: $author})<-[:AUTHOR]-(article)-[:CITED]->(other)
WITH a, collect(article) + collect(other) AS sourceNodes
CALL gds.pageRank.stream({
  nodeQuery: 'CALL db.index.fulltext.queryNodes("articles", $searchTerm)
   YIELD node, score
   RETURN id(node) as id',
  relationshipQuery: 'MATCH (a1:Article)-[:CITED]->(a2:Article) 
   RETURN id(a1) as source,id(a2) as target', 
  sourceNodes: sourceNodes,
  validateRelationships:false,
  parameters: {searchTerm: $searchTerm}})
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
WHERE not(exists((a)<-[:AUTHOR]-(n))) AND score > 0
RETURN n.title as article, score, [(n)-[:AUTHOR]->(author) | author.name][..5] AS authors
order by score desc limit 10
"""

I was able to obtain the same results as pictured in the cell's original output by slightly altering the query as follows:

query = """
MATCH (a:Author {name: $author})<-[:AUTHOR]-(article)-[:CITED]->(other)
WITH a, collect(article) + collect(other) AS sourceNodes
CALL db.index.fulltext.queryNodes("articles", $searchTerm)
   YIELD node, score
WITH a, sourceNodes, collect(id(node)) AS ids
CALL gds.pageRank.stream({
  nodeQuery: 'UNWIND $ids AS id 
  RETURN id',
  relationshipQuery: 'MATCH (a1:Article)-[:CITED]->(a2:Article) 
   RETURN id(a1) as source,id(a2) as target', 
  sourceNodes: [article IN sourceNodes WHERE id(article) IN ids | article],
  validateRelationships:false,
  parameters: {ids: ids, searchTerm: $searchTerm}
 })
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
WHERE not(exists((a)<-[:AUTHOR]-(n))) AND score > 0
RETURN n.title as article, score, [(n)-[:AUTHOR]->(author) | author.name][..5] AS authors
order by score desc limit 10
"""

The query otherwise behaves the same, but only those sourceNodes that are present in the anonymous projection are passed to PageRank as sources.
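
For anyone who wants to check which source nodes are affected before calling the procedure, the same filtering idea can be sketched in plain Python. This is only an illustration: the helper name split_source_nodes and the sample ids are hypothetical and not part of the notebook. It mirrors the list comprehension `[article IN sourceNodes WHERE id(article) IN ids | article]` used above, applied to node ids.

```python
def split_source_nodes(source_ids, projected_ids):
    """Split candidate source ids into those inside the projection and those missing from it."""
    projected = set(projected_ids)  # a set makes each membership check cheap
    kept = [i for i in source_ids if i in projected]
    missing = [i for i in source_ids if i not in projected]
    return kept, missing


# Hypothetical ids, for illustration only
kept, missing = split_source_nodes([1, 2, 3, 4], [2, 4, 6])
print("usable as sourceNodes:", kept)
print("absent from the projection:", missing)
```

Any ids in the second list are the ones that would be reported in the "Source nodes do not exist in the in-memory graph" message.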

I'm using neo4j Desktop at the following versions:

Product Version
neo4j 4.3.1
APOC 4.3.0.4
GDS 1.6.1

Thanks for the great course and I hope you find this useful!
