Non unique vertices with path-finding cause internal error #139

Closed
bburky opened this issue Aug 30, 2024 · 2 comments · Fixed by #140
Labels
bug Something isn't working

bburky commented Aug 30, 2024

Non-unique vertex values (in the column referenced by DESTINATION KEY (...) REFERENCES) in a graph cause ANY SHORTEST v-[e]-> +(v) queries to fail with:

INTERNAL Error: Attempted to access index 1 within vector of size 1
This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors

Tested on macOS with duckdb v1.0.0 and the current community version of duckpgq (6c06589). Also reproduced with a build from duckpgq-extension source at commit 6d483c5.

I didn't see any other currently open "INTERNAL Error" issues, so I opened this one. It may be a dupe of the "Primary keys are not unique" issue in #26?

I'm guessing I did something wrong. (Is the REFERENCES target required to be unique? Must edges point to a single specific vertex? I was able to resolve this by fixing my data and using unique ids for edges.) I didn't expect this to be a fatal internal error, though.

This project looks really promising, I was just hoping to try it out and play with SQL/PGQ a bit. The integration with duckdb is very nice.

Minimal reproduction:

CREATE TABLE v (x VARCHAR);
INSERT INTO v VALUES ('a');
INSERT INTO v VALUES ('b');
INSERT INTO v VALUES ('b');

CREATE TABLE e (x1 VARCHAR, x2 VARCHAR);
INSERT INTO e VALUES ('a', 'b');

LOAD duckpgq;

CREATE PROPERTY GRAPH g
VERTEX TABLES (
    v
)
EDGE TABLES (
    e
        SOURCE KEY (x1) REFERENCES v (x)
        DESTINATION KEY (x2) REFERENCES v (x)
);

-- v-[e]->(v) has no error:
-- Output has duplicate `x` records with the value `b` returned as expected. They can be distinguished by rowid in vertices()
FROM GRAPH_TABLE(g
  MATCH p =(v1:v)-[e:e]->(v2:v)
  COLUMNS (vertices(p), v2.x)
);

-- ANY SHORTEST v-[e]->(v) has no error:
-- Again, duplicate `x` records are returned as expected
FROM GRAPH_TABLE(g
  MATCH p = ANY SHORTEST (v1:v)-[e:e]->(v2:v)
  COLUMNS (path_length(p), vertices(p), v2.x)
);

-- ANY SHORTEST v-[e]-> +(v) fails with "INTERNAL Error: Attempted to access index 1 within vector of size 1"
FROM GRAPH_TABLE(g
  MATCH p = ANY SHORTEST (v1:v)-[e:e]-> +(v2:v)
  COLUMNS (path_length(p), vertices(p), v2.x)
);

-- ANY SHORTEST v-[e]->{1,2}(v) also fails with "INTERNAL Error: Attempted to access index 1 within vector of size 1"
FROM GRAPH_TABLE(g
  MATCH p = ANY SHORTEST (v1:v)-[e:e]->{1,2}(v2:v)
  COLUMNS (path_length(p), vertices(p), v2.x)
);

Dtenwolde added the bug label Aug 30, 2024

Dtenwolde (Contributor) commented

Hi, thank you for submitting the issue :) I was able to reproduce it using your example.
The duplicate IDs in the vertex table seem to cause the CSR index we use for path-finding to be built incorrectly.
My tests always assumed unique IDs in the vertex table. If that is not the case, the extension should at least give a nicer error or remove duplicate vertices.
I'll look into it :)
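
Until that is handled, duplicate key values can be found with a plain aggregate query before creating the property graph. The following is only a sketch: the table `v_demo` is a made-up stand-in that mirrors the data of `v` from the reproduction, and it says nothing about how the extension itself detects or handles duplicates.

```sql
-- Hypothetical stand-in for the vertex table from the reproduction.
CREATE TABLE v_demo (x VARCHAR);
INSERT INTO v_demo VALUES ('a'), ('b'), ('b');

-- List every key value that occurs more than once.
-- An empty result means the column is safe to use as a REFERENCES target.
SELECT x, count(*) AS occurrences
FROM v_demo
GROUP BY x
HAVING count(*) > 1;
```

For the data above, the query returns a single row: `b` with 2 occurrences.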

Dtenwolde changed the title from "Non unique vertex values (pointed to by DESTINATION KEY (...) REFERENCES ) in graph cause ANY SHORTEST v-[e]-> +(v) queries to fail with "INTERNAL error"" to "Non unique vertices cause internal error" Aug 30, 2024
Dtenwolde changed the title from "Non unique vertices cause internal error" to "Non unique vertices with path-finding cause internal error" Aug 30, 2024


bburky commented Aug 30, 2024

Ok, that's what I figured after playing with it a bit more.

Hopefully you'll find a good solution. One of the nice things about the duckdb integration is that I could go directly from loading JSON to querying that data in a graph. Sadly, duckdb doesn't seem to provide an easy way to add a unique constraint during JSON loading (or when I turn it into a table with CREATE TABLE AS)? I guess I could add a UNIQUE INDEX after creating the table; that may not be too bad.
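
Adding the uniqueness guarantee after loading could look roughly like this. The names `v_raw`, `v_dedup`, and `v_dedup_x_unique` are invented for illustration, and `v_raw` stands in for whatever table the CREATE TABLE AS over the JSON data produced.

```sql
-- Stand-in for a table created from loaded JSON (hypothetical name and data).
CREATE TABLE v_raw (x VARCHAR);
INSERT INTO v_raw VALUES ('a'), ('b'), ('b');

-- Collapse duplicate key values while copying into the table
-- that will be used as the vertex table.
CREATE TABLE v_dedup AS SELECT DISTINCT x FROM v_raw;

-- Enforce uniqueness from here on; this statement fails if duplicates remain.
CREATE UNIQUE INDEX v_dedup_x_unique ON v_dedup (x);
```

SELECT DISTINCT is enough here only because the key is the whole row. With additional property columns, the duplicates would need to be resolved per key instead, for example by deciding which of the conflicting rows to keep.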
