-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bitnine Tech] - When it executes VLE(Variable Length Edge) query, why it occurs an error just the first try? #365
Comments
Thank you for your question! For any issues like this, please provide the following information to allow us to look into it -
Unfortunately, until I know more, I cannot give you an answer. john |
The information tested is as follows.
Nested Loop (cost=0.01..1860999881.03 rows=33830636617 width=64) (actual time=3.128..9990.038 rows=1 loops=1) I hope this answer has been sufficient. Hyundong |
Does the command error if you don't use EXPLAIN at all? |
@jrgemignani I get the same error, even if I don't use EXPLAIN |
@Hyundong-Seo, I am unable to reproduce your issue. So, I will need more information -
For that id - I should note that if it is from the vertex's id, you need to use the id() function -
|
@jrgemignani
When I tried it again, but it was executed today. |
That happens sometimes and it is hard to figure out why. If you happen to find something like this again, that is repeatable with a specific dataset, please let us know. |
I know this is closed. but I am getting the same issue. After a long time of having no problem with VLE requests just today - on one instance of my database and not on another a request with * on the edge request causes the ERROR: "insert_vertex_edge: failed to insert" to appear. This occurs on a variety of requests that usually work just fine. I have found that if I establish an ongoing session e.g. using psql that I get this error the first time the request is run. Subsequent runs give the correct result without an error. Requests going in through my App API always fail because each is run in a new session. Why would running the query twice fix the problem? |
@avowkind What version of AGE are you using? What PG version? |
From the code it looks to only give that specific error if the start or end vertex for the edge don't exist at the time of the vle command execution. |
is it worth creating a new bug report? Describe the bug It is possible to get the database into a state where a request of the form: SELECT * FROM cypher('fishpond', $$
MATCH (p:PropertyType)<-[edge:typeOf*]-(node:PropertyType)
WHERE p.id = 'fish_weight_g'
RETURN node, edge
$$) as ( node agtype, edge agtype); returns the error
While the same request with [edge:typeOf] i.e no variable length path on the edge succeeds However, a second attempt in the same session will succeed. Is is an instance of #365 ( which was closed with insufficient information). How are you accessing AGE (Command line, driver, etc.)?
What data setup do we need to do?
What is the necessary configuration info needed?
What is the command that caused the error? SELECT * FROM cypher('fishpond', $$
MATCH (p:PropertyType)<-[edge:typeOf*]-(node:PropertyType)
WHERE p.id = 'fish_weight_g'
RETURN node, edge
$$) as ( node agtype, edge agtype);
Expected behavior
|
@avowkind If you can provide a relatively simple dataset with a repeatable example, that would be really helpful in debugging this :) |
I can provide the database backup (its 18mb) but its not an especially complicated model. |
Anything that is repeatable is helpful. I wrote the VLE code and have thoroughly tested it and never seen that error message. So, I'm a bit blind to what might be going on. |
@avowkind I reopened this issue. Hopefully, we'll get to the bottom of it this time :) |
@avowkind Are there updates to the vertices and edges going on in the background or in another process when this happens? |
are there updates to the vertices and edges running? Short video showing the issue. Apache.Age.bug.mp4However I do have the gut feeling that this is something to do with a query or activity that left the db in an unusual state. Is it possible to have an edge without a node for example? But why would this work on the second try? |
@avowkind The second try makes sense. Try the following right before executing the command -
For you that would be just -
Oh, and paste in the output. |
When run for the first time in the session I get a result of 'false'. The following query still fails. in fact it ensures that the script fails every time instead of just the first time. |
Further to the 2nd try element in the following code the delete_global_graphs ensures that the PropertyType query fails. but the following Feature query succeeds although it is on a different entity and edge type.
|
@avowkind That it fails consistently when using that function actually helps narrow it down. |
@avowkind I know why it fails and then works - Technically speaking, it shouldn't work as that error is a fatal error, that's why it fails the first run. However, the context isn't cleaned up, so it works the second run because a context already exists and it doesn't need to build a new one. As to why it is getting an edge that doesn't have a valid start or end id and why that edge it got doesn't exist,... I'm still looking into that. For me, it is failing on an edge found in "featureOfInterest" -
|
To clarify
This [edge:featureOfInterest] is the link between an observation and a feature.
Might it be the case that we have an observation that has been lost or deleted somehow without the edge being removed. e.g .if someone did MATCH DELETE without the DETACH?
Or is it the feature end that is missing?
Thanks Andrew
… On 11 Apr 2024, at 8:41 AM, John Gemignani ***@***.***> wrote:
***@***.***(https://github.com/avowkind) I know why it fails and then works -
Technically speaking, it shouldn't work as that error is a fatal error, that's why it fails the first run. However, the context isn't cleaned up, so it works the second run because a context already exists and it doesn't need to build a new one. As to why it is getting an edge that doesn't have a valid start or end id and why that edge it got doesn't exist,... I'm still looking into that.
For me, it is failing on an edge found in "featureOfInterest" -
(gdb) p {edge_id, edge_vertex_start_id, edge_vertex_end_id}
$4 = {3096224744137124, 2533274790715812, 1407374883553780}
fishpond=# SELECT * FROM cypher('fishpond', $$MATCH ()-[edge:featureOfInterest]->() where id(edge) > 3096224744137122 and id(edge) < 3096224744137126 RETURN edge ORDER BY id(edge) $$) as (edge agtype);
edge
------------------------------------------------------------------------------------------------------------------------------------
------
{"id": 3096224744137123, "label": "featureOfInterest", "end_id": 1407374883553780, "start_id": 2533274790715811, "properties": {}}:
:edge
{"id": 3096224744137125, "label": "featureOfInterest", "end_id": 1407374883555256, "start_id": 2533274790715813, "properties": {}}:
:edge
(2 rows)
fishpond=#
—
Reply to this email directly, [view it on GitHub](#365 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AAMFYFLOHEUCESYI3TEYUXDY4WPY3AVCNFSM6AAAAAASIWFLLGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBYGQYDCMRRGU).
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Using that query I get
"{""id"": 3096224744137123, ""label"": ""featureOfInterest"", ""end_id"": 1407374883553780, ""start_id"": 2533274790715811, ""properties"": {}}::edge"
"{""id"": 3096224744137124, ""label"": ""featureOfInterest"", ""end_id"": 1407374883553780, ""start_id"": 2533274790715812, ""properties"": {}}::edge"
"{""id"": 3096224744137125, ""label"": ""featureOfInterest"", ""end_id"": 1407374883555256, ""start_id"": 2533274790715813, ""properties"": {}}::edge”
Using this modified version
SELECT * FROM cypher('fishpond', $$
MATCH (o)-[edge:featureOfInterest]->(f)
where id(edge) > 3096224744137122
and id(edge) < 3096224744137126
RETURN o, edge, f ORDER BY id(edge) $$) as (o agtype, edge agtype, f agtype);
I get only 2 results
"{""id"": 2533274790715811, ""label"": ""Observation"", ""properties"": {""id"": ""fde14e35"", ""result"": 6, ""parameters"": {""filename"": ""Class 18 2021 feeding records Nov 21 -.xlsx"", ""sheetname"": ""South pen (mix)""}, ""resultTime"": ""2022-03-20T00:00:00Z"", ""phenomenonTime"": ""2022-03-20T00:00:00Z"", ""observedProperty"": ""mortality_fish_weight_g"", ""featureOfInterest"": ""pfr:nelson:tank:C09""}}::vertex""{""id"": 3096224744137123, ""label"": ""featureOfInterest"", ""end_id"": 1407374883553780, ""start_id"": 2533274790715811, ""properties"": {}}::edge""{""id"": 1407374883553780, ""label"": ""Feature"", ""properties"": {""id"": ""pfr:nelson:tank:C09"", ""name"": ""C09"", ""relations"": [{""id"": ""pfr:nelson:tank_group:C"", ""relation"": ""partOf""}], ""description"": ""C09"", ""featureGroup"": ""pfr:fg:tank""}}::vertex"
"{""id"": 2533274790715813, ""label"": ""Observation"", ""properties"": {""id"": ""3de89087"", ""result"": 750, ""parameters"": {""Comments"": ""2mm Japanese and 2.3mm skretting"", ""filename"": ""Class 18 2021 feeding records Nov 21 -.xlsx"", ""food_type"": ""2.3mm EP02 Otohime (g)"", ""sheetname"": ""South pen (mix) (2)""}, ""resultTime"": ""2022-03-25T00:00:00Z"", ""phenomenonTime"": ""2022-03-25T00:00:00Z"", ""observedProperty"": ""fish_food_g"", ""featureOfInterest"": ""pfr:nelson:tank:seapen_south""}}::vertex""{""id"": 3096224744137125, ""label"": ""featureOfInterest"", ""end_id"": 1407374883555256, ""start_id"": 2533274790715813, ""properties"": {}}::edge""{""id"": 1407374883555256, ""label"": ""Feature"", ""properties"": {""id"": ""pfr:nelson:tank:seapen_south"", ""name"": ""Beatrix Bay South"", ""relations"": [{""id"": ""pfr:nelson:tank_group:beatrix_bay"", ""relation"": ""partOf""}], ""description"": ""South Pen"", ""featureGroup"": ""pfr:fg:tank""}}::vertex”
So the "{""id"": 3096224744137124, ""label"": ""featureOfInterest"", ""end_id"": 1407374883553780, ""start_id"": 2533274790715812, ""properties"": {}}::edge”. Is the problem one.
Checking the start ID It does not exist.
select * from cypher('fishpond', $$
MATCH (o)
WHERE id(o) = 2533274790715812
RETURN o $$) as (o agtype);
While changing the id to 811 or 813 we get a result.
So we have a single observation record that is missing and a dangling edge.
Deleting the edge
SELECT * FROM cypher('fishpond', $$
MATCH ()-[edge:featureOfInterest]->()
where id(edge) = 3096224744137124
DELETE edge
RETURN edge ORDER BY id(edge) $$) as (edge agtype);
However I think there must be multiple examples of these deleted observations that have dangling edges.
What would a cypher query look like to find them?
Thanks Andrew
… On 11 Apr 2024, at 10:45 AM, Andrew Watkins ***@***.***> wrote:
To clarify
This [edge:featureOfInterest] is the link between an observation and a feature.
Might it be the case that we have an observation that has been lost or deleted somehow without the edge being removed. e.g .if someone did MATCH DELETE without the DETACH?
Or is it the feature end that is missing?
Thanks Andrew
> On 11 Apr 2024, at 8:41 AM, John Gemignani ***@***.***> wrote:
>
> ***@***.***(https://github.com/avowkind) I know why it fails and then works -
>
> Technically speaking, it shouldn't work as that error is a fatal error, that's why it fails the first run. However, the context isn't cleaned up, so it works the second run because a context already exists and it doesn't need to build a new one. As to why it is getting an edge that doesn't have a valid start or end id and why that edge it got doesn't exist,... I'm still looking into that.
>
> For me, it is failing on an edge found in "featureOfInterest" -
>
> (gdb) p {edge_id, edge_vertex_start_id, edge_vertex_end_id}
> $4 = {3096224744137124, 2533274790715812, 1407374883553780}
>
> fishpond=# SELECT * FROM cypher('fishpond', $$MATCH ()-[edge:featureOfInterest]->() where id(edge) > 3096224744137122 and id(edge) < 3096224744137126 RETURN edge ORDER BY id(edge) $$) as (edge agtype);
> edge
>
> ------------------------------------------------------------------------------------------------------------------------------------
> ------
> {"id": 3096224744137123, "label": "featureOfInterest", "end_id": 1407374883553780, "start_id": 2533274790715811, "properties": {}}:
> :edge
> {"id": 3096224744137125, "label": "featureOfInterest", "end_id": 1407374883555256, "start_id": 2533274790715813, "properties": {}}:
> :edge
> (2 rows)
>
> fishpond=#
>
> —
> Reply to this email directly, [view it on GitHub](#365 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AAMFYFLOHEUCESYI3TEYUXDY4WPY3AVCNFSM6AAAAAASIWFLLGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBYGQYDCMRRGU).
> You are receiving this because you were mentioned.Message ID: ***@***.***>
|
@avowkind The odd thing is that, at least for me, the edge does not exist when I do a select. Yet, the load logic clearly gets that edge, which doesn't seem to exist. This is a weird case for sure. |
Only the end vertex for the edge exists.
|
This query
SELECT * FROM cypher('fishpond', $$
MATCH ()-[edge:featureOfInterest]->()
where id(edge) = 3377699720847780
RETURN edge ORDER BY id(edge) $$) as (edge agtype);
Returns null
But this
SELECT *
FROM fishpond._ag_label_edge e
WHERE id::text = '3377699720847780’;
Return s
"3377699720847780""2533274790715812""2251799813685250""{}"
Where vertex 2533274790715812 does not exist.
So the above delete also doesn’t work.
Here is SQL to list dangling edges
SELECT e.*
FROM fishpond._ag_label_edge e
LEFT JOIN fishpond._ag_label_vertex v ON e.start_id = v.id
WHERE v.id IS NULL;
Let me know if you think this is right. - I get one result - the known issue.
So I then run
DELETE FROM fishpond._ag_label_edge e
WHERE id::text = '3377699720847780’;
Removing the line from the edge table.
And this fixes everything. The VLE search now runs ok.
So to sum up.
We have a dangling edge in ._ag_label_edge where the start_id does not exist or was somehow removed from the vertices table.
The check for dangling edges is
Here is SQL to list dangling edges
SELECT e.*
FROM fishpond._ag_label_edge e
LEFT JOIN fishpond._ag_label_vertex v ON e.start_id = v.id
WHERE v.id IS NULL;
And then you can delete an edge using
DELETE FROM fishpond._ag_label_edge e
WHERE id::text = '{id}'
I had to cast the id from a graphID.
The underlying question is whether a query of the form
MATCH. x pattern
DELETE x
Without the DETACH leaves dangling edges by design.
Thanks Andrew
… On 11 Apr 2024, at 11:36 AM, John Gemignani ***@***.***> wrote:
Only the end vertex for the edge exists.
fishpond=# SELECT * FROM cypher('fishpond', $$ MATCH (u) WHERE id(u) = 2533274790715812 RETURN u $$) as (u agtype);
u
---
(0 rows)
fishpond=# SELECT * FROM cypher('fishpond', $$ MATCH (u) WHERE id(u) = 1407374883553780 RETURN u $$) as (u agtype);
u
------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------
{"id": 1407374883553780, "label": "Feature", "properties": {"id": "pfr:nelson:tank:C09", "name": "C09", "relations": [{"id": "pfr:n
elson:tank_group:C", "relation": "partOf"}], "description": "C09", "featureGroup": "pfr:fg:tank"}}::vertex
(1 row)
fishpond=#
—
Reply to this email directly, [view it on GitHub](#365 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AAMFYFMAUKHHYSA7LOBBMFLY4XEGHAVCNFSM6AAAAAASIWFLLGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBYGYYDENJZGQ).
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
@avowkind Yeah, the VLE goes on the assumption that everything is set up correctly, meaning every edge that exists has vertices that exist. If it comes across issues, it just quits because it might be a data corruption issue. I'm going to put a PR together to have the VLE highlight these cases when the VLE loads. This way a user can find them. I will also fix it so that it just gives warning messages and continues. Thoughts? |
Thanks - this was a good debug experience.
I think my key underlying question is about whether it’s possible to create a dangling edge via cypher queries alone?
If not we would have to look at whether the vertex got deleted by some other route or something happened during creation.
Also given that the request was for VLE for paths like this
SELECT * FROM cypher('fishpond', $$
MATCH (p:PropertyType)<-[edge:typeOf*0..]-(node:PropertyType)
WHERE p.id = 'fish_weight_g'
RETURN node, edge
$$) as ( node agtype, edge agtype);
I wonder that we went anywhere near edges that link to observations.
While there are only a few 100 items in the property graph and a few 1000 in the feature graph there will be millions in the observation graph.
Thanks Andrew
… On 11 Apr 2024, at 12:13 PM, John Gemignani ***@***.***> wrote:
***@***.***(https://github.com/avowkind) Yeah, the VLE goes on the assumption that everything is set up correctly, meaning every edge that exists has vertices that exist. If it comes across issues, it just quits because it might be a data corruption issue.
I'm going to put a PR together to have the VLE highlight these cases when the VLE loads. This way a user can find them. I will also fix it so that it just gives warning messages and continues.
Thoughts?
—
Reply to this email directly, [view it on GitHub](#365 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AAMFYFKZAQP2N55C4YTWPHTY4XITRAVCNFSM6AAAAAASIWFLLGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBYGYZTCMRQGU).
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Yes, it was :)
This has to do with how the VLE load and context system works. In the beginning, I tried to limit what it loaded. However, that ended up requiring more contexts, more reloads, and more complexity. So, I opted for just loading everything which meant a bigger context, but only one per graph and fewer loads. The big difference with the VLE is that it has a graph representation, not tables. This allows graph algorithms to work more directly with it. |
@avowkind I have modified the VLE to help identify these "dangling" edges -
The remainder of the output was left out. But, the query will run with this modification. I may change some of the wording before it goes to a PR. |
@avowkind PG15 should have a docker build that is updated now. |
@avowkind All supported branches are now updated. As this is now understood and resolved, I will close the ticket. |
The first time I try the VLE query, I always get an insert_vertex_edge error.
A second attempt at the same query executes the query.
The text was updated successfully, but these errors were encountered: