Nested @embed directive affects processing at a higher level #171

Closed
rybesh opened this issue Jun 1, 2021 · 8 comments

Comments

rybesh commented Jun 1, 2021

Describe the bug

When I create a frame with a nested @embed directive, it seems to affect frame processing at a higher level. This differs from the behavior of the framing processor on the JSON-LD Playground.

To Reproduce

Take for example the following graph:

{
  "@graph": [
    {
      "@id": "http://n2t.net/ark:/39333/ncg/dataset",
      "@type": "http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag",
      "member": [
        "http://n2t.net/ark:/39333/ncg/place/NCG11248",
        "http://n2t.net/ark:/39333/ncg/place/NCG07554",
        "http://n2t.net/ark:/39333/ncg/place/NCG03755"
      ]
    },
    {
      "@id": "http://n2t.net/ark:/39333/ncg/place/NCG03755",
      "@type": "http://n2t.net/ark:/39333/ncg/type#Mountain",
      "county": "http://n2t.net/ark:/39333/ncg/place/NCG11248",
      "label": "Crawford Mountain"
    },
    {
      "@id": "http://n2t.net/ark:/39333/ncg/place/NCG07554",
      "@type": "http://n2t.net/ark:/39333/ncg/type#Community",
      "county": "http://n2t.net/ark:/39333/ncg/place/NCG11248",
      "label": "Ichley"
    },
    {
      "@id": "http://n2t.net/ark:/39333/ncg/place/NCG11248",
      "@type": "http://n2t.net/ark:/39333/ncg/type#County",
      "label": "Orange County",
      "description": "Not to be confused with Orange County, CA"
    }
  ],
  "@context": {
    "label": {
      "@id": "http://www.w3.org/2004/02/skos/core#label"
    },
    "description": {
      "@id": "http://www.w3.org/2004/02/skos/core#note"
    },
    "county": {
      "@id": "http://n2t.net/ark:/39333/ncg/vocab#county",
      "@type": "@id"
    },
    "member": {
      "@id": "http://www.w3.org/2000/01/rdf-schema#member",
      "@type": "@id"
    }
  }
}

And this frame:

{
  "@context": {
    "@base": "http://n2t.net/ark:/39333/ncg/place/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ncv": "http://n2t.net/ark:/39333/ncg/vocab#",
    "nct": "http://n2t.net/ark:/39333/ncg/type#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "records": {
      "@container": "@set",
      "@type": "@id",
      "@id": "rdfs:member"
    },
    "county": {
      "@container": "@set",
      "@type": "@id",
      "@id": "ncv:county"
    }
  },
  "@type": "rdf:Bag",
  "records": {
    "@id": {},
    "county": {
      "@embed": "@always",
      "@explicit": true,
      "skos:label": {}
    }
  }
}

(See the above on the playground).

Expected behavior

On the JSON-LD Playground, the above graph and frame produce the results below, which are what I would expect: county nodes are always embedded and respect the @explicit directive, but this does not affect the records level.

{
  "@context": {
    "@base": "http://n2t.net/ark:/39333/ncg/place/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ncv": "http://n2t.net/ark:/39333/ncg/vocab#",
    "nct": "http://n2t.net/ark:/39333/ncg/type#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "records": {
      "@container": "@set",
      "@type": "@id",
      "@id": "rdfs:member"
    },
    "county": {
      "@container": "@set",
      "@type": "@id",
      "@id": "ncv:county"
    }
  },
  "@id": "../dataset",
  "@type": "rdf:Bag",
  "records": [
    {
      "@id": "NCG11248",
      "@type": "nct:County",
      "county": [],
      "skos:label": "Orange County",
      "skos:note": "Not to be confused with Orange County, CA"
    },
    {
      "@id": "NCG07554",
      "@type": "nct:Community",
      "county": [
        {
          "@id": "NCG11248",
          "@type": "nct:County",
          "skos:label": "Orange County"
        }
      ],
      "skos:label": "Ichley"
    },
    {
      "@id": "NCG03755",
      "@type": "nct:Mountain",
      "county": [
        {
          "@id": "NCG11248",
          "@type": "nct:County",
          "skos:label": "Orange County"
        }
      ],
      "skos:label": "Crawford Mountain"
    }
  ]
}

However, using Titanium I get the following results:

{
    "@id": "../dataset",
    "@type": "rdf:Bag",
    "records": [
        {
            "@id": "NCG03755",
            "@type": "nct:Mountain",
            "county": [
                {
                    "@id": "NCG11248",
                    "@type": "nct:County",
                    "skos:label": "Orange County"
                }
            ],
            "skos:label": "Crawford Mountain"
        },
        {
            "@id": "NCG07554",
            "@type": "nct:Community",
            "county": [
                {
                    "@id": "NCG11248",
                    "@type": "nct:County",
                    "skos:label": "Orange County"
                }
            ],
            "skos:label": "Ichley"
        },
        "NCG11248"
    ],
    "@context": {
        "@base": "http://n2t.net/ark:/39333/ncg/place/",
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        "ncv": "http://n2t.net/ark:/39333/ncg/vocab#",
        "nct": "http://n2t.net/ark:/39333/ncg/type#",
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "records": {
            "@container": "@set",
            "@type": "@id",
            "@id": "rdfs:member"
        },
        "county": {
            "@container": "@set",
            "@type": "@id",
            "@id": "ncv:county"
        }
    }
}

Note that the node for Orange County at the records level is just an @id reference.

Additional context

It seems to me that the records level is being processed with the default "@embed": "@once" directive, and that the embeddings at the county level are being counted as that one embedding, so Orange County is never embedded at the records level.

filip26 added a commit that referenced this issue Jun 1, 2021
Owner

filip26 commented Jun 1, 2021

Hi,
I've added the provided example as a new test and it seems to work well with 1.1.0-SNAPSHOT. Please, could you confirm that or provide more details? Thank you

filip26 added a commit that referenced this issue Jun 1, 2021
Author

rybesh commented Jun 1, 2021

I tried the latest snapshot, and got the same results. It seems that it is an issue with how I am using Titanium.

I am starting with an N-Triples file, and processing it as follows:

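    // Imports assumed from the Titanium 1.x package layout (not part of the original post).
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import com.apicatalog.jsonld.JsonLd;
    import com.apicatalog.jsonld.JsonLdError;
    import com.apicatalog.jsonld.document.Document;
    import com.apicatalog.jsonld.document.JsonDocument;
    import com.apicatalog.jsonld.document.RdfDocument;
    import com.apicatalog.jsonld.http.media.MediaType;
    import jakarta.json.JsonObject;

    // Pretty and err(...) are project-local helpers that are not shown here.
    private static boolean frame(Path datasetPath, Path framePath) {
        try (BufferedReader datasetReader = Files.newBufferedReader(datasetPath);
                BufferedReader frameReader = Files.newBufferedReader(framePath)) {
            // Parse the N-Triples input (a subset of N-Quads) into an RDF document.
            Document rdf = RdfDocument.of(MediaType.N_QUADS, datasetReader);
            // Convert the RDF to JSON-LD, then apply the frame.
            Document jsonld = JsonDocument.of(JsonLd.fromRdf(rdf).get());
            Document jsonldFrame = JsonDocument.of(frameReader);
            JsonObject results = JsonLd.frame(jsonld, jsonldFrame).get();

            Pretty.createWriter(System.out).writeObject(results);

            return true;

        } catch (IOException | JsonLdError e) {
            return err("Failed to frame dataset using %s: %s", framePath, e);
        }
    }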

When I do this I get the bad results described above, even with the latest snapshot. However, it works if I first use riot to convert my N-Triples file to a flattened JSON-LD graph, and then use the following code to do the processing:

    private static boolean frame(Path datasetPath, Path framePath) {
        try (BufferedReader datasetReader = Files.newBufferedReader(datasetPath);
                BufferedReader frameReader = Files.newBufferedReader(framePath)) {
            // The dataset is already flattened JSON-LD here, so no fromRdf step is needed.
            Document jsonld = JsonDocument.of(datasetReader);
            Document jsonldFrame = JsonDocument.of(frameReader);
            JsonObject results = JsonLd.frame(jsonld, jsonldFrame).get();

            Pretty.createWriter(System.out).writeObject(results);

            return true;

        } catch (IOException | JsonLdError e) {
            return err("Failed to frame dataset using %s: %s", framePath, e);
        }
    }

Am I not converting from N-Triples to JSON-LD correctly?

Owner

filip26 commented Jun 1, 2021

It looks like there is a bug in the N-Triples parser and/or in the fromRdf implementation. Can you share the N-Triples input file, or check whether the intermediate result after JsonLd.fromRdf(rdf).get() is correct?

Author

rybesh commented Jun 1, 2021

Here's a zip with the N-Triples file and the JSON-LD that fromRdf produced. Looks like it is missing the "@graph" : [] around all the graph nodes.

Archive.zip

Owner

filip26 commented Jun 1, 2021

That's interesting. fromRdf uses @graph to denote named graphs. In that case @graph is paired with an @id representing the graph IRI. I'm not sure yet why framing fails without it.

A quick workaround, for now, could be to add "@graph": [] manually.

Author

rybesh commented Jun 1, 2021

I still get the same problematic output even if I manually add the @graph: [].

Graph:

{
  "@graph": [
    {
      "@id": "http://n2t.net/ark:/39333/ncg/dataset",
      "@type": [
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag"
      ],
      "http://www.w3.org/2000/01/rdf-schema#member": [
        {
          "@id": "http://n2t.net/ark:/39333/ncg/place/NCG03755"
        },
        {
          "@id": "http://n2t.net/ark:/39333/ncg/place/NCG07554"
        },
        {
          "@id": "http://n2t.net/ark:/39333/ncg/place/NCG11248"
        }
      ]
    },
    {
      "@id": "http://n2t.net/ark:/39333/ncg/place/NCG03755",
      "http://n2t.net/ark:/39333/ncg/vocab#county": [
        {
          "@id": "http://n2t.net/ark:/39333/ncg/place/NCG11248"
        }
      ],
      "@type": [
        "http://n2t.net/ark:/39333/ncg/type#Mountain"
      ],
      "http://www.w3.org/2004/02/skos/core#label": [
        {
          "@value": "Crawford Mountain"
        }
      ]
    },
    {
      "@id": "http://n2t.net/ark:/39333/ncg/place/NCG07554",
      "http://n2t.net/ark:/39333/ncg/vocab#county": [
        {
          "@id": "http://n2t.net/ark:/39333/ncg/place/NCG11248"
        }
      ],
      "@type": [
        "http://n2t.net/ark:/39333/ncg/type#Community"
      ],
      "http://www.w3.org/2004/02/skos/core#label": [
        {
          "@value": "Ichley"
        }
      ]
    },
    {
      "@id": "http://n2t.net/ark:/39333/ncg/place/NCG11248",
      "@type": [
        "http://n2t.net/ark:/39333/ncg/type#County"
      ],
      "http://www.w3.org/2004/02/skos/core#label": [
        {
          "@value": "Orange County"
        }
      ],
      "http://www.w3.org/2004/02/skos/core#note": [
        {
          "@value": "Not to be confused with Orange County, CA"
        }
      ]
    }
  ]
}

Frame:

{
  "@context": {
    "@base": "http://n2t.net/ark:/39333/ncg/place/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ncv": "http://n2t.net/ark:/39333/ncg/vocab#",
    "nct": "http://n2t.net/ark:/39333/ncg/type#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "records": {
      "@container": "@set",
      "@type": "@id",
      "@id": "rdfs:member"
    },
    "county": {
      "@container": "@set",
      "@type": "@id",
      "@id": "ncv:county"
    }
  },
  "@type": "rdf:Bag",
  "records": {
    "@id": {},
    "county": {
      "@embed": "@always",
      "@explicit": true,
      "skos:label": {}
    }
  }
}

Result:

{
  "@id": "../dataset",
  "@type": "rdf:Bag",
  "records": [
    {
      "@id": "NCG03755",
      "@type": "nct:Mountain",
      "county": [
        {
          "@id": "NCG11248",
          "@type": "nct:County",
          "skos:label": "Orange County"
        }
      ],
      "skos:label": "Crawford Mountain"
    },
    {
      "@id": "NCG07554",
      "@type": "nct:Community",
      "county": [
        {
          "@id": "NCG11248",
          "@type": "nct:County",
          "skos:label": "Orange County"
        }
      ],
      "skos:label": "Ichley"
    },
    "NCG11248"
  ],
  "@context": {
    "@base": "http://n2t.net/ark:/39333/ncg/place/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ncv": "http://n2t.net/ark:/39333/ncg/vocab#",
    "nct": "http://n2t.net/ark:/39333/ncg/type#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "records": {
      "@container": "@set",
      "@type": "@id",
      "@id": "rdfs:member"
    },
    "county": {
      "@container": "@set",
      "@type": "@id",
      "@id": "ncv:county"
    }
  }
}

Owner

filip26 commented Jun 1, 2021

The missing @graph is not the issue; the member order is. If

        {
          "@id": "http://n2t.net/ark:/39333/ncg/place/NCG11248"
        }

is the first item in the member array, as in your first example, then it works as expected. By the way, the JSON-LD Playground produces the same result as Titanium for this input; see here.

Author

rybesh commented Jun 1, 2021

Ah, I see. That makes sense. Adding "@embed": "@always" under records seems to give me the results I want. Though what I really want is to have records as an id map; see #172.
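
The order dependence discussed in this thread can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not Titanium's or jsonld.js's framing algorithm, and the class `EmbedOrderSketch`, the method `frameRecords`, and the `COUNTY_OF` lookup table are hypothetical names introduced for illustration. The model assumes that a records-level node framed with the default "@embed": "@once" becomes a bare reference if it was already embedded anywhere earlier in the walk, and that the nested "@embed": "@always" on county always embeds.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: this is NOT Titanium's or jsonld.js's framing algorithm.
class EmbedOrderSketch {

    // Hypothetical lookup table: record id -> county id, taken from the issue's data.
    static final Map<String, String> COUNTY_OF = Map.of(
            "NCG03755", "NCG11248",
            "NCG07554", "NCG11248");

    /**
     * Walks the member ids in order and reports whether each one would be
     * embedded or left as a reference at the records level in this toy model.
     */
    static List<String> frameRecords(List<String> members, boolean alwaysEmbedRecords) {
        Set<String> alreadyEmbedded = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String id : members) {
            // Assumed "@once" behaviour: a node embedded earlier becomes a reference.
            if (!alwaysEmbedRecords && alreadyEmbedded.contains(id)) {
                result.add(id + " -> reference");
                continue;
            }
            alreadyEmbedded.add(id);
            result.add(id + " -> embedded");
            // The nested county frame uses "@embed": "@always", so it always embeds.
            String county = COUNTY_OF.get(id);
            if (county != null) {
                alreadyEmbedded.add(county);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> countyFirst = List.of("NCG11248", "NCG07554", "NCG03755");
        List<String> countyLast = List.of("NCG03755", "NCG07554", "NCG11248");
        System.out.println(frameRecords(countyFirst, false)); // all three embedded
        System.out.println(frameRecords(countyLast, false));  // NCG11248 is a reference
        System.out.println(frameRecords(countyLast, true));   // all three embedded
    }
}
```

In this model, the county-first order matches the original example, the county-last order matches the output produced from the fromRdf result, and passing `true` corresponds to adding "@embed": "@always" under records in the frame.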

rybesh closed this as completed Jun 1, 2021