Support JSON values that aren’t mapped #4
Comments
👍 This is a better (IMO) solution for native JSON values such as GeoJSON than requiring every community to map all of their constructs into -LD. To quote (with slight edits) the example from the original issue:
This seems very sensible, and fits with our charter. We can later make |
I would prefer a more LD friendly solution for GeoJSON. #7 ? |
@akuckartz I didn't mean to imply that GeoJSON-LD was a bad thing to do, just that if the requirement is "support native JSON data structures in the JSON-LD context", then GeoJSON could be managed that way without then layering on GeoJSON-LD. GeoJSON-LD is great ... but if you don't need to interact with the -LD part of it, just record the JSON structure, there's overhead that could be minimized. There's a separate issue for the list of lists feature beyond #7 that was already accepted to be part of 1.1. #7 would additionally let the semantics of the list of lists be expressed. |
The key is the expanded form; my thought was that the previous example might expand to something like the following:
[{
  "@id": "http://example/foo",
  "http://example/json-value": [{
    "@value": {"native": "json"},
    "@type": "@json"
  }]
}]
Regarding #7, this is not in conflict with a potentially more semweb-y mapping for GeoJSON, but there are other reasons why you might want to preserve raw JSON within JSON-LD. When turned into RDF, we would need a datatype to describe the value, so that you would get something like the following:
@base <http://example/foo> .
@prefix jsonld: <https://www.w3.org/ns/json-ld#> .
<foo> <json-value> '{"native": "json"}'^^jsonld:json .
Where the JSON is normalized to use minimal whitespace. |
I think defining a jsonld:json datatype would make a lot of sense at this day and age... and would offer a clean solution. |
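As a rough illustration of the literal form discussed above, here is a minimal Python sketch that serializes a JSON value with minimal whitespace and wraps it in an N-Triples statement. The helper names (`minimal_json`, `json_literal_triple`) are hypothetical, the datatype IRI is simply the `jsonld:json` IRI from the comment above, and this is not taken from any JSON-LD processor.

```python
import json

# Datatype IRI as written in the thread (jsonld:json); treated here as an assumption.
JSON_DATATYPE = "https://www.w3.org/ns/json-ld#json"


def minimal_json(value):
    """Serialize with no insignificant whitespace; key order is left as given."""
    return json.dumps(value, separators=(",", ":"), ensure_ascii=False)


def json_literal_triple(subject, predicate, value):
    """Hypothetical helper: one N-Triples line carrying `value` as a typed literal."""
    lexical = minimal_json(value)
    # Escape backslashes and double quotes for the N-Triples string form.
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}"^^<{JSON_DATATYPE}> .'


triple = json_literal_triple(
    "http://example/foo", "http://example/json-value", {"native": "json"}
)
print(triple)
```

Note that this only removes whitespace; it does nothing about key order or number formatting, which is where the later discussion goes.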
Will need to note that the whole feature is somewhat implementation dependent. Native JSON serialization/deserialization issues may have some effect on key ordering, float representation, etc. |
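To make the implementation-dependence concrete, the following sketch uses Python's standard `json` module to parse a JSON text and write it back out. The input string is made up for illustration. The parsed value is unchanged, but the original spelling of numbers, escapes, and whitespace is not recoverable.

```python
import json

# Made-up input: odd spacing, a trailing zero, an exponent, and a \u escape.
original = '{ "b": 1.50,   "a": 4.2E+1, "c": "\\u0041" }'

parsed = json.loads(original)
reserialized = json.dumps(parsed)

print(original)
print(reserialized)

# Python dicts keep insertion order, so key order survives here; an
# implementation backed by an unordered map could also reorder the keys.
```

Another library or language may pick a different spelling for the same value (for example `42` rather than `42.0`), which is the kind of variation the comment above warns about.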
Should perhaps be |
WG resolved to add a @JSON keyword, mapped to jsonld:JSON to identify the JSON data type. |
I'm concerned this opens a Pandora's box...or maybe several. Sadly, I wasn't here for the call and had overlooked this issue earlier, so I fear I'm only just now raising these concerns...
We're (rather passively) introducing a namespace specific to JSON-LD:
We're inviting developers to avoid/ignore the graph model JSON-LD encodes:
{
  "@context": {
    "data": {"@type": "@json"}
  },
  "data": {"everything": "imaginable"}
}
I fear providing this as a "solution for native JSON values such as GeoJSON" sends the wrong message...and it begins to invalidate the reason to have JSON-LD at all (see the example above). Are we also planning to do this for YAML? Because the use cases would be identical... |
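The concern can be seen in a small sketch of what a processor would do with such a term. This is a heavily simplified, hypothetical function, not the JSON-LD expansion algorithm: it assumes a made-up vocabulary prefix (`http://example/`) for the term IRI, handles only terms typed `@json`, and drops everything else. The point is that the value is wrapped verbatim, so nothing inside it takes part in the graph model.

```python
def expand_json_typed_terms(doc, vocab="http://example/"):
    """Hypothetical sketch: wrap values of @json-typed terms without interpreting them."""
    context = doc.get("@context", {})
    expanded = {}
    for term, value in doc.items():
        if term.startswith("@"):
            continue  # skip keywords such as @context
        definition = context.get(term)
        if isinstance(definition, dict) and definition.get("@type") == "@json":
            # The value is kept as-is; nested keys are never mapped to IRIs.
            expanded[vocab + term] = [{"@value": value, "@type": "@json"}]
    return [expanded]


doc = {
    "@context": {"data": {"@type": "@json"}},
    "data": {"everything": "imaginable"},
}
result = expand_json_typed_terms(doc)
print(result)
```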
Having implemented the RDF canonicalization spec with a minor headache, this sounds like a full on migraine. yo dawg I heard you like canonicalization so I put a tree data serialization canonicalization algorithm in your graph data serialization canonicalization algorithm so you can normalize while you normalize |
@BigBlueHat I appreciate (and to some extent share) that concern, but I wonder if there's a historical analogy: I've not seen the kind of problem you are describing using XML literals within RDF/XML. That may not be a valid analogy, but it's a bit suggestive... |
Re YAML, I don't think we would do that, because (a) no one has asked for it and (b) YAML is a non-normative deliverable of how the patterns of JSON-LD could be used in YAML to accomplish the same ends. The charter says: "JSON-LD 1.1 examples specified in YAML" not a normative YAML-LD Rec. We would be introducing a namespace, yes. We could also (as discussed on the call) add the data type to the RDF namespace, but we at least would need to document it. The consensus was that the creation of a new namespace was less work than putting it into an existing one, and a future RDF WG could take it over down the line. I agree with @ajs6f about the use of XML literals in RDF/XML. Yes, you can create pointless RDF that simply wraps a single literal in XML or JSON ... but why would you bother to do that? It seems like an enormous waste of your time other than to meet some badly worded RFP. |
As @ajs6f points out, other RDF syntaxes that leverage languages have a similar mechanism for including raw XML or HTML, this is really no different. For RDF canonicalization, such values would be treated just as other datatyped literals. Part of the RDF serialization aspects should include whitespace normalization, which is fairly standard in JSON, so I don't really appreciate why things such as RDF Dataset Normalization and signatures would be at any disadvantage. |
@BigBlueHat worries about introducing a new namespace:
In fact, this namespace already exists for URIs such as However, we don't need to use this namespace, and @iherman suggested that we could probably use the RDF namespace I agree that this no longer serves for GeoJSON, and we should consider some other example, but such examples doubtless exist, which is why this is a compelling feature. |
I guess we can all agree that this is (a) technically doable (b) it may require normalization of the literal (at least optionally) and (c) it is not fundamentally different from the XML and HTML datatypes. (E.g., if we do have a standard for RDF canonicalization at some point, that standard must address the issue of literals and their normalization (or not), and the issues raised by @cwebber are also genuine problems for HTML literals.) However. I guess we are back to our design principles set out at the beginning of the WG's life. We should not do this just because we can; we should have proper use cases, see relevant section. I cannot judge whether GeoJSON is a use case or not. |
@azaroth42 is there a github issue for the list of lists support? If not, may I create one? |
It isn't that simple. Whitespace is not the only issue. We will probably have to support something like this json canonicalization spec or something. That's a lot of extra work. There's also a huge risk that people will open this loophole much, much wider than is anticipated, marking giant swaths of content as json-only. Yeah, I guess that's true for XML too, but to be honest no sane person could operate on XML-RDF as if it were real XML and have things survive... it was an RDF serialization format and little more. Here people are actually working with json-ld as if it were normal json and getting reasonable RDF interop. There are pain points occasionally, and we should try to remedy those, but I think this is opening an escape hatch that a good number of people will jump straight through. Careful about rubbing this lamp... I think fulfilling this wish will have more side effects than anticipated and may undo a lot of the goals of json-ld. -1 from me. |
I share the same concerns as @cwebber and @BigBlueHat. |
Re canonicalization (or even just whitespace normalization) ... can someone describe the issue and the risk here? If one implementation serializes to a string |
@azaroth42 Those would end up being two different signatures with linked data signatures. Without canonicalizing the json exactly the same way every time, LDS will break. |
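A plain hash can stand in for a signature to show the effect. In the sketch below, two made-up serializations of the same JSON value hash differently as raw text, but hash identically once both are passed through one agreed serialization. The `canonical` function (sorted keys, no insignificant whitespace) is an assumed stand-in, not the JSON canonicalization draft and not what Linked Data Signatures actually does.

```python
import hashlib
import json


def digest(text):
    """SHA-256 over the UTF-8 bytes, used here as a stand-in for signing."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical(value):
    # Assumed stand-in for a canonical form: sorted keys, compact separators.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


# Two serializations of the same value from two imaginary implementations.
a = '{"native": "json", "n": 1}'
b = '{"n":1,"native":"json"}'

same_value = json.loads(a) == json.loads(b)
raw_match = digest(a) == digest(b)
canonical_match = digest(canonical(json.loads(a))) == digest(canonical(json.loads(b)))

print(same_value, raw_match, canonical_match)
```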
@cwebber said:
Good point, as by the time we see the data, it's in a parsed form, and we can't depend on specific representation of numbers, for example. At this point, I'd say that the work should be put on hold, certainly pending an important use case. |
We can defer, but I would like to note that canonicalization and LDS are explicitly out of scope of the WG, per the charter: https://www.w3.org/2018/03/jsonld-wg-charter.html |
@cyberphone, can you comment on the status of
@iherman, part of testing requires an RDF transform and using dataset isomorphism. At that point, the precise lexical representation of JSON literals becomes important. Certainly, this could be left out of the spec, and used in test-suite instructions, but for many reasons, settling on a canonical form for JSON literals is going to be important, if we can overcome the normative citation issues. |
My Ruby version for JSON canonicalization: https://github.com/dryruby/json-canonicalization. |
@gkellogg It is great to see a sixth incarnation of the proposal! Regarding progress the technical issues have (AFAICT...) been properly identified; the problem is rather that a bunch of people still consider canonicalization as pure stupidity. OTOH, it seems that none of the current Open Banking APIs has bought into the Base64Url-concept either. FWIW, I will do a short presentation |
This issue was discussed in a meeting.
JSON datatype
Rob Sanderson: link: #4
Rob Sanderson: PR: w3c/json-ld-api#72
Rob Sanderson: we also have discussed the JSON datatype on github … Gregg, you’ve been the most involved (as always) … could you summarize?
Gregg Kellogg: the issue comes down to representation … if you are going to describe both the lexical and value space … somewhat like HTML … the lexical space cannot be guaranteed … the JSON literal quality is lost when it’s turned into a native representation … you lose the original key ordering, key escaping, and lexical numerical representations … so it seems we will need to canonicalize … which has been referenced in the issue … it’s sadly not as close to done as I’d hoped … and we can’t count on it being final in time … so, do we care if two implementations use the same canonicalization … so we have done some things about do we use Integer or Doubles for numbers … so when you’d turn the JSON literal into RDF (in the toRDF space), we do need to say something about that at least … and the elimination of whitespace … and the ordering of keys … I think that can be done … there’s a lot of detail in that, but we should be able to reference ECMAScript for this … or we could do it ourselves
Rob Sanderson: last time we talked about the canonicalization issue … we also talked about HTML being not easily canonicalizable
Gregg Kellogg: HTML is a little different … they will preserve order, and whitespace … so you do have the opportunity to return to that result
Ivan Herman: well, attribute order and things are not covered … this would be a problem if you were to attempt to sign an HTML document
Gregg Kellogg: if we weren’t in an era when signatures weren’t as important as they are now, then maybe we wouldn’t need to care about this so much
Rob Sanderson: so, is there a JSON-LD document that could include a JSON “native” data type that also needs to be signed … so if the only use case is to import GeoJSON … do we need to worry
Ivan Herman: I have spent time on this issue with others … aside from the canonicalization problem … if we do make a native JSON type, we will have to put it into some namespace–rdf: or jsonld:
Rob Sanderson: +1 to RDF namespace
Ivan Herman: if we do that, we’ll have to write the SWIG mailing list, to announce the new datatype, etc. … we can do this as part of our document … the other problem is … I did put a reference in the issue for the rules we have to follow when we point to something normatively … my first reading is that unfortunately, this JSON canonicalization specification cannot be referred to normatively … the second problem is bringing our own canonicalization into our document … if we do that, I can safely say the Director would say no to that … so, we can’t just take an IETF spec and put it into a W3C spec … all of these are admin problems … But I am still not convinced that we need the canonicalization as a normative part of our spec … we could say that someone else may do this and reference forthcoming work … but when the issue is that we have a JSON portion we want to store in RDF … we can state that the only expectation is that [the same processor will produce the same output] … none of the arguments that I heard is that canonicalization needs to be normative
Pierre-Antoine Champin: http://tinyurl.com/y2gmzxf8
Pierre-Antoine Champin: I was wondering about this example … there’s an Integer in the non-canonical form … would that be canonicalized or not?
Gregg Kellogg: yes, that would be canonicalized … I don’t know any processors that would properly serialize that with a leading zero … if you’re going to the internal representation … it is the number 42 … some might do 42.0 … or 42E+0 … that would be fine, but I don’t think most JSON serializers would do that
Pierre-Antoine Champin: for the moment, we know how to sign this thing
Dave Longley: I think this falls into the same category as HTML … it’s a string in the JSON; it’s not native HTML … or a native number in the example’s case … if we’re storing stuff in a string, then store it as a string … but people want a native JSON object in their JSON
Pierre-Antoine Champin: but if you remove the leading 0 you don’t get the same signature … so I’m assuming that the signature is dealing with the order or absence of order in the object when signed … so if the object was a native JSON object, then it would already benefit … and regardless we already have this problem with other string-expressed literals
Rob Sanderson: if you instead make it value 42.0 … since no one really serializes as 042 … whatever you change here will change the signature … even though it will canonicalize as something different
Dave Longley: I disagree
Rob Sanderson: what do you disagree with?
Ivan Herman: I think in these examples, the current JSON-LD specification doesn’t say anything about what you put in strings … we don’t suggest any sort of mini-canonicalization for things like this … having built-in canonicalization for the native JSON representation … would be a departure from what we’ve done previously
Dave Longley: my response to all that is that we have very consistent rules about moving non-string data into strings … so we do have those sorts of specifications … from a native JSON value into a string … this same thing would exist for native JSON objects … for things that come in via a string, those will stay as whatever that string is … so strings have no issue … so if you take pchampin’s example, and change it to a real number: 42
Gregg Kellogg: 42, 42.0, 42.0E0, 4.2E+1 are all the same number
Dave Longley: and if you put that in the playground, check the nquads tab, you’ll find the same number
Ivan Herman: yep I acknowledge that
Rob Sanderson: maybe then it’s the playground which is at fault … I put in several examples, and the signature changes for all of these different 42’s as an integer
Dave Longley: you’re looking at the RSA signature, so you’ll see it change constantly … because that injects random data … what you need to look at is the N-Quads or normalized tabs … the data there stays the same
Gregg Kellogg: this is in the data round tripping section
Gregg Kellogg: so, imo, if we create a datatype for JSON … before there is a canonicalization for it … then we’re in danger of doing things too early … ultimately we need to deal with a canonicalized JSON
Pierre-Antoine Champin: +1
Gregg Kellogg: so the best thing we can do right now is nothing … and defer this until there is a canonicalized form … otherwise whitespace, object ordering, etc are all variable … and the literals really won’t be worth doing if any lexical representation is important … better not to do anything until a canonicalization spec exists
Ivan Herman: my take would be milder … the GeoJSON example doesn’t care about canonicalization
Rob Sanderson: +1 to ivan
Ivan Herman: with the canonicalization things deferred … and state that this feature is not recommended … so we defer it, and if/when the canonicalization becomes standard or whatever, then we at that point suggest that that spec gets used
Rob Sanderson: it would be better to have a JSON datatype and state that later we’ll do canonicalization
Dave Longley: let’s provide rules for how to produce the JSON string that match the draft – but that you can do something else and be very clear it’s preferred that everyone do the same thing
Rob Sanderson: so we should start with JSON datatypes, and just suggest that you can’t sign these
Jeff Mixter: +1 to ivan and azaroth
Gregg Kellogg: if we don’t do canonicalization now, we don’t seem to be prevented from doing it later … if we end up as a living spec, then we could do it that way … and we could also suggest that for testing purposes it is always canonicalized
Rob Sanderson: a warning or a note?
Proposed resolution: Move forwards with a JSON native data type, with a warning that it cannot be canonicalized (Rob Sanderson)
Rob Sanderson: I’d suggest a warning
Gregg Kellogg: +1
Jeff Mixter: +1
Ivan Herman: +1
Rob Sanderson: +1
Simon Steyskal: +1
Pierre-Antoine Champin: +1
Tim Cole: +1
Dave Longley: +0
Benjamin Young: +0 still have concerns about eager misuse
David I. Lehn: +0.5
Jeff Mixter: I echo bigbluehat concerns but I also have very valid reasons to add JSON to RDF data.
Dave Longley: +1 to everything Benjamin is saying … but that we should really also have JSON literals … but they should also all be converted to the same strings in processors :)
David Newbury: +1
Resolution #3: Move forwards with a JSON native data type, with a warning that it cannot be canonicalized
Dave Longley: JSON literals can be an escape hatch but ONLY an escape hatch. |
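The point made in the meeting about the different spellings of 42 can be checked with Python's standard `json` module. The sketch below parses the four spellings quoted in the transcript and then applies one assumed rule for producing a single lexical form (integral values print without a fraction). That rule is purely illustrative; it is not the ECMAScript number serialization the transcript alludes to, and `number_lexical` is a hypothetical name.

```python
import json

# The spellings mentioned in the transcript.
spellings = ["42", "42.0", "42.0E0", "4.2E+1"]
values = [json.loads(s) for s in spellings]
all_equal = all(v == 42 for v in values)


def number_lexical(n):
    """Assumed rule for illustration only: integral values print without a fraction."""
    if isinstance(n, float) and n.is_integer():
        return str(int(n))
    return json.dumps(n)


lexical_forms = {number_lexical(v) for v in values}

# A leading zero is not valid JSON number syntax, so this parser rejects it.
try:
    json.loads("042")
    leading_zero_ok = True
except json.JSONDecodeError:
    leading_zero_ok = False

print(values, lexical_forms, leading_zero_ok)
```

Once parsed, the four inputs are indistinguishable, so whatever lexical form ends up in the RDF literal is chosen by the serializer rather than by the document author.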
Agree done, closing :) |
@value. [] and {} in framing
Original issue is Support JSON values that aren't mapped #333