Base direction #165

gkellogg · 2019-10-07T00:14:48Z

gkellogg · 2019-10-07T23:09:36Z

I have all the algorithm updates except for to/from RDF (a bit of fromRdf is in).

Changes to term selection, in particular, but also value expansion and compaction are fairly major, and really need a thorough review. But, it is based on updates to my implementation, so I believe the core logic is sound.

rubensworks

Two minor comments on the tests:

Some more tests to check invalid language tags and directions may be valuable (separately, and in combination),
such as https://w3.org/ns/i18n#en-us_abc, https://w3.org/ns/i18n#a b_rtl and https://w3.org/ns/i18n#a b_abc.

Minor typo in several test manifests "datatiype" -> "datatype"

pchampin

I skimmed the parts concerning language maps, as I'm not familiar enough with this part of the algorithm to really appreciate the impact of the change.

In the "Value compaction" algorithm, I find steps 7-8-9 very verbose (and yet still incomplete, see my comments). May be I'll make a counter proposal at some point if I find a simpler way to express them.

But all in all, looks good. Thanks for this work.

index.html

pchampin · 2019-10-08T21:34:44Z

index.html

-          <var>active property</var>, return the value associated with the
+        <li class="changed">Otherwise, if <var>value</var> has both `@language` and `@direction <a>entries</a>,
+          and the `@language` <a>entry</a> matches either the <a>language mapping</a> of
+          <var>active property</var> or the <a>default language</a> of <var>active context</var>


shouldn't that be

it matches the language mapping of the active property OR

the language mapping of the active property is null and it matches the default language of the active context
?

Same for the the direction?

I think this was resolved in subsequent changes, can you verify?

index.html

gkellogg · 2019-10-08T21:55:20Z

@rubensworks Updated your typo. Can you suggest some specific tests? I can see https://w3.org/ns/i18n#en-us_abc, but https://w3.org/ns/i18n#a b_rtl and others aren't valid IRIs.

I could see combinations of invalid language tags and invalid base directions; not sure if we are checking language tags against BCP47 in any tests, actually. We'd also need to check when converting from RDF.

I could see the following combinations, used in both expansion tests, toRdf tests, and fromRdf tests where decoding both @langugage, and https://w3.org/ns/i18n# and rdf:language/rdf:direction:

language tags: a+b, a.b, en-1, etc.
base direction a, foo, ...

But, I'm not sure we should be putting any real requirements for implementations to completely validate BCP47 other than some obvious problems like illegal characters ('+', '!', '~', %). But, it's a slippery slope.

Co-Authored-By: Pierre-Antoine Champin <[email protected]>

gkellogg · 2019-10-08T23:10:14Z

In the "Value compaction" algorithm, I find steps 7-8-9 very verbose (and yet still incomplete, see my comments). May be I'll make a counter proposal at some point if I find a simpler way to express them.

Yes, it is pretty bad. I think this is a case where we go away from precise algorithm to describing the intentions:

Compact to a string, if any values of @language or @direction match those from the context, considering specific values in the term definition or defaults in the context and if the expanded value is not missing values for @language or @direction that should be there based on the term definition and defaults in the context.

Or, words to that effect.

gkellogg · 2019-10-09T00:23:52Z

In the "Value compaction" algorithm, I find steps 7-8-9 very verbose (and yet still incomplete, see my comments). May be I'll make a counter proposal at some point if I find a simpler way to express them.

I simplified value compaction, but there may be some corner cases, so I need to update my implementation to see if this handles everything, but it should read much better.

…ult language_) to just use _language mapping_ or _default language_, as there is no encoding.

index.html

dlongley

Wow! Lots of changes here. That being said, they seem ok to me (modulo some comments I left). I'm obviously saying this without having implemented.

Also, I'm not a fan of the compound literal approach and would probably like to see DB's implementations only support the i18n type approach if we support anything other than dropping @direction to start when going to RDF. We don't want the technical debt of the compound literal approach so may not support it unless there's clear uptake there. Not sure if anything additional here needs to be marked at risk.

index.html

gkellogg · 2019-10-09T17:53:48Z

Wow! Lots of changes here. That being said, they seem ok to me (modulo some comments I left). I'm obviously saying this without having implemented.

Also, I'm not a fan of the compound literal approach and would probably like to see DB's implementations only support the i18n type approach if we support anything other than dropping @direction to start when going to RDF. We don't want the technical debt of the compound literal approach so may not support it unless there's clear uptake there. Not sure if anything additional here needs to be marked at risk.

We should probably add at risk to both of these algorithms here, and in the syntax document, so they don't come back to bite us if there's not sufficient implementation support, or the community otherwise rejects them.

Looking at the implementation, the compound literal mechanism leverages off of the list support, but is certainly much more heavyweight. I wouldn't be sad if we settled to just use the i18n-datatype mechanism.

Co-Authored-By: Dave Longley <[email protected]>

index.html

Co-Authored-By: David I. Lehn <[email protected]>

…nd the presense of `@index`.

iherman · 2019-10-10T06:50:21Z

We should probably add at risk to both of these algorithms here, and in the syntax document, so they don't come back to bite us if there's not sufficient implementation support, or the community otherwise rejects them.

While I agree with this, I wonder whether, at this point, we should not consider these as non-normative, with a note that future versions of JSON-LD would settle on a normative version (or versions) depending on the community's reaction. I am not sure about it, but I still feel a bit uneasy about this WG suggesting normatively a solution in a space that goes beyond JSON-LD proper.

iherman · 2019-10-10T06:52:03Z

On the ...i18n#lt_dir option, although the specification itself is correct, we may want to emphasize (e.g., in a note) that the lt part may be empty, i.e., an ...i18n#_ltr is a correct datatype.

rubensworks · 2019-10-10T13:21:37Z

@gkellogg

Can you suggest some specific tests? I can see https://w3.org/ns/i18n#en-us_abc, but https://w3.org/ns/i18n#a b_rtl and others aren't valid IRIs.

Indeed, it was my goal to make them invalid IRIs :-) (they would definitely be applicable for the toRdf tests)
But you're right that if we go this route, then we should probably (also?) include valid IRIs, but invalid language tags, as you suggested below.

I could see combinations of invalid language tags and invalid base directions; not sure if we are checking language tags against BCP47 in any tests, actually. We'd also need to check when converting from RDF.

There is indeed one test that checks the validity of language tags.

I could see the following combinations, used in both expansion tests, toRdf tests, and fromRdf tests where decoding both @langugage, and https://w3.org/ns/i18n# and rdf:language/rdf:direction:

language tags: a+b, a.b, en-1, etc.
base direction a, foo, ...

👍

But, I'm not sure we should be putting any real requirements for implementations to completely validate BCP47 other than some obvious problems like illegal characters ('+', '!', '~', %). But, it's a slippery slope.

We do it for IRIs, so in that sense it also makes sense to do it for BCP47. I don't have a strong opinion on this though, so perhaps this can be left open for a future JSON-LD version.
But if we decide to do proper BCP47 validation, then I'd be happy to contribute some tests for this.

gkellogg · 2019-10-10T16:01:00Z

For the tests to work, all IRIs need to be parsable as N-Quads, so we can't effectively test truly illegal IRIs there.

The issue of whether we should validate language tags should be discussed on a call, as the current text only says that processors MAY warn on invalid language tags.

…hm steps in from/to RDF.

gkellogg · 2019-10-10T21:03:39Z

Indeed, it was my goal to make them invalid IRIs :-) (they would definitely be applicable for the toRdf tests)
But you're right that if we go this route, then we should probably (also?) include valid IRIs, but invalid language tags, as you suggested below.

As I noted, using invalid IRIs is problematic, as you can't parse an N-Quads document with an invalid IRI.

I could see combinations of invalid language tags and invalid base directions; not sure if we are checking language tags against BCP47 in any tests, actually. We'd also need to check when converting from RDF.

There is indeed one test that checks the validity of language tags.

I could see the following combinations, used in both expansion tests, toRdf tests, and fromRdf tests where decoding both @langugage, and https://w3.org/ns/i18n# and rdf:language/rdf:direction:

language tags: a+b, a.b, en-1, etc.
base direction a, foo, ...

👍

But, I'm not sure we should be putting any real requirements for implementations to completely validate BCP47 other than some obvious problems like illegal characters ('+', '!', '~', %). But, it's a slippery slope.

We do it for IRIs, so in that sense it also makes sense to do it for BCP47. I don't have a strong opinion on this though, so perhaps this can be left open for a future JSON-LD version.
But if we decide to do proper BCP47 validation, then I'd be happy to contribute some tests for this.

As it happens, [RDF Concepts[(https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal) does require that language tags be well formed according to BCP47:

... a non-empty language tag as defined by [BCP47]. The language tag *must be well-formed according to section 2.2.9 of [BCP47].

The other RDF syntaxes use something like the following to validate language tags:

[144s] LANGTAG                        ::= "@" [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )*

I could see us changing from a MAY validate language tags to MUST be valid either to [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )*, or the more strict ABNF from BCP47:

   obs-language-tag = primary-subtag *( "-" subtag )
   primary-subtag   = 1*8ALPHA
   subtag           = 1*8(ALPHA / DIGIT)

This should probably be done in context processing, expansion, and fromRDF algorithms to be consistent. The RDF Concepts regex is probably the one to use, as we're in that family.

I created issue #167 for this.

gkellogg · 2019-10-10T21:06:21Z

I'm going to merge this, as all review comments have been cleared, and the remaining issue of validating language tags has been raised in #167.

pchampin

I just reviewed the "Value compaction" allogrithm (I think this was the most signigicant change since my last review). It looks much better now 👍

index.html

pchampin · 2019-10-10T19:29:56Z

index.html

-            <li>Otherwise, if <var>number entries</var> is <code>1</code> and
-              the <a>type mapping</a> of <var>active property</var>
-              is set to <code>@vocab</code>, return the result of using the
+            <li>Otherwise, the <a>type mapping</a> of <var>active property</var>


Suggested change

<li>Otherwise, the <a>type mapping</a> of <var>active property</var>

<li>Otherwise, if the <a>type mapping</a> of <var>active property</var>

index.html

pchampin · 2019-10-11T06:45:44Z

index.html

          </ol>
        </li>
        <li>Otherwise, if <var>value</var> has an <code>@type</code> <a>entry</a> whose
          value matches the <a>type mapping</a> of <var>active property</var>,
-          return the value associated with the <code>@value</code> <a>entry</a>
+          set <var>result</var> to the value associated with the <code>@value</code> <a>entry</a>


Same as above

Suggested change

set <var>result</var> to the value associated with the <code>@value</code> <a>entry</a>

return the value associated with the <code>@value</code> <a>entry</a>

pchampin · 2019-10-11T06:53:39Z

index.html

+          </ol>
+        </li>
+        <li>If the value of `@language` in <var>item</var> exactly matches <var>language</var>,
+          if it is not `null`, or is not present, if it is `null`,


I think this would be clearer

Suggested change

if it is not `null`, or is not present, if it is `null`,

if it is not `null`, or is not present, if <var>language</var> is `null`,

pchampin · 2019-10-11T06:53:49Z

index.html

+        <li>If the value of `@language` in <var>item</var> exactly matches <var>language</var>,
+          if it is not `null`, or is not present, if it is `null`,
+          <span class="changed">and the value of `@direction` in <var>item</var> exactly matches <var>direction</var>,
+            if it is not `null`, or is not present, if it is `null`</span>:


I think this would be clearer

Suggested change

if it is not `null`, or is not present, if it is `null`</span>:

if it is not `null`, or is not present, if <var>direction</var> is `null`</span>:

pchampin · 2019-10-11T06:53:57Z

index.html

+              or <var>value</var> has no `@index` <a>entry</a>,
+              set <var>result</var> to the value associated with the `@value` <a>entry</a>.</li>
+          </ol>
+          return <var>value</var>.</li>


This seems to be a remains of a previous version

Suggested change

return <var>value</var>.</li>

It is now "return result".

gkellogg added 4 commits October 6, 2019 17:12

Tests for base direction.

1ea35e4

Updates to context processing for base direction.

920fc26

Test that value objects with @type don't include either @direction.

3a72c83

More algorithm updates.

c8ee705

gkellogg requested review from dlongley and pchampin and removed request for dlongley October 7, 2019 23:08

gkellogg requested review from dlongley, davidlehn and rubensworks October 8, 2019 05:59

rubensworks suggested changes Oct 8, 2019

View reviewed changes

Typo in manifests.

46bb442

pchampin approved these changes Oct 8, 2019

View reviewed changes

gkellogg and others added 4 commits October 8, 2019 14:57

Fix HTML markup issues.

ebeda8b

Add fromRdf processing for compound-literal.

07a96b3

Add direction processing to the toRdf algorithm.

66293ae

Apply suggestions from @pchampin code review

c625875

Co-Authored-By: Pierre-Antoine Champin <[email protected]>

Reworked value compaction. (May still need some tweaks).

86bf8e1

gkellogg requested a review from pchampin October 9, 2019 00:23

gkellogg marked this pull request as ready for review October 9, 2019 00:24

Simplify the use of "_language tag_ from _language mapping_ (or _defa…

2b934ab

…ult language_) to just use _language mapping_ or _default language_, as there is no encoding.

dlongley reviewed Oct 9, 2019

View reviewed changes

index.html Show resolved Hide resolved

Simplify Value Expansion.

bb577c8

dlongley reviewed Oct 9, 2019

View reviewed changes

index.html Outdated Show resolved Hide resolved

index.html Outdated Show resolved Hide resolved

index.html Outdated Show resolved Hide resolved

index.html Outdated Show resolved Hide resolved

index.html Show resolved Hide resolved

index.html Outdated Show resolved Hide resolved

Apply suggestions from @dlongley code review

9463930

Co-Authored-By: Dave Longley <[email protected]>

davidlehn reviewed Oct 9, 2019

View reviewed changes

index.html Show resolved Hide resolved

davidlehn reviewed Oct 9, 2019

View reviewed changes

index.html Outdated Show resolved Hide resolved

dlongley reviewed Oct 9, 2019

View reviewed changes

index.html Outdated Show resolved Hide resolved

gkellogg and others added 4 commits October 9, 2019 14:42

Update index.html

3b6ed65

Co-Authored-By: David I. Lehn <[email protected]>

Another update to value compaction to properly manage data indexing a…

c1fe11b

…nd the presense of `@index`.

Some other fixes suggested by @davidlehn.

00f769d

Update changes to note addition of @direction.

6fd79c7

gkellogg added 2 commits October 10, 2019 13:25

Add at-risk issue for both i18n-datatype and compound-literal algorit…

b63dd8b

…hm steps in from/to RDF.

Add note on legitimacy of https://w3.org/ns/i18n#_ltr`.

b39ecba

gkellogg merged commit 00c32db into master Oct 10, 2019

gkellogg deleted the base-direction branch October 10, 2019 21:06

pchampin approved these changes Oct 11, 2019

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Base direction #165

Base direction #165

gkellogg commented Oct 7, 2019 •

edited by pr-preview bot

Loading

gkellogg commented Oct 7, 2019

rubensworks left a comment

pchampin left a comment

pchampin Oct 8, 2019

gkellogg Oct 9, 2019

gkellogg commented Oct 8, 2019

gkellogg commented Oct 8, 2019

gkellogg commented Oct 9, 2019

dlongley left a comment

gkellogg commented Oct 9, 2019

iherman commented Oct 10, 2019

iherman commented Oct 10, 2019

rubensworks commented Oct 10, 2019

gkellogg commented Oct 10, 2019

gkellogg commented Oct 10, 2019

gkellogg commented Oct 10, 2019

pchampin left a comment

pchampin Oct 10, 2019

pchampin Oct 11, 2019

pchampin Oct 11, 2019

pchampin Oct 11, 2019

pchampin Oct 11, 2019

gkellogg Oct 12, 2019

	<li>Otherwise, the <a>type mapping</a> of <var>active property</var>
	<li>Otherwise, if the <a>type mapping</a> of <var>active property</var>

	set <var>result</var> to the value associated with the <code>@value</code> <a>entry</a>
	return the value associated with the <code>@value</code> <a>entry</a>

	if it is not `null`, or is not present, if it is `null`,
	if it is not `null`, or is not present, if <var>language</var> is `null`,

Base direction #165

Base direction #165

Conversation

gkellogg commented Oct 7, 2019 • edited by pr-preview bot Loading

gkellogg commented Oct 7, 2019

rubensworks left a comment

Choose a reason for hiding this comment

pchampin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gkellogg commented Oct 8, 2019

gkellogg commented Oct 8, 2019

gkellogg commented Oct 9, 2019

dlongley left a comment

Choose a reason for hiding this comment

gkellogg commented Oct 9, 2019

iherman commented Oct 10, 2019

iherman commented Oct 10, 2019

rubensworks commented Oct 10, 2019

gkellogg commented Oct 10, 2019

gkellogg commented Oct 10, 2019

gkellogg commented Oct 10, 2019

pchampin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gkellogg commented Oct 7, 2019 •

edited by pr-preview bot

Loading