Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No spaces #171

Merged
merged 5 commits into from
Mar 7, 2017
Merged

No spaces #171

merged 5 commits into from
Mar 7, 2017

Conversation

timbl
Copy link
Member

@timbl timbl commented Feb 25, 2017

URIs cannot contain spaces (unless they are encoded as %20)
This has always been the case, but recently we have made it a run-time error to try to generate a named node with them, so various bugs have been surfaced. This PR

  • fixes two places where spaces were ending in URIs, eg from link headers

  • upgrade the code in serializer.js to pass standard #technicalcredit

@@ -516,7 +516,8 @@ var Fetcher = function Fetcher (store, timeout, async) {
} else {
// See https://www.iana.org/assignments/link-relations/link-relations.xml
// Alas not yet in RDF yet for each predicate
predicate = kb.sym(Uri.join(rel, 'http://www.iana.org/assignments/link-relations/'))
/// encode space in e.g. rel="shortcut icon"
predicate = kb.sym(Uri.join(encodeURIComponent(rel), 'http://www.iana.org/assignments/link-relations/'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at http://www.iana.org/assignments/link-relations/ and didn't see any relations with spaces in them. Why is this necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) when theer could be and you don't want your code to crash and
b) See the comment ... e.g. rel="shortcut icon" was found in real life

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm for a), but isn't b) misusing the iana link relations vocabulary? Furthermore http://www.iana.org/assignments/link-relations/shortcut%20icon dereferences to a 404.

Copy link
Contributor

@dan-f dan-f Feb 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that NamedNodes should url encode spaces in their values before serialization. But code that constructs named nodes should be providing URIs which resolve to real data.

Separately, I think we should have NamedNode.toString do the url encoding rather than have callers pass in already-encoded urls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The way things evolve in things like HTML tags and link relations and HTTP headers is always a mixture of standards first and code first.
  2. yes, and IANA should by linked data principles should provide RDF at the end of the link. This is a longstanding issue over years. (That's why the classes derived from mime types I code to be in W3c space.
    A to-do is to populate that space in fact with triples the triples to go int it, they need t e put in files )
  3. Another thing we cna do is make a mirror of what we sould tlike they cto have, and have a list of things whih are not maintained well with pointers to where pothers mainatin a better copy -- just for core things like tis

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the background 😄

What do you think about having the url encoding done during NamedNode serialization rather than during the construction of NamedNodes?

Copy link
Member Author

@timbl timbl Feb 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because any string which is not a URL is neither here nor there, not a well-defined thing. You can only compare URLs. Things which random code in places converts into URLs, like fileames, people names, titles of documents, HRRP headers, link relations, whatever.... there is no standard for them, they only have well-defined properties in their own defined spaces like file system etc etc.
When they all come together as URIs they are part of a single universal namespace.

@@ -34,20 +34,20 @@ var Serializer = function () {
/* pass */
}

__Serializer.prototype.setBase = function (base) { this.base = base ; return this}
__Serializer.prototype.setBase = function (base) { this.base = base; return this }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this code generated by a transpiler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to remember whether it was python or coffee - -maybe this was when coffee was the thing... .. as I don't see any pyjslib calls.

p = 'n' // Otherwise the loop below may never termimnate
}
for (var i = 0;; i++) if (canUse(p.slice(0, 3) + i)) return pok
for (var j = 0; ; j++) if (canUse(p.slice(0, 3) + j)) return pok
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop is pretty opaque

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Yes, I re-coded it, as (canUse(p.slice(0, 3) + i)) return p.slice(0, 3) + i) etc

@@ -197,8 +197,8 @@ var Serializer = function () {
// @param force, "we know this is a root, do it anyway. It isn't a loop."
function dummyObjectTree(obj, subjects, rootsHash, force) {
// dump('dummyObjectTree('+obj+'...)\n')
if (obj.termType == 'BlankNode' && (subjects[sz.toStr(obj)] &&
(force || (rootsHash[obj.toNT()] == undefined )))) {// and there are statements
if (obj.termType === 'BlankNode' && (subjects[sz.toStr(obj)] &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just dead code? If so, it should be deleted. If not do we have a ticket to do something with it?

@dmitrizagidulin
Copy link
Contributor

👍 Nice work on the serializer!

Tim Berners-Lee added 2 commits February 27, 2017 12:18
…ndard processing). Add a test t10 with more than a few triples in one file! many more tests needed
@dan-f
Copy link
Contributor

dan-f commented Mar 7, 2017

👍 Thanks @timbl for reminding me that it shouldn't be possible to create a NamedNode from an invalid URI in the first place.

@timbl timbl merged commit aafbc6b into master Mar 7, 2017
@deiu deiu removed the in progress label Mar 7, 2017
@timbl timbl deleted the no-spaces branch March 7, 2017 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants