What is default value if @base is not defined in the metadata description #91

Closed
6a6d74 opened this issue Dec 4, 2014 · 19 comments

Comments

@6a6d74
Contributor

6a6d74 commented Dec 4, 2014

The metadata vocab doc section 3.2 Top Level Properties talks about base URL - defined by @base.

What is the behaviour expected when @base is not defined? Is the base URL anticipated to be the location from which that metadata description was retrieved? (I think this is the case - but I don't see this in the doc!)

Given the ambiguity, I think it is important that the metadata doc is amended to clarify this 'undefined' case.

@iherman
Member

iherman commented Dec 4, 2014

Not sure. Wouldn't it be more logical to use the reference to the CSV file as base?

Ivan


Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

@6a6d74
Contributor Author

6a6d74 commented Dec 4, 2014

@iherman: for core tabular data using the location of the CSV file as the base URL is our only option. Pretty much the only place that the URLs manifest in a mapping of core tabular data is for the predicates used to relate each row object to its property values (and even then only in the RDF mapping).

However, when a metadata description is supplied, we might have the situation where two independent parties are publishing metadata about the same CSV file. Surely in this case, the predicates used to relate the row object to its property values should reflect the originator of the metadata? (assuming that they didn't already override the name property by stating a predicateURL). It's conceivable that each metadata publisher could have interpreted the columns in a different way - which means that we would need to give them different identifiers.

@iherman
Member

iherman commented Dec 4, 2014

I am still not convinced, and the fact that (as you say) we do not have a choice for core tabular data actually reinforces my view (for the sake of consistency).

We give quite a panoply of tools to generate URIs for the conversion result by using the template mechanism in the metadata. I.e., the metadata author can adapt his/her output the way he/she wants.

I may be missing something, though.

Ivan



@gkellogg
Member

gkellogg commented Dec 4, 2014

I am a bit concerned about such explicit references to @base. In a JSON-LD context, @base is a way to override the base set by the document location. It would be better to say that the base URI of a CSV is that of its metadata; considering JSON-LD parsing, this would allow for a variety of locations where @base could be set, or not set. Note that, if there is no metadata document, this would end up being the location of the CSV itself. The same could be said of @language. This also indicates that @base and @language can be used as properties of the metadata document itself, and not as part of its context.

If we try to stick to the form of JSON-LD, but ignore the algorithms, we're likely to invite such misunderstanding. Perhaps both should have something like the following:

The base URL against which other URLs within the description are resolved is established in [[JSON-LD-API]] so that URL expansion is consistent with Section 6.3 IRI Expansion in [[JSON-LD-API]].

Similar wording for @language could be taken by referencing Section 7.2 Value Expansion.
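
The fallback order described in this comment can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name effective_base and the example.org URLs are hypothetical, and resolving a relative @base against the document location follows my reading of JSON-LD context processing rather than any text agreed in this thread.

```python
from urllib.parse import urljoin


def effective_base(context_base, metadata_url, csv_url):
    """Pick a base URL: an explicit @base wins, then the metadata
    location, then the CSV location when there is no metadata."""
    # With no metadata document, the CSV itself is the document location.
    document_url = metadata_url or csv_url
    if context_base is None:
        return document_url
    # Assumption: a relative @base is resolved against the document location.
    return urljoin(document_url, context_base)


# Hypothetical URLs, for illustration only.
base = effective_base(None,
                      "http://example.org/meta/tree-ops.json",
                      "http://example.org/data/tree-ops.csv")
resolved = urljoin(base, "schema.json")
```

Under this sketch a relative reference such as schema.json found in the metadata resolves next to the metadata file, not next to the CSV file.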

@6a6d74
Contributor Author

6a6d74 commented Dec 4, 2014

@gkellogg: I agree with the sentiment - and I have been trying to use the term base URL in favour of @base. So ...

  • the base URL of an annotated tabular data object is that of its metadata
  • the base URL of a core tabular data object (where there is no metadata) is that of the CSV file

Where the mapping is to plain-old-JSON, relative URIs will need to be expanded out because there is no concept of @base, whereas the RDF mapping can indulge in the extra sophistication it provides.

The wording you supply is a little bit dense for me; I anticipate readers of our spec being none-the-wiser after chewing through that :-)

AFAICT, @language doesn't actually propagate to the tabular data itself ...

  • properties defined in the metadata may have language defined - and where those properties are included in the output graph of a mapping, the language of those properties will come with them.
  • the inherited property language may be set in the metadata and this applies to the cell values of the tabular data.

@gkellogg
Member

gkellogg commented Dec 4, 2014

Setting @language in the context means that properties having a string value, which are not otherwise typed, have that language applied. The extension might be that extracted cell values could be given the same default language treatment, but it may just be better to ignore this, and rely on an explicit csvm:language (or dc:language) definition made at any level within the metadata document. This way we don't conflate the processing algorithms. Currently, this property is defined on Column, but it could also apply to TableGroup and Table and possibly Schema.

@6a6d74
Contributor Author

6a6d74 commented Dec 4, 2014

As an Inherited Property, language may be defined in table group, table, schema or column descriptions (as you say) - but it is applied to the data in a given cell.

All natural language direct annotation properties (including Common Properties - those from external vocabularies) defined in a metadata document can benefit from the multi-lingual goodness of JSON-LD: @language settings and language maps, e.g. objects like:

"occupation":
  {
    "ja": "忍者",
    "en": "Ninja",
    "cs": "Nindža"
  }

(from example 34 in JSON-LD Syntax section 6.5 String Internationalization)
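
As a rough Python illustration of the two cases discussed here, a language map carries its own language tags, while a plain string only picks up a default language. The function name localized and the default_language parameter are hypothetical; this is a sketch, not the JSON-LD value expansion algorithm.

```python
def localized(value, wanted, default_language=None):
    """Return a (text, language) pair for a metadata property value."""
    if isinstance(value, dict):
        # Language map: keys are language tags (KeyError if tag is absent).
        return value[wanted], wanted
    # Plain string: only the default language (e.g. from @language) applies.
    return value, default_language


occupation = {"ja": "忍者", "en": "Ninja", "cs": "Nindža"}
english = localized(occupation, "en")
plain = localized("Ninja", "en", default_language="en")
```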

@6a6d74
Contributor Author

6a6d74 commented Dec 11, 2014

I've thought about this a bit more; I think it makes sense to have the base URL be that of the CSV file. My rationale is that the URLs generated in the document, say, from a row-level urlTemplate talk about the data.

@gkellogg
Member

I agree, establishing the base of the metadata file can be tricky, particularly given merge semantics. Also, it would be odd for several CSV files sharing a metadata file to use the same base URI.

JeniT pushed a commit that referenced this issue Dec 14, 2014
@JeniT

JeniT commented Dec 14, 2014

I've clarified that the base URL for the metadata document in the absence of the @base property in the @context is the location of the metadata document itself.

It's good practice for URLs to be taken as relative to the location where they're found. For example, the URLs of any imported metadata documents should be relative to the original metadata document or it will be incredibly confusing.

It might be that there are particular properties that should be interpreted as URLs relative to the @id of a table description, but I think we need to define them explicitly as doing so. @6a6d74, which properties do you think they are?

@6a6d74
Contributor Author

6a6d74 commented Dec 16, 2014

@JeniT: the main uses of base URL in the mapping are:

  • when converting name to a predicate in RDF mapping
  • when resolving URI template expansions that are relative URLs

There may be more; I can cross-ref the mapping docs at some point ... but that said, I'm fairly happy with the mapping docs asserting that the base URL for the mapping is the CSV file location.

@gkellogg
Member

When processing the metadata, relative URLs should certainly be considered relative to the metadata document. This should not be affected by merging, as merging happens into the selected metadata document (user-specified metadata and data from the CSV file itself).

However, when evaluating templates which are relative URLs, might it not be the case that the resulting URL should be relative to the CSV file?

@JeniT

JeniT commented Feb 4, 2015

I think in many cases the propertyUrl will want to be resolved relative to the location of the document that it's located in (which might be a separate schema file that is shared across multiple files).

I suggest that, to handle the requirement for URI templates to sometimes be URLs that are relative to the processed CSV file, we have another special variable, e.g. _tableUrl, which is the URL of the table (from the url property as it is now). Then you can have things like {_tableUrl}#row={_row} if you want to generate URLs like that to identify rows.
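
A minimal Python sketch of how such a template might expand for one row. It assumes plain textual substitution of {name} placeholders and deliberately ignores the percent-encoding rules of RFC 6570; the expand helper and the example URL are hypothetical.

```python
import re


def expand(template, variables):
    """Replace each {name} with its value; unknown names become ''."""
    return re.sub(r"\{(\w+)\}",
                  lambda m: str(variables.get(m.group(1), "")),
                  template)


# _tableUrl and _row are the special variables proposed in this comment.
row_url = expand("{_tableUrl}#row={_row}",
                 {"_tableUrl": "http://example.org/data/tree-ops.csv",
                  "_row": 2})
```

Because _tableUrl is already absolute, the result does not depend on where the template itself was written, which is the point of the proposal.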

@iherman
Member

iherman commented Feb 4, 2015

Works for me.

Ivan


@gkellogg
Member

gkellogg commented Feb 4, 2015

In some of the work I've been doing, the default identifier for the Table metadata comes from url with #table. Specifying an @id for the Table overrides this. This might just be resolved by establishing that default and using the Table @id to resolve relative URLs. This would also imply that URI templates be localized when merging by expanding PNames and joining to the table @id during merge. This is similar to how @language is fixed when merging.
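
The default described in this comment could look like the following in Python. This only mirrors the working convention mentioned above, not agreed spec text, and the function name table_id is hypothetical.

```python
def table_id(table_description):
    """Return the table identifier: an explicit @id overrides the
    default of the table url with '#table' appended."""
    explicit = table_description.get("@id")
    if explicit is not None:
        return explicit
    return table_description["url"] + "#table"


# Hypothetical table description, for illustration only.
default_id = table_id({"url": "http://example.org/data/tree-ops.csv"})
```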

@JeniT

JeniT commented Feb 4, 2015

Relates to #106

@JeniT

JeniT commented Feb 13, 2015

Resolved at Feb F2F. Link properties are resolved against the base URL (either the @base from the context or the location of the metadata file) during normalization of the metadata file, and prior to merge. URL template properties are expanded (variable references replaced by values from a particular row) into a URL, which is then expanded to resolve prefixes and resolved against a base URL which is the absolute, resolved table url.
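
The two resolution paths in this resolution can be sketched in Python roughly as follows. The function names, the single-entry prefix table, and the example.org URLs are all assumptions made for illustration, and the template expansion is plain substitution rather than full RFC 6570.

```python
import re
from urllib.parse import urljoin

# Illustrative prefix table; a real processor would know many more.
PREFIXES = {"dc": "http://purl.org/dc/terms/"}


def normalize_link(value, metadata_url, context_base=None):
    """Link property: resolved during normalization, before any merge,
    against @base if given, else the metadata file location."""
    base = urljoin(metadata_url, context_base) if context_base else metadata_url
    return urljoin(base, value)


def expand_url_template(template, row_values, table_url):
    """URL template property: substitute row values, then resolve a
    known prefix, otherwise resolve against the absolute table url."""
    expanded = re.sub(r"\{(\w+)\}",
                      lambda m: str(row_values.get(m.group(1), "")),
                      template)
    prefix, sep, local = expanded.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return urljoin(table_url, expanded)


table_url = normalize_link("../data/tree-ops.csv",
                           "http://example.org/meta/metadata.json")
row_url = expand_url_template("#row-{_row}", {"_row": 3}, table_url)
```

Here the link property is resolved relative to the metadata file, while the row URL ends up relative to the CSV file, matching the split agreed above.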

@gkellogg
Member

Looking at the text, I believe this action can be closed. We define base URL as either the value of @base or the metadata location. As link properties are resolved prior to merge, there is no need to define the metadata location of merged metadata. @JeniT, do you concur?

@gkellogg gkellogg assigned JeniT and unassigned gkellogg Feb 18, 2015
@JeniT

JeniT commented Feb 18, 2015

Yes, this is done.

@JeniT JeniT closed this as completed Feb 18, 2015