This repository has been archived by the owner on Oct 28, 2024. It is now read-only.

Commit

Merge branch 'main' into 292/clarify-serialization
roll authored Mar 14, 2024
2 parents b9e5476 + bcbb2f3 commit 0834e4a
Showing 3 changed files with 94 additions and 102 deletions.
12 changes: 9 additions & 3 deletions content/docs/specifications/data-package.md
@@ -261,9 +261,9 @@ The raw sources for this data package. It `MUST` be an array of Source objects.

##### `contributors`

The people or organizations who contributed to this Data Package. It `MUST` be an array. Each entry is a Contributor and `MUST` be an `object`. A Contributor `MUST` have at least one property. A Contributor is RECOMMENDED to have `title` property and MAY contain `path`, `email`, `role`, and `organization` properties. An example of the object structure is as follows:
The people or organizations who contributed to this Data Package. It `MUST` be an array. Each entry is a Contributor and `MUST` be an `object`. A Contributor `MUST` have at least one property. A Contributor is RECOMMENDED to have a `title` property and MAY contain `givenName`, `familyName`, `path`, `email`, `role`, and `organization` properties. An example of the object structure is as follows:

```javascript
```json
"contributors": [{
"title": "Joe Bloggs",
"email": "[email protected]",
@@ -272,13 +272,19 @@ The people or organizations who contributed to this Data Package. It `MUST` be a
}]
```

- `title`: name/title of the contributor (name for person, name/title of organization)
- `title`: name of the contributor.
- `givenName`: name a person has been given, if the contributor is a person.
- `familyName`: familial name that a person inherits, if the contributor is a person.
- `path`: a fully qualified http URL pointing to a relevant location online for the contributor
- `email`: An email address
- `role`: a string describing the role of the contributor. It's `RECOMMENDED` to be one of: `author`, `publisher`, `maintainer`, `wrangler`, and `contributor`. Defaults to `contributor`.
- Note on semantics: use of the "author" property does not imply that that person was the original creator of the data in the data package, merely that they created and/or maintain the data package. It is common for data packages to "package" up data from elsewhere. The origin of the data can be indicated with the `sources` property - see above.
- `organization`: a string describing the organization this contributor is affiliated to.

References:

- [Citation Style Language](https://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html#name-fields)
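
As a rough illustration of the contributor rules above (not part of the specification), a data consumer might check a single entry along the following lines. The helper name `check_contributor` and the split into errors and warnings are assumptions made for this sketch; only the properties the text or the profile describes as strings are type-checked.

```python
# Roles the specification lists as RECOMMENDED values.
RECOMMENDED_ROLES = {"author", "publisher", "maintainer", "wrangler", "contributor"}

# Properties described as strings in the text above or in the profile.
STRING_PROPERTIES = ("givenName", "familyName", "role", "organization")


def check_contributor(contributor):
    """Return (errors, warnings) for one contributor entry (hypothetical helper)."""
    errors, warnings = [], []
    if not isinstance(contributor, dict):
        return ["a Contributor MUST be an object"], warnings
    if not contributor:
        errors.append("a Contributor MUST have at least one property")
    for name in STRING_PROPERTIES:
        if name in contributor and not isinstance(contributor[name], str):
            errors.append(f"`{name}` should be a string")
    if "title" not in contributor:
        warnings.append("`title` is RECOMMENDED")
    # A missing role defaults to `contributor`.
    role = contributor.get("role", "contributor")
    if isinstance(role, str) and role not in RECOMMENDED_ROLES:
        warnings.append(f"role `{role}` is not one of the RECOMMENDED values")
    return errors, warnings
```

Violations of `MUST` rules are reported as errors, while departures from `RECOMMENDED` guidance only produce warnings, so a contributor with an unusual role is still accepted.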

##### `keywords`

An Array of string keywords to assist users searching for the package in catalogs.
180 changes: 81 additions & 99 deletions content/docs/specifications/table-schema.md
@@ -640,49 +640,31 @@ primary key is equivalent to adding `required: true` to their
The `primaryKey` entry in the schema `object` is optional. If present it specifies
the primary key for this table.

The `primaryKey`, if present, `MUST` be:

- Either: an array of strings with each string corresponding to one of the
field `name` values in the `fields` array (denoting that the primary key is
made up of those fields). It is acceptable to have an array with a single
value (indicating just one field in the primary key). Strictly, order of
values in the array does not matter. However, it is `RECOMMENDED` that one
follow the order the fields in the `fields` has as client applications `MAY`
utilize the order of the primary key list (e.g. in concatenating values
together).
- Or: a single string corresponding to one of the field `name` values in
the `fields` array (indicating that this field is the primary key). Note that
this version corresponds to the array form with a single value (and can be
seen as simply a more convenient way of specifying a single field primary
key).
The `primaryKey`, if present, `MUST` be an array of strings, with each string corresponding to one of the field `name` values in the `fields` array (denoting that the primary key is made up of those fields). It is acceptable to have an array with a single value (indicating just one field in the primary key). Strictly, the order of values in the array does not matter. However, it is `RECOMMENDED` that the values follow the order of the fields in the `fields` array, as client applications `MAY` utilize the order of the primary key list (e.g. in concatenating values together).

Here's an example:

"fields": [
{
"name": "a"
},
...
],
"primaryKey": "a"

Here's an example with an array primary key:
```json
"schema": {
"fields": [
{
"name": "a"
},
{
"name": "b"
},
{
"name": "c"
},
...
],
"primaryKey": ["a", "c"]
}
```

"schema": {
"fields": [
{
"name": "a"
},
{
"name": "b"
},
{
"name": "c"
},
...
],
"primaryKey": ["a", "c"]
}
:::note[Backward Compatibility]
Data consumers MUST support the `primaryKey` property in the form of a single string, e.g. `primaryKey: a`, which was part of `v1.0` of the specification.
:::
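
One way a data consumer could honour both the array form and the `v1.0` single-string form is to normalize the value before using it. The following is a minimal sketch under that assumption; `normalize_primary_key` is a hypothetical helper, not something the specification defines.

```python
def normalize_primary_key(schema):
    """Return the primary key as a list of field names (hypothetical helper)."""
    key = schema.get("primaryKey")
    if key is None:
        return []  # `primaryKey` is optional
    if isinstance(key, str):
        key = [key]  # v1.0 single-string form becomes a one-element array
    if not isinstance(key, list) or not all(isinstance(name, str) for name in key):
        raise ValueError("primaryKey must be an array of strings")
    # Every entry has to match the `name` of a field in `fields`.
    field_names = {field["name"] for field in schema.get("fields", [])}
    unknown = [name for name in key if name not in field_names]
    if unknown:
        raise ValueError(f"primaryKey refers to unknown fields: {unknown}")
    return key
```

After normalization the rest of a consumer only ever has to deal with a list, whichever form the descriptor used.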

### Unique Keys

@@ -735,89 +717,89 @@ They are directly modelled on the concept of foreign keys in SQL.
The `foreignKeys` property, if present, `MUST` be an Array. Each entry in the
array `MUST` be a `foreignKey`. A `foreignKey` `MUST` be an `object` and `MUST` have the following properties:

- `fields` - `fields` is a string or array specifying the
- `fields` - `fields` is an array of strings specifying the
field or fields on this resource that form the source part of the foreign
key. The structure of the string or array is as per `primaryKey` above.
key. The structure of the array is as per `primaryKey` above.
- `reference` - `reference` `MUST` be an `object`. The `object`
- `MUST` have a property `resource` which is the name of the resource within
the current data package (i.e. the data package within which this Table
Schema is located). For self-referencing foreign keys, i.e. references
between fields in this Table Schema, the value of `resource` `MUST` be `""`
(i.e. the empty string).
- `MUST` have a property `fields` which is a string if the outer `fields` is a
string, else an array of the same length as the outer `fields`, describing the
field (or fields) references on the destination resource. The structure of
the string or array is as per `primaryKey` above.
- `MUST` have a property `fields` which is an array of strings of the same length as the outer `fields`, describing the field (or fields) referenced on the destination resource. The structure of the array is as per `primaryKey` above.

Here's an example:

```javascript
// these are resources inside a Data Package
"resources": [
{
"name": "state-codes",
"schema": {
"fields": [
{
"name": "code"
}
]
}
},
{
"name": "population-by-state"
"schema": {
"fields": [
{
"name": "state-code"
}
...
],
"foreignKeys": [
{
"fields": "state-code",
"reference": {
"resource": "state-codes",
"fields": "code"
}
```json
"resources": [
{
"name": "state-codes",
"schema": {
"fields": [
{
"name": "code"
}
]
}
},
{
"name": "population-by-state",
"schema": {
"fields": [
{
"name": "state-code"
}
...
],
"foreignKeys": [
{
"fields": ["state-code"],
"reference": {
"resource": "state-codes",
"fields": ["code"]
}
]
...
}
]
...
```

An example of a self-referencing foreign key:

```javascript
"resources": [
{
"name": "xxx",
"schema": {
"fields": [
{
"name": "parent"
},
{
"name": "id"
}
],
"foreignKeys": [
{
"fields": "parent"
"reference": {
"resource": "",
"fields": "id"
}
```json
"resources": [
{
"name": "xxx",
"schema": {
"fields": [
{
"name": "parent"
},
{
"name": "id"
}
],
"foreignKeys": [
{
"fields": ["parent"],
"reference": {
"resource": "",
"fields": ["id"]
}
]
}
}
]
}
]
}
]
```

**Comment**: Foreign Keys create links between one Table Schema and another Table Schema, and implicitly between the data tables described by those Table Schemas. If the foreign key is referring to another Table Schema how is that other Table Schema discovered? The answer is that a Table Schema will usually be embedded inside some larger descriptor for a dataset, in particular as the schema for a resource in the resources array of a [Data Package][dp]. It is the use of Table Schema in this way that permits a meaningful use of a non-empty `resource` property on the foreign key.

[dp]: http://specs.frictionlessdata.io/data-package/
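
The discovery step described in the comment above could look roughly like this when the Table Schema is embedded in a Data Package descriptor. This is only a sketch: `resolve_reference` is a hypothetical helper, and it assumes the descriptors have already been loaded into plain dictionaries.

```python
def resolve_reference(package, current_resource, foreign_key):
    """Return the resource descriptor a foreign key points at (hypothetical helper)."""
    target = foreign_key["reference"]["resource"]
    if target == "":
        # The empty string marks a self-referencing foreign key.
        return current_resource
    # Otherwise look the name up among the resources of the same data package.
    for resource in package.get("resources", []):
        if resource.get("name") == target:
            return resource
    raise LookupError(f"no resource named {target!r} in this data package")
```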

:::note[Backward Compatibility]
Data consumers MUST support the `foreignKey.fields` and `foreignKey.reference.fields` properties in the form of a single string, e.g. `fields: a`, which was part of `v1.0` of the specification.
:::
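
A consumer that accepts both forms might convert a `foreignKey` to the array form and check that the two `fields` lists have the same length, as required above. The sketch below assumes the descriptor is a plain dictionary; `normalize_foreign_key` is a hypothetical name.

```python
def normalize_foreign_key(foreign_key):
    """Return a copy of a foreignKey with both `fields` as lists (hypothetical helper)."""

    def as_list(value):
        # The v1.0 form allowed a single string instead of an array.
        return [value] if isinstance(value, str) else list(value)

    fields = as_list(foreign_key["fields"])
    reference = foreign_key["reference"]
    reference_fields = as_list(reference["fields"])
    if len(fields) != len(reference_fields):
        raise ValueError("`fields` and `reference.fields` must have the same length")
    return {
        "fields": fields,
        "reference": {
            "resource": reference["resource"],  # "" means a self-reference
            "fields": reference_fields,
        },
    }
```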

## Appendix: Related Work

Table Schema draws content and/or inspiration from, among others, the following specifications and implementations:
4 changes: 4 additions & 0 deletions profiles/dictionary/common.yaml
@@ -200,6 +200,10 @@ contributor:
"$ref": "#/definitions/path"
email:
"$ref": "#/definitions/email"
givenName:
type: string
familyName:
type: string
organization:
title: Organization
description: An organizational affiliation for this contributor.
