Skip to content

Latest commit

 

History

History
72 lines (58 loc) · 2.35 KB

README.pioneer.md

File metadata and controls

72 lines (58 loc) · 2.35 KB

About Pioneer Schemas

Pioneer studies send encrypted data, which is handled separately by the pipeline, and is stored apart from the main corpus of data with highly restricted access controls.

The special handling of Pioneer data requires a slightly different approach compared with other schemas.

The namespace should always be prefixed with pioneer-. This identifies data for the special handling as described above.

The schema should describe the structure of the encrypted inner content, rather than the raw incoming ping.

The raw incoming pings are described by the pioneer-study schema (example document). Specifically, pioneer payloads are expected to be of the form:

{
 "encryptedData": "<encrypted data>",
 "encryptionKeyId": "<pioneer key id>",
 "pioneerId": "<UUID>",
 "studyName": "<[email protected]>",
 "schemaNamespace": "<namespace>",
 "schemaName": "<docType>",
 "schemaVersion": <docVersion>
}

Where the value of the "encryptedData" key will be decrypted upon ingestion, then validated against the schema specified by <namespace>, <docType> and <docVersion>.

Imagine some pioneer data with namespace of pioneer-lorem, docType of ipsum, and a schema version of 1. If the data looked like this:

{
	"foo": 1,
	"bar": true
}

The resulting schema would be added as a template in templates/pioneer-lorem/ipsum.1.schema.json, and rendered at schemas/pioneer-lorem/ipsum.1.schema.json and would have the following contents:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "foo": {
      "type": "integer"
    },
    "bar": {
      "type": "boolean"
    }
  },
  "additionalProperties": false
}

When submitted, the raw incoming ping would look like:

{
  ... outer document structure
  "payload": {
   "encryptedData": "eyJhb...rrsAQ",
   "encryptionKeyId": "pioneer-20170905",
   "pioneerId": "1076d9e9-152a-465d-85bf-d3ac056beb8d",
   "studyName": "[email protected]",
   "schemaNamespace": "pioneer-lorem",
   "schemaName": "ipsum",
   "schemaVersion": 1
  }
  ...
}

The remainder of the standard practices for adding schemas apply.