This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Replace most of README with links to documentation (close #2)
adatzer committed Aug 25, 2022
1 parent 8be8ed9 commit 02e69fc
Showing 1 changed file with 40 additions and 21 deletions.
61 changes: 40 additions & 21 deletions README.md
@@ -1,6 +1,7 @@
# example-schema-registry

[![License][license-image]][license]
[![Discourse posts][discourse-image]][discourse]

## Overview

@@ -30,7 +31,7 @@ In order to start sending a new event or context type into Snowplow, you first n

Note that if you have JSON data already and you want to create a corresponding schema, you can do so using the [Schema Guru CLI][schema-guru-github] tool.

Once you have your schema, make sure to validate it using [igluctl]:
Once you have your schema, make sure to validate it using [Igluctl][igluctl]:

```
$ /path/to/igluctl lint /path/to/schemas/com.mycompany/my_new_event_or_context
@@ -109,9 +110,9 @@ Note that you also can pass credentials via configuration file or environment va

Useful resources

* [Iglu schema repository 0.1.0 release blog post](http://snowplowanalytics.com/blog/2014/07/01/iglu-schema-repository-released/)
* [Iglu central](https://github.com/snowplow/iglu-central) - centralized registry for all the schemas hosted by the Snowplow team
* [Iglu](https://github.com/snowplow/iglu) - respository with both Iglu server and client libraries
* [Iglu schema repository 0.1.0 release blog post][schema-repo-blog]
* [Iglu central][iglu-central] - centralized registry for all the schemas hosted by the Snowplow team
* [Iglu][iglu] - umbrella repository for the Iglu ecosystem


## 3. Creating the JSON Path files and SQL table definitions
@@ -191,7 +192,7 @@ In both cases (custom unstructured events and contexts), the data is sent in as

For more detail, please see the technical documentation for the specific tracker you're implementing.

Note: we recommend testing that the data you're sending into Snowplow conforms to the schemas you've defined and uploaded into Iglu, before pushing updates into production. This [online JSON schema validator](http://www.jsonschemavalidator.net/) is a very useful resource for doing so.
Note: we recommend testing that the data you're sending into Snowplow conforms to the schemas you've defined and uploaded into Iglu, before pushing updates into production. This [online JSON schema validator][schema-validator] is a very useful resource for doing so.

We also recommend testing that the events are sent successfully using Snowplow-Mini. You do this by configuring the collector in the tracker to `$SNOWPLOW_MINI_IP` and then logging onto `http://$SNOWPLOW_MINI_IP/home/` to review the results e.g. in Kibana. (Follow the links on the page.)

@@ -201,30 +202,30 @@ When you use Snowplow, the schema for each event and context lives with the data

If you want to change your schema over time, you will need to:

1. Create a new jsonschema file. Depending on how different this is to your current version, you will need to give it the appropriate version number. The [SchemaVer][schema-ver] specification we use when versioning data schemas can be found [here][schema-ver]
1. Create a new jsonschema file. Depending on how different this is to your current version, you will need to give it the appropriate version number. The [SchemaVer][schema-ver-blog] specification we use when versioning data schemas can be found [here][semver]
2. Rerun `igluctl static generate --with-json-paths` to update the jsonpath file, sql table definition and create a SQL migration script. Note that you'll need to add the `--force` flag to ensure that the updated jsonpath and sql table definition files overwrite the existing files.
3. Start sending data into Snowplow using the new schema version (i.e. update the Iglu reference to point at the new version e.g. `2-0-0` or `1-0-1` rather than `1-0-0`). Note that you will continue to be able to send in data that conforms to the old schema at the same time. In the event that you have an event with two different major schema definitions, each event version will be loaded into a different Redshift table
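
The version bump in steps 1 and 3 can be sketched in Python. This is an illustrative helper, not part of igluctl or any Snowplow library: the `SchemaVer` class, its `bump` method and the `iglu_uri` function are hypothetical names, and the sketch assumes the `MODEL-REVISION-ADDITION` layout from the SchemaVer specification and the `iglu:vendor/name/format/version` reference layout, using the vendor and schema name from the lint example as placeholders.

```python
import re
from typing import NamedTuple


class SchemaVer(NamedTuple):
    """Hypothetical holder for a SchemaVer version (MODEL-REVISION-ADDITION)."""

    model: int
    revision: int
    addition: int

    @classmethod
    def parse(cls, text: str) -> "SchemaVer":
        # SchemaVer separates its three parts with hyphens, e.g. "1-0-0".
        match = re.fullmatch(r"([0-9]+)-([0-9]+)-([0-9]+)", text)
        if match is None:
            raise ValueError(f"not a SchemaVer string: {text!r}")
        return cls(*(int(part) for part in match.groups()))

    def bump(self, kind: str) -> "SchemaVer":
        # A higher-order bump resets the lower-order parts to zero.
        if kind == "model":
            return SchemaVer(self.model + 1, 0, 0)
        if kind == "revision":
            return SchemaVer(self.model, self.revision + 1, 0)
        if kind == "addition":
            return SchemaVer(self.model, self.revision, self.addition + 1)
        raise ValueError(f"unknown bump kind: {kind!r}")

    def __str__(self) -> str:
        return f"{self.model}-{self.revision}-{self.addition}"


def iglu_uri(vendor: str, name: str, version: SchemaVer) -> str:
    """Build an Iglu reference for a jsonschema-format schema (assumed layout)."""
    return f"iglu:{vendor}/{name}/jsonschema/{version}"


if __name__ == "__main__":
    current = SchemaVer.parse("1-0-0")
    # Placeholder vendor and name taken from the lint example path.
    print(iglu_uri("com.mycompany", "my_new_event_or_context", current.bump("addition")))
    print(iglu_uri("com.mycompany", "my_new_event_or_context", current.bump("model")))
```

Starting from `1-0-0`, an `addition` bump gives `1-0-1` and a `model` bump gives `2-0-0`, matching the two example references in step 3.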

## Additional resources

Documentation on jsonschemas:

* Other example jsonschemas can be found in [Iglu Central](https://github.com/snowplow/iglu-central/tree/master/schemas). Note how schemas are namespaced in different folders
* [Schema Guru][schema-guru-github] is a [command line tool][schema-guru-github] for programmatically generating schemas from existing JSON data
* [Snowplow 0.9.5 release blog post](http://snowplowanalytics.com/blog/2014/07/09/snowplow-0.9.5-released-with-json-validation-shredding/), which gives an overview of the way that Snowplow uses jsonschemas to process, validate and shred unstructured event and custom context JSONs
* It can be useful to test jsonschemas using online validators e.g. [this one](http://jsonschemalint.com/draft4/)
* [json-schema.org](http://json-schema.org/) contains links to the actual jsonschema specification, examples and guide for schema authors
* The original specification for self-describing JSONs, produced by the Snowplow team, can be found [here](http://snowplowanalytics.com/blog/2014/05/15/introducing-self-describing-jsons/)
* Other example jsonschemas can be found in [Iglu Central][iglu-central]. Note how schemas are namespaced in different folders
* [Schema Guru][schema-guru-github] is a command line tool for programmatically generating schemas from existing JSON data
* [Snowplow 0.9.5 release blog post][versioning-release-blog], which gives an overview of the way that Snowplow uses jsonschemas to process, validate and shred unstructured event and custom context JSONs
* It can be useful to test jsonschemas using online validators e.g. [this one][schema-validator]
* [json-schema.org][json-schema] contains links to the actual jsonschema specification, examples and guide for schema authors
* The original specification for self-describing JSONs, produced by the Snowplow team, can be found [here][self-desc-blog]

Documentation on jsonpaths:

* Example jsonpath files can be found in [Iglu central](https://github.com/snowplow/iglu-central/tree/master/jsonpaths). Note that the corresponding jsonschema definitions are also stored in [Iglu central](https://github.com/snowplow/iglu-central/tree/master/schemas)
* Amazon documentation on jsonpath files can be found [here](http://docs.aws.amazon.com/redshift/latest/dg/copy-usage_notes-copy-from-json.html)
* Example jsonpath files can be found in [Iglu central][iglu-central-jsonpaths]. Note that the corresponding jsonschema definitions are also stored in [Iglu central][iglu-central-schemas].
* Amazon documentation on jsonpath files can be found [here][aws-copy-json]

Documentation on creating tables in Redshift:

* Example Redshift table definitions can be found on the [Snowplow repo](https://github.com/snowplow/snowplow/tree/master/4-storage/redshift-storage/sql). Note that corresponding jsonschema definitions are stored in [Iglu central](https://github.com/snowplow/iglu-central/tree/master/schemas)
* Amazon documentation on Redshift create table statements can be found [here](http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html). A list of Redshift data types can be found [here](http://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html)
* Example Redshift table definitions can be found on the [Snowplow repo][snowplow-redshift-sql].
* Amazon documentation on Redshift create table statements can be found [here][redshift-create-table]. A list of Redshift data types can be found [here][redshift-data-types].

## Copyright and license

@@ -242,8 +243,26 @@ limitations under the License.
[license-image]: https://img.shields.io/badge/license-Apache--2-blue.svg?style=flat
[license]: https://www.apache.org/licenses/LICENSE-2.0

[aws-credentials]: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#config-settings-and-precedence
[schema-guru-online]: http://schemaguru.snowplowanalytics.com/
[schema-guru-github]: https://github.com/snowplow/schema-guru?_sp=44dbe9a530cc476d.1436355830779
[schema-ver]: http://snowplowanalytics.com/blog/2014/05/13/introducing-schemaver-for-semantic-versioning-of-schemas/
[igluctl]: https://github.com/snowplow/iglu/wiki/Igluctl
[discourse-image]: https://img.shields.io/discourse/posts?server=https%3A%2F%2Fdiscourse.snowplowanalytics.com%2F
[discourse]: https://discourse.snowplowanalytics.com/

[igluctl]: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/iglu/igluctl-2/
[iglu-central]: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/iglu/iglu-repositories/iglu-central/
[iglu-central-jsonpaths]: https://github.com/snowplow/iglu-central/tree/master/jsonpaths
[iglu-central-schemas]: https://github.com/snowplow/iglu-central/tree/master/schemas
[iglu]: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/iglu/
[schema-guru-github]: https://github.com/snowplow/schema-guru
[snowplow-redshift-sql]: https://github.com/snowplow/snowplow/tree/master/4-storage/redshift-storage/sql

[aws-credentials]: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#config-settings-and-precedence
[schema-validator]: https://www.jsonschemavalidator.net/
[semver]: https://semver.org/
[json-schema]: https://json-schema.org/
[aws-copy-json]: https://docs.aws.amazon.com/redshift/latest/dg/copy-usage_notes-copy-from-json.html
[redshift-create-table]: https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html
[redshift-data-types]: https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html

[schema-ver-blog]: https://snowplowanalytics.com/blog/2014/05/13/introducing-schemaver-for-semantic-versioning-of-schemas/
[versioning-release-blog]: https://snowplowanalytics.com/blog/2014/07/09/snowplow-0-9-5-released-with-json-validation-shredding/
[self-desc-blog]: https://snowplowanalytics.com/blog/2014/05/15/introducing-self-describing-jsons/
[schema-repo-blog]: https://snowplowanalytics.com/blog/2014/07/01/iglu-schema-repository-released/
