Requirement: Identification of non-raw, derived data #10

krischer · 2018-01-03T23:33:46Z

Allow for identification of non-raw, derived data (e.g. processed data, quality parameters, metadata versioning, synthetic data).

chad-earthscope · 2018-01-06T00:15:25Z

We support this and are increasingly seeing the need to clearly identify synthetic, processed and derived data.

This requirement seems like a sub-bullet to #4, which is a relatively large sub-topic in its own right.

krischer · 2018-01-08T13:06:27Z

Any ideas what this could look like? A free-form ASCII string after the identifier proposed in #4?

crotwell · 2018-01-08T16:22:10Z

Especially for the derived data, we should be able to identify the channels that the new timeseries came from. For example, some are recording the latency of an input channel at a receiving node as a new timeseries. Another case, where there would be more than one derived-from channel, would be deriving a North channel from a borehole instrument with non-traditional orientations.

A standard "derived from" key could be done as part of the optional/additional headers. This does somewhat mix metadata into timeseries data, but for items as simple as latency or rotations it might be acceptable, and as far as I know StationXML does not have the ability to specify this type of derivation.
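To make the idea concrete, here is a minimal sketch of what such a key might look like if the optional headers were a JSON-like structure. The key names (`DerivedFrom`, `Sources`, `Method`), the helper `derived_from_header`, and the channel identifiers are all hypothetical and chosen only for illustration; nothing here is part of an agreed format.

```python
import json


def derived_from_header(sources, method):
    """Build a hypothetical extra-header fragment that links a derived
    time series to the channel(s) it was computed from."""
    if not sources:
        raise ValueError("a derived series needs at least one source channel")
    return {"DerivedFrom": {"Sources": list(sources), "Method": method}}


# Latency of a single input channel, recorded as a new time series.
latency = derived_from_header(["XX.STA.00.BHZ"], "latency")

# A North channel computed from two non-traditionally oriented
# borehole components (more than one source channel).
north = derived_from_header(["XX.STA.00.BH1", "XX.STA.00.BH2"], "rotation")

print(json.dumps(north, sort_keys=True))
```

A list of sources covers both cases above: one source channel for latency, two for the rotated North channel.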

I would argue that unless the processing or derivation is trivial or close to it, it is better not to mix the determination of the codes of a new channel, an identification problem, with linking to the source channels, a metadata problem. This is especially true if the fundamental nature of the data changes, e.g. the latency of a ground motion channel.

jmsaurel · 2018-01-11T11:40:19Z

It looks a little like the data quality flag of miniSEED 2.4 (R, D, Q or M) but with extended capabilities, doesn't it?

I'm in favor of something that allows synthetic channels and derived channels (i.e., samples whose values from the digitizer have been modified) to be clearly identified. Maybe an extended version of the data quality flag.

I'm not in favor of placing information in the data about where this new data comes from. This should be kept in the metadata.

Regarding the indication of quality verifications on the data that don't affect the sample values at all (i.e., only qualifying, or removing bad data), it could be handled by the versioning in #13.

krischer · 2018-01-11T11:48:03Z

A simplistic possibility would be to somehow enhance the quality codes and add two new codes for synthetic and derived data (are there other broad categories?) and then delegate further details to the arbitrary headers of #14 as proposed by @crotwell.
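As a rough sketch of this possibility: the existing miniSEED 2.4 indicators (R, D, Q, M) are kept, and two extra codes are added. The letters `S` and `X`, the class `QualityCode`, the helper `make_record_header`, and the header keys are assumptions made up for this example, not a proposal for actual values.

```python
from enum import Enum


class QualityCode(Enum):
    # Existing miniSEED 2.4 data quality indicators.
    RAW = "R"                 # raw waveform data, no quality control
    INDETERMINATE = "D"       # quality control state unknown
    QUALITY_CONTROLLED = "Q"  # quality controlled data
    MODIFIED = "M"            # data center modified or merged
    # Hypothetical additions; these letters are not part of any standard.
    SYNTHETIC = "S"
    DERIVED = "X"


def make_record_header(code, extra_headers=None):
    """Pair the one-character code with optional free-form details.

    Raises ValueError if the code is not one of the known letters."""
    return {"Quality": QualityCode(code).value,
            "Extra": dict(extra_headers or {})}


# The broad category lives in the code, the details in the extra headers.
header = make_record_header("X", {"Processing": "rotated to north"})
print(header)
```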

andres-h · 2018-01-11T14:15:00Z

Would BHZ be a "derived channel", since it is derived from HHZ?

jmsaurel · 2018-01-11T17:19:28Z

If BHZ comes directly out of the digitizer, I wouldn't call it a "derived channel", because you don't know how it's made inside. It could be derived from the HHZ, but it could come from a different filter stream.

But if BHZ is made by the acquisition software (such as SC3, for example), then it could be called a "derived channel" because it is no longer data that comes straight out of the digitizer box.

tim-iris · 2018-01-19T23:37:38Z

Isn't this really an issue where we are implying that we must capture provenance? If so, and I think it is, then I do not think this really belongs in the time series exchange format. Provenance is a much bigger issue and could unnecessarily complicate things. Any expansion of the quality code should be thought through very carefully. I have concerns with this.

krischer · 2018-01-29T19:21:49Z

Summary

(Please let me know if I missed a point or misunderstood something)

This is a bit of a complicated issue. I think we agree that full and proper provenance is not in the scope of the next generation data format and must be delegated to the metadata in some form. Also, where exactly this information should go in the format is not clear, and there are a large number of possibilities. Thus, please vote on the following issue:

Should there be a simple way to flag time series in the new format as either "raw" (whatever the exact definition of that is), "derived" (not "raw"), or "synthetic" (not based on actual recordings)? (Yes/No)

crotwell · 2018-01-29T20:39:35Z

Yes

chad-earthscope · 2018-01-29T22:47:21Z

Yes.

kaestli · 2018-01-30T10:32:31Z

No (not as a flag, as the terms are not defined and overlap).
But such streams should have different streamIDs and different metadata.

ozym · 2018-01-30T10:57:24Z

Yes

claudiodsf · 2018-01-31T09:33:48Z

Yes, but not a single flag, since the three definitions can overlap.
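One way to picture independent flags is a small bit field, sketched below under the assumption that "raw" simply means no bit is set. The names `DataOrigin`, `RAW`, and `labels` are hypothetical and only illustrate the overlap, e.g. a filtered synthetic seismogram that is both derived and synthetic.

```python
from enum import Flag, auto


class DataOrigin(Flag):
    """Hypothetical independent bits; they can be combined freely."""
    DERIVED = auto()
    SYNTHETIC = auto()


# Assumption for this sketch: raw data is the absence of both bits.
RAW = DataOrigin(0)


def labels(origin):
    """List the categories that apply to a time series."""
    if not origin:
        return ["raw"]
    return [member.name.lower() for member in DataOrigin if member in origin]


# A filtered synthetic seismogram falls into two categories at once.
filtered_synthetic = DataOrigin.DERIVED | DataOrigin.SYNTHETIC
print(labels(filtered_synthetic))
```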

ihenson-bsl · 2018-01-31T17:35:40Z

Yes

ValleeMartin · 2018-02-02T13:33:22Z

Yes, but taking into account that the definitions can overlap.

JoseAntonioJara · 2018-02-02T16:39:13Z

No, I think this feature should be specified together with the rest of the channel's metadata.

krischer added the additional requirement label Jan 3, 2018

jfclinton mentioned this issue Jan 11, 2018

Requirement: A new method of identifying a time series #4

Open

jmsaurel mentioned this issue Jan 24, 2018

Expansion and convention of the channel code #30

Open
