Requirement: Include a CRC (cyclic redundancy check) of the complete record #12
This requirement was discussed in change proposal 6 to the 2016 strawman. The Quaterra CRC algorithm was offered for use in change proposal 25 to the 2016 strawman. In the previous format specifications from IRIS, I suggested the adoption of the CRC-32C (Castagnoli) algorithm for the following reasons:
I think the type of CRC is, at this stage, an irrelevant implementation detail. The question is rather whether there should be a CRC of the complete record, of a partial record, or no record-level CRC at all. I'm in favor of a partial-record CRC.
The type of CRC should probably also be discussed at this stage, as stage 3 of the design hopefully only deals with complete implementations and the type of CRC is a detail that could be sorted out by then. Thus we have two things to discuss here: which type of checksum calculation to use, and what parts should be checksummed. Excerpt from @andres-h's link:
I personally disagree with this and think that everything should be checksummed: it's cheap and, to some extent, also serves as a safety mechanism when touching any part of a record. It could also be two separate 16-bit checksums, one for the data and one for the header. The risk of false positives should still be small enough not to have to worry about it.
I agree that everything should be checksummed, but having a single checksum does not match the requirement (?) of being able to modify the records in the datacentre (adding QC, etc.). As a user, I want to check whether the CRC matches the one that was generated by the digitizer or not. I also want to know which changes, if any, were made in the datacentre. This assumes that digitizers support NGF directly, similar to mseed2/seedlink (requirement?).
I feel the checksum should be over the whole record, and modification of a record should force a checksum recalculation. The purpose of this checksum is simply to detect corruption of data during transmission or storage, not to provide "provenance" of the data back to the digitizer. While that is an important concept, it is a much larger problem, and to do it correctly it needs to be done separately from the timeseries format. For this purpose I think simpler is better, and so CRC-32 makes the most sense.
I agree with @crotwell. The main value of adding a CRC is to check whether the record has been corrupted during transmission or storage. Even if multiple CRCs could be used in a provenance scheme, requiring a reader to calculate multiple CRCs to do the "has this record been corrupted" check is not justified.
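To make the "has this record been corrupted" check concrete, here is a minimal Python sketch. The CRC field offset, the little-endian byte order, and the use of zlib's plain CRC-32 are assumptions for illustration only; the actual algorithm, field position, and pre-determined fill value would be fixed by the specification.

```python
import zlib

# Hypothetical layout: a 4-byte CRC field at byte offset 28 of the fixed
# header. The real offset would be defined by the final specification.
CRC_OFFSET = 28
CRC_LENGTH = 4

def record_is_intact(record: bytes) -> bool:
    """Return True if the record appears uncorrupted.

    The stored CRC is assumed to cover the complete record with the CRC
    field itself zeroed out, so we recompute it the same way and compare.
    """
    stored = int.from_bytes(record[CRC_OFFSET:CRC_OFFSET + CRC_LENGTH], "little")
    zeroed = (record[:CRC_OFFSET]
              + b"\x00" * CRC_LENGTH
              + record[CRC_OFFSET + CRC_LENGTH:])
    return zlib.crc32(zeroed) & 0xFFFFFFFF == stored
```

A writer would do the inverse: build the record with the CRC field zeroed, compute the checksum over the complete record, and then patch the result into the field.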
Summary (please let me know if I missed a point or misunderstood something): There is agreement that we want a CRC, but not on which algorithm or which "type" of CRC. Technically it is also clear that the CRC field must be set to some pre-determined value (or ignored) for the actual calculation of the CRC. Thus please vote on:
[Vote table lost in extraction; surviving fragments: "complete record" (with a tally of 1, appearing twice) and "CRC-32 or any other lightweight algorithm".]
Include a CRC (cyclic redundancy check) of the complete record.
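If the CRC-32C (Castagnoli) variant suggested in the opening comment were adopted, a minimal bit-by-bit sketch of the algorithm would look roughly like the following. This is a reference illustration only; production implementations would normally use a lookup table or the dedicated CPU instructions available for this polynomial.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """CRC-32C (Castagnoli): reflected polynomial 0x82F63B78,
    initial value and final XOR of 0xFFFFFFFF."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

# Well-known check value for the ASCII string "123456789".
assert crc32c(b"123456789") == 0xE3069283
```

The same helper can be chained over successive buffers by passing the previous result back in as `crc`, which would also matter if a partial-record CRC were chosen instead.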