add clarification on data updates #142

Open
tomkralidis opened this issue May 22, 2024 · 2 comments
@tomkralidis
Collaborator

Add to section for data publishers:

If a data publisher issues an update to a data granule, the associated WNM should:

  • keep the same properties.data_id
  • update the link object to use rel=update to refer to the updated granule (not rel=canonical)
  • update properties.pubtime
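
A minimal sketch of the three points above, assuming the WNM is handled as a plain Python dict. The function name `make_update_notification` is hypothetical, and giving the new notification a fresh `id` is an assumption rather than something stated in this issue.

```python
import copy
import uuid
from datetime import datetime, timezone


def make_update_notification(original):
    """Return a new WNM dict announcing an update to an already published granule."""
    updated = copy.deepcopy(original)
    # Assumption: every notification carries its own unique id.
    updated["id"] = str(uuid.uuid4())
    # properties.data_id is deliberately left untouched, so the update
    # still refers to the same granule.
    updated["properties"]["pubtime"] = datetime.now(timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    # Point at the updated granule with rel=update instead of rel=canonical.
    for link in updated["links"]:
        if link.get("rel") == "canonical":
            link["rel"] = "update"
    return updated
```

The original dict is not modified, so a publisher could keep it for auditing alongside the update.
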
@6a6d74
Collaborator

6a6d74 commented Jun 18, 2024

Also add a section for Implementation and operation of a Global Service / Global Cache.

Expected behaviour is described below:

The first aim of updates is to ensure that the (potentially) faulty data in the initial message is not used in downstream processing by users.
This means that the initial download link in the cache should either be made unavailable, or the file should be updated 'in place' so that the same link is reused.

Using an example:

        "links": [
            {
               "rel": "canonical",
               "type": "application/x-bufr",
               "href": "http://wis2bra.inmet.gov.br/data/2024-05-22/wis/br-inmet/data/core/weather/surface-based-observations/synop/WIGOS_0-76-0-1709500000000161_20240522T030000.bufr4",
               "length": 240
            },
            {
               "rel": "update",
               "type": "application/x-bufr",
               "href": "http://wis2bra.inmet.gov.br/data/2024-05-22/wis/br-inmet/data/core/weather/surface-based-observations/synop/WIGOS_0-76-0-1709500000000161_20240522T030000.bufr4",
               "length": 240
            },

When the cache receives this message from Brazil, it must:

  • re-download the data using the link
  • 'replace' the original file, by either
    a. deleting the old file and creating a new file with a new link, or
    b. reusing the existing link and overwriting the content
  • change all links in the WNM that point back to the origin so that they point to the cache
  • publish a new message with a new id and the same data_id.

So, in the example above, both the canonical and the update href must be changed to point to your cache.
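
The link rewriting and republishing step could be sketched as follows. This is only an illustration: `republish_from_cache` and the host `cache.example.org` are hypothetical, and keeping the origin's path unchanged on the cache is an assumption, not a requirement from this issue.

```python
import copy
import uuid
from urllib.parse import urlsplit, urlunsplit


def republish_from_cache(wnm, cache_base):
    """Return a copy of the WNM whose data links point at the cache."""
    cached = copy.deepcopy(wnm)
    # New message id, same properties.data_id.
    cached["id"] = str(uuid.uuid4())
    base = urlsplit(cache_base)
    for link in cached["links"]:
        if link.get("rel") in ("canonical", "update"):
            origin = urlsplit(link["href"])
            # Assumption: the cache serves the file under the same path as the origin.
            link["href"] = urlunsplit(
                (base.scheme, base.netloc, origin.path, origin.query, "")
            )
    return cached
```

For example, `republish_from_cache(wnm, "https://cache.example.org")` would leave the path of each href intact and swap only the scheme and host.
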

If there is only a link with rel=update, the GC should add a copy of that link to the WNM, but with rel=canonical.
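
A small sketch of that rule, assuming the links are a list of dicts as in the example message. The helper name `ensure_canonical` is hypothetical.

```python
def ensure_canonical(links):
    """Return a copy of links, adding a rel=canonical twin if only rel=update exists."""
    result = [dict(link) for link in links]
    rels = {link.get("rel") for link in links}
    if "update" in rels and "canonical" not in rels:
        update = next(link for link in links if link.get("rel") == "update")
        # Same href, type and length as the update link; only rel differs.
        result.append({**update, "rel": "canonical"})
    return result
```
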

Note that client applications should always be looking for a link with rel=update first, then rel=canonical.
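
On the client side, that lookup order might look like the sketch below. `select_download_link` is a hypothetical helper, and returning `None` when neither relation is present is an assumption about how a client would want to handle that case.

```python
def select_download_link(links):
    """Pick the link to download: rel=update first, then rel=canonical."""
    for rel in ("update", "canonical"):
        for link in links:
            if link.get("rel") == rel:
                return link
    return None  # neither relation is present
```
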

@antje-s

antje-s commented Jun 19, 2024

I have the following questions about the behaviour described:

  1. Should the cache really change the filename of updates to match the canonical when the centre sends a new, different filename? In my view, the responsibility here should lie with the centre that sends the updates. The same filename should already be used here.

  2. In my opinion, no canonical is necessary for rel=update and rel=delete, since rel=update and rel=delete are searched for first and rel=canonical is only used if there are no hits.
