Pagination doesn't work for Copernicus product catalogue #641

Closed
Richienb opened this issue Feb 18, 2024 · 3 comments

Richienb commented Feb 18, 2024

First, initialize the client and create a search over the Copernicus product catalogue:

from pystac_client import Client

catalog = Client.open("https://catalogue.dataspace.copernicus.eu/stac")

search = catalog.search(
    collections=['SENTINEL-2'],
)

The following code doesn't automatically paginate:

for item in search.items():
    print(item)

However, following the "next" links manually with GET requests does paginate:

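import asyncio

import aiohttp  # assumption: the original snippet used an undefined `session`, presumably an aiohttp session
from pystac import Item

# `catalog` and `search` come from the first snippet above.
async def print_all_items(url):
    async with aiohttp.ClientSession() as session:
        while True:
            # Plain GET request against the current page URL
            async with session.get(url) as response:
                page = await response.json()

            if 'features' in page:
                # https://github.com/stac-utils/pystac-client/blob/1a1a2d65c736dc95fb0e16b21795a968100b00c6/pystac_client/item_search.py#L692-L694
                for item in page['features']:
                    print(Item.from_dict(item, root=catalog, preserve_dict=False))

            # https://github.com/stac-utils/pystac-client/blob/1a1a2d65c736dc95fb0e16b21795a968100b00c6/pystac_client/stac_api_io.py#L310-L312
            if 'links' not in page:
                break

            # Follow the "next" link if there is one; otherwise stop
            for link in page['links']:
                if link['rel'] == 'next':
                    url = link['href']
                    break
            else:
                break

asyncio.run(print_all_items(search.url_with_parameters()))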

gadomski commented Feb 19, 2024

Looks like that server doesn't correctly support POST requests (which is what pystac-client is using in your first example), and isn't returning a links attribute:

$ curl -s -X POST https://catalogue.dataspace.copernicus.eu/stac/search --json '{"collections": ["SENTINEL-2"]}' | jq .links
null
$ curl -s -X GET https://catalogue.dataspace.copernicus.eu/stac/search --json '{"collections": ["SENTINEL-2"]}' | jq .links
[
  {
    "rel": "next",
    "type": "application/json",
    "href": "https://catalogue.dataspace.copernicus.eu/stac/search?page=2"
  },
  {
    "rel": "self",
    "type": "application/json",
    "href": "https://catalogue.dataspace.copernicus.eu/stac/search"
  },
  {
    "rel": "root",
    "type": "application/json",
    "href": "https://catalogue.dataspace.copernicus.eu/stac"
  }
]
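The difference between the two responses can also be checked in Python. The following is a minimal sketch, not pystac-client code: find_next_href is a hypothetical helper, and the two page dictionaries are trimmed-down stand-ins for the POST and GET responses shown above.

```python
from typing import Any, Optional


def find_next_href(page: dict[str, Any]) -> Optional[str]:
    """Return the href of the first rel="next" link, or None if there is none."""
    # "links" may be missing or null, as in the POST response above.
    for link in page.get("links") or []:
        if link.get("rel") == "next":
            return link.get("href")
    return None


# Stand-in for the POST response: no "links" attribute at all.
post_page = {"type": "FeatureCollection", "features": []}

# Stand-in for the GET response, with the links printed by jq above.
get_page = {
    "type": "FeatureCollection",
    "features": [],
    "links": [
        {
            "rel": "next",
            "type": "application/json",
            "href": "https://catalogue.dataspace.copernicus.eu/stac/search?page=2",
        },
        {
            "rel": "self",
            "type": "application/json",
            "href": "https://catalogue.dataspace.copernicus.eu/stac/search",
        },
    ],
}

print(find_next_href(post_page))
print(find_next_href(get_page))
```

Without a "next" link there is nothing for a client to follow, which is consistent with iteration stopping after the first page of a POST search.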

As a workaround, you can provide method="GET" to search():

from pystac_client import Client

catalog = Client.open("https://catalogue.dataspace.copernicus.eu/stac")

item_search = catalog.search(
    collections=["SENTINEL-2"],
    method="GET",
    max_items=100,
    limit=10,
)

print(len(list(item_search.items())))  # <- prints 100

Note that POST is recommended but not required, so there's an argument that pystac-client should default to GET (at least when there's no intersects). I've opened #643 to discuss.
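To make the suggestion concrete, here is a rough sketch of what "default to GET unless there is an intersects" could look like. This is not how pystac-client currently behaves; choose_search_method is a hypothetical helper written only to illustrate the idea.

```python
from typing import Any, Mapping


def choose_search_method(params: Mapping[str, Any]) -> str:
    """Hypothetical policy: use GET unless the query has an intersects geometry."""
    # A missing or None intersects value counts as "no geometry".
    if params.get("intersects") is not None:
        return "POST"
    return "GET"


# A collections-only search, like the one in this issue.
print(choose_search_method({"collections": ["SENTINEL-2"]}))

# A search with a GeoJSON geometry.
point = {"type": "Point", "coordinates": [14.4, 50.1]}
print(choose_search_method({"collections": ["SENTINEL-2"], "intersects": point}))
```

Under such a policy the search from this issue would have gone out as a GET request, so the server's "next" links would have been returned.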


gadomski commented Feb 19, 2024

As an aside, https://catalogue.dataspace.copernicus.eu/stac does advertise POST search, so the server can be considered broken for POSTs.

$ curl -s https://catalogue.dataspace.copernicus.eu/stac | jq '.links[6]'
{
  "href": "https://catalogue.dataspace.copernicus.eu/stac/search",
  "title": "STAC search",
  "rel": "search",
  "type": "application/json",
  "method": "POST"
}
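The same check can be done without relying on the position of the link in the array. A small sketch follows, assuming that a search link without a "method" field means GET (my reading of the STAC API link conventions); advertised_search_methods is a hypothetical helper and the landing page dictionary is a trimmed stand-in for the real one.

```python
from typing import Any


def advertised_search_methods(landing_page: dict[str, Any]) -> list[str]:
    """Collect the HTTP methods of all rel="search" links on a landing page."""
    methods = []
    for link in landing_page.get("links") or []:
        if link.get("rel") == "search":
            # Assumption: a link with no explicit method is a GET link.
            methods.append(str(link.get("method", "GET")).upper())
    return methods


# Stand-in for the landing page, containing the search link shown above.
landing_page = {
    "links": [
        {
            "href": "https://catalogue.dataspace.copernicus.eu/stac",
            "rel": "self",
            "type": "application/json",
        },
        {
            "href": "https://catalogue.dataspace.copernicus.eu/stac/search",
            "title": "STAC search",
            "rel": "search",
            "type": "application/json",
            "method": "POST",
        },
    ]
}

print(advertised_search_methods(landing_page))
```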


Richienb commented Feb 23, 2024

Dataspace Copernicus support responded that:

  • They are aware of this issue
  • Pagination using POST has not yet been implemented, but will be added in the future
