The basic tests from #2 need to be expanded to cover the use cases discussed in #1, which I quote here:
Some questions we need to resolve to move forward with this idea, and my initial responses, are:
- What are the workflows we want to test?
  - Write random data to cloud
  - Read back data
  - Copy data
  - Rechunk data
- How big does the test need to be in order to be realistic?
  - My sense: > 100 GB
- Which combinations of libraries do we want to include?
  - dask (w/o xarray)
  - xarray (via dask)
  - gcsfs
  - s3fs
  - adlfs
  - rechunker
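As a rough illustration of the four workflows listed above, here is a minimal standard-library sketch that uses a temporary local directory as a stand-in for a cloud bucket. The helper names (`write_random`, `read_back`, `copy_store`, `rechunk`, `run_roundtrip`) and the chunk sizes are hypothetical; a real test would go through gcsfs, s3fs, or adlfs and would be far larger than the few KiB used here.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def write_random(store: Path, n_chunks: int, chunk_bytes: int) -> str:
    """Write n_chunks files of random bytes; return the SHA-256 of the full payload."""
    store.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256()
    for i in range(n_chunks):
        data = os.urandom(chunk_bytes)
        (store / f"{i:06d}.bin").write_bytes(data)
        digest.update(data)
    return digest.hexdigest()


def read_back(store: Path) -> bytes:
    """Concatenate all chunks in file-name order."""
    return b"".join(p.read_bytes() for p in sorted(store.glob("*.bin")))


def copy_store(src: Path, dst: Path) -> None:
    """Copy every chunk from src to dst (dst must not exist yet)."""
    shutil.copytree(src, dst)


def rechunk(src: Path, dst: Path, new_chunk_bytes: int) -> None:
    """Rewrite the payload with a different chunk size.

    Loads everything into memory, which only works for toy sizes.
    """
    payload = read_back(src)
    dst.mkdir(parents=True, exist_ok=True)
    for i, start in enumerate(range(0, len(payload), new_chunk_bytes)):
        (dst / f"{i:06d}.bin").write_bytes(payload[start:start + new_chunk_bytes])


def checksum(store: Path) -> str:
    """SHA-256 of the concatenated chunks in a store."""
    return hashlib.sha256(read_back(store)).hexdigest()


def run_roundtrip(n_chunks: int = 8, chunk_bytes: int = 1024) -> dict:
    """Run write -> read -> copy -> rechunk and report whether checksums match."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        expected = write_random(root / "original", n_chunks, chunk_bytes)
        copy_store(root / "original", root / "copy")
        rechunk(root / "copy", root / "rechunked", new_chunk_bytes=chunk_bytes * 4)
        return {
            "read": checksum(root / "original") == expected,
            "copy": checksum(root / "copy") == expected,
            "rechunk": checksum(root / "rechunked") == expected,
            "rechunked_files": len(list((root / "rechunked").glob("*.bin"))),
        }
```

Comparing a checksum after every step is one simple way to catch silent corruption as well as outright failures; whether that is affordable at > 100 GB is an open question.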
Also, my experience is that coarse-graining operations tend to cause many problems. Unlike a time mean, the output will be too large for memory, but the data reduction is large enough that the input and output chunk sizes should differ.
Generally, I think "writes" are less robust than "reads", but the latter is more frequently used by this community.
How big does the test need to be in order to be realistic?
I think so, but I need a better sense of the error rate per GB or per HTTP request.
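One way to frame the test-size question is as simple arithmetic: if each HTTP request fails independently with probability p, a test that issues n requests sees at least one failure with probability 1 - (1 - p)^n. The sketch below assumes independent failures and exactly one request per chunk; the chunk size and the error rates are made-up placeholders, not measurements.

```python
import math

GB = 10**9
MB = 10**6


def n_requests(total_bytes: int, chunk_bytes: int) -> int:
    """Requests needed if every chunk is one HTTP request (an assumption)."""
    return math.ceil(total_bytes / chunk_bytes)


def p_any_failure(p_request: float, n: int) -> float:
    """Probability of at least one failure among n independent requests."""
    return 1.0 - (1.0 - p_request) ** n


def bytes_for_detection(p_request: float, chunk_bytes: int, target: float = 0.95) -> int:
    """Smallest total size at which a failure shows up with probability >= target."""
    n = math.ceil(math.log(1.0 - target) / math.log(1.0 - p_request))
    return n * chunk_bytes


if __name__ == "__main__":
    n = n_requests(100 * GB, 100 * MB)  # 1000 requests under these assumptions
    for p in (1e-5, 1e-4, 1e-3):  # hypothetical per-request error rates
        print(f"p={p:g}: P(any failure in {n} requests) = {p_any_failure(p, n):.3f}")
```

Under these assumed numbers, a 100 GB test with 100 MB chunks would have only about a 1% chance of hitting an error that occurs once per 100,000 requests, so the right test size depends strongly on the error rate we are trying to detect.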
A related issue is how we want to handle Dask. Ideally, we would parametrize the tests over different Dask schedulers, including distributed schedulers (see #4 (comment)). Noah mentioned Apache Beam in #1. Do we want to include Beam here?
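A minimal sketch of how such a test matrix could be enumerated, using only the standard library. The filesystem names come from the library list in the quoted comment; the short workflow labels, the scheduler labels, and the `build_matrix` helper are assumptions for illustration, and Beam is left out until that question is settled.

```python
from itertools import product

WORKFLOWS = ["write", "read", "copy", "rechunk"]
FILESYSTEMS = ["gcsfs", "s3fs", "adlfs"]
# Scheduler labels are assumptions; "distributed" stands for a dask.distributed cluster.
SCHEDULERS = ["synchronous", "threads", "processes", "distributed"]


def build_matrix(workflows=WORKFLOWS, filesystems=FILESYSTEMS, schedulers=SCHEDULERS):
    """Return (test_id, workflow, filesystem, scheduler) for every combination."""
    return [
        (f"{wf}-{fs}-{sched}", wf, fs, sched)
        for wf, fs, sched in product(workflows, filesystems, schedulers)
    ]
```

The tuples could be passed to `pytest.mark.parametrize`, with the first element used as the test ID. Even this small matrix has 48 entries, so we may want to run only a subset at full data size.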
@nbren12 got us started in #2 with some basic tests: https://github.com/pangeo-data/pangeo-integration-tests/blob/main/test_gcs.py