-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle backend data changes #40
Comments
Thanks for opening the discussion @nyurik! I think we definitely need to pass any notification back to the caller. For example, in Stadia Maps' own usage, we'd have a (potentially) a few layers of caches to invalidate if we want to support data updates without reloading. If then, the caller wants to retry, it's easy enough with tooling. Regarding the HTTP/S3 mechanisms, I'm not sure I would like to see a few concrete use cases before we design an API around it so we actually know what user needs are. |
@lseelenbinder I am ok to simply return an error for now. I believe etag is fairly universal, but I agree it might not cover all cases. How about this API, without any other code: use pmtiles::reqwest::header::HeaderName;
// we already re-export reqwest, so it is easy for users to pass
// `Some(pmtiles::reqwest::header::ETAG)`
// If `None`, no header monitoring
HttpBackend::try_from(..., monitor_header: Option<HeaderName>) {...}
// For more advanced cases, we MAY eventually introduce
// some callback fn, but YAGNI until needed. So it is totally up to the user to handle or not handle this, and passing P.S. Note that there is no way to reset |
In case of an HTTP or S3 bucket backends, the pmtiles file could get modified while being used, and should be handled properly.
etag
(if returned), and when modified, allread
operations should return a new error codeIn order to make this possible, the backend API would need to change. Currently, we have
read
andread_exact
. Most of the time,read_exact
is used. Theread
is actually only used once - to read the initial block (main directory). I propose we convertThis way each backend can self initialize and read first block of data, and while doing it, may also set some monitors or get etag data - returning both the new backend instance and the first block of data. We may want to make all backend creation non-public, but this is not certain yet.
The next step is how to handle the actual data change. The actual backend will return some new
PmtError::UnderlyingDataModified
(naming?), but we need a new logic for thefn get_tile(...)
. It could simply pass through that error to the caller, and let the caller handle it (e.g. caller may close and re-open the backend, invalidating the cache as well). Or we could do that insideget_tile
, but the logic might get tricky - for example backend requests could start hitting two load-balanced endpoints that just happened to have different versions of the same file, thus producing differentetag
values.One possible solution is to introduce a new
get_tile_with_retry(&mut self, ...)
-- allowingself
to be modified, but I don't think its a workable solution -- most of the users would want to parallelize this call, so&mut self
is not possible. Instead, we may need to introduce a thicker wrapper that stores backend inside anArc<RwLock<Backend>>
- in which case we could still useget_tile(&self,...)
, but inside it will get a read lock, get the tile, and re-create the backend if it is outdated.The problem here is that the caller may still need to know if the backend was modified in order to reset some other internal cache
The text was updated successfully, but these errors were encountered: