-
I believe the file size on Cloudflare R2 has a pretty small effect on overall latency. Here is an example of a small file on R2:
Here is a large file on R2 (beginning):
Here is a large file on R2 (at the 100 GB mark):
So neither the size of the file nor the position of the requested range has much effect on latency. Cloudflare R2 in general is pretty slow, possibly because it's quite a new platform. But I use it for distributing the daily builds at http://maps.protomaps.com/builds/ and Zsolt is using it for distributing OpenFreeMap datasets, because there are few alternatives with no egress fees. If latency is a high priority I would recommend looking at Google Cloud, DigitalOcean and Azure as storage options.
Here is an example of a way to split a large file into multiple chunks by modifying the client.
I don't plan on supporting splitting up a large file into multiple chunks as part of the reference implementations. Storage platforms are almost always atomic, if you upload [...]
The consequence of inconsistency across files is that if you have a program caching the PMTiles directories, like the web browser client or a ZXY tile proxy, and one file changes, then the offsets you have cached are garbage and you have no way to detect this situation. The HTTP specification is designed specifically to handle this case for single files, with the interaction of the [...]
So all the tooling around PMTiles relies on the single-file design and the HTTP standard for client programs. You might be able to get away with splitting the file in specific situations, but it is going to lead to potential failure if the data is ever modified. Alternatively, you could do something like assign a UUID to every version and make the data on storage immutable.
You could design a different multi-file format outside of PMTiles that handles these situations, but I believe it would be much more complex to get correct atomic behavior (you are kinda writing a database). Lennart Poettering's casync is, I think, quite interesting in this space.
(Will continue in next comment...)
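To make the single-file consistency point concrete, here is a minimal sketch of a range request pinned to one version of a file. It assumes the mechanism in question is the standard `ETag` validator combined with the `If-Match` request header, and that the storage server honors `If-Match` on ranged GETs by answering 412 Precondition Failed. The names `build_range_headers`, `fetch_range` and `ArchiveChanged` are hypothetical and are not taken from any PMTiles client.

```python
import urllib.error
import urllib.request


class ArchiveChanged(Exception):
    """Raised when the remote file no longer matches the cached ETag."""


def build_range_headers(offset, length, etag=None):
    """Headers for a byte-range request, optionally pinned to one file version."""
    headers = {"Range": f"bytes={offset}-{offset + length - 1}"}
    if etag is not None:
        # If-Match asks the server to answer 412 unless the ETag still matches.
        headers["If-Match"] = etag
    return headers


def fetch_range(url, offset, length, etag=None):
    """Return (body, etag) for one range; raise ArchiveChanged on HTTP 412."""
    request = urllib.request.Request(
        url, headers=build_range_headers(offset, length, etag)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 412:
            # Cached directory offsets belong to an older version: drop them.
            raise ArchiveChanged(url) from err
        raise
```

A caller would remember the `ETag` returned with the first read and pass it to every later `fetch_range` call. On `ArchiveChanged` it would discard its cached directories and start over. With several independent files there is no single validator that covers all of them, which is the gap described above.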
-
Interesting discussion here! What I was trying to say to Brandon is that a simple curl is not an adequate test of real-world user experience. When I was evaluating the PMTiles approach, I set up a full-screen map in a full-screen browser and started scrolling around. Some tiles loaded instantly, some loaded somewhat slower, and some felt like they took forever: they were still not loaded after 10 seconds. What I'm saying is that even if 90% of the tiles load OK, the 5-10% that are slow destroy the experience.
Now, I see from your link that there is a PM (Thomas Gauvin) who works for Cloudflare and is an enthusiast about PMTiles, so there is a chance that Cloudflare will address those problems in the long term. Maybe you can even invite him to this discussion; it's basically our only chance of raising this problem at Cloudflare.
So for example, on the https://maps.protomaps.com/builds/ page, on a random map, I see these requests regularly. This is actually not so bad, but definitely not a smooth experience.
-
For additional data, I've just performed 1,000 random range requests against https://r2-public.protomaps.com/protomaps-sample-datasets/terrarium-z12.pmtiles. Here are the latency results for September 26, 2024, in Montreal, Canada:
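For anyone who wants to repeat this kind of measurement, here is a rough sketch of how such a test could be scripted. It is not the script used for the numbers in this comment. The 16 KiB request length, the function names and the nearest-rank percentile are all assumptions made for illustration, and `file_size` has to be supplied by the caller (for example from the `Content-Length` of a HEAD request).

```python
import math
import random
import time
import urllib.request


def random_ranges(file_size, count, length=16384, seed=None):
    """Pick `count` random inclusive (start, end) byte ranges inside the file."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(count):
        start = rng.randrange(0, file_size - length + 1)
        ranges.append((start, start + length - 1))
    return ranges


def time_range_request(url, start, end):
    """Seconds taken to fetch one byte range.

    urllib opens a new connection per call, so each sample includes
    connection setup, unlike a browser that reuses connections.
    """
    request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    began = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()
    return time.perf_counter() - began


def percentile(values, p):
    """Nearest-rank percentile of `values` for an integer p in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies):
    """Median, tail percentiles and maximum, which matter most for map panning."""
    return {
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
        "max": max(latencies),
    }
```

Running `summarize([time_range_request(url, s, e) for s, e in random_ranges(file_size, 1000)])` would give one set of percentiles. Reporting p95 and p99 alongside the median speaks to the earlier point that a small share of slow tiles dominates the perceived experience.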
-
Hi,
I've been following the recent discussions on PMTiles in Cloudflare that emerged from a post on Hacker News yesterday (https://news.ycombinator.com/item?id=41635592). These discussions have raised a couple of questions about optimization strategies in the Cloudflare context, particularly regarding latency issues and data retrieval costs.
Firstly, considering the latency challenges with large files on Cloudflare R2, would it be advisable to segment planet tile datasets into smaller files, each under 500 MB? If this approach has been previously attempted in similar contexts, what were the results?
Secondly, tiles are ordered along a Hilbert curve, which suggests that neighboring tiles are also physically close in the storage blob, and Cloudflare offers free bandwidth but charges per request. Given this, has there been any exploration into the feasibility of batch-requesting multiple tiles in a single query, tailored to the user's map view?
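To illustrate what I mean by batching, here is a minimal sketch of merging the byte ranges of the tiles in a viewport into fewer, larger range requests. It assumes the client has already resolved each tile to an `(offset, length)` pair from the directory. The function name `coalesce_ranges` and the `max_gap` parameter are hypothetical, and I don't know whether any existing client does this.

```python
def coalesce_ranges(entries, max_gap=0):
    """Merge (offset, length) pairs that lie within `max_gap` bytes of each other.

    Returns a list of (offset, length) pairs sorted by offset. A larger
    `max_gap` means fewer requests but more unused bytes downloaded.
    """
    merged = []
    for offset, length in sorted(entries):
        end = offset + length
        if merged and offset <= merged[-1][1] + max_gap:
            # Close enough to the previous range: extend it instead of
            # issuing a separate request.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]
```

For example, `coalesce_ranges([(0, 100), (100, 50), (1000, 10)])` yields two ranges, the first covering both adjacent tiles. The client would then slice each tile back out of the combined response using its original offset. The trade-off is fewer billed requests against extra bytes transferred whenever `max_gap` is greater than zero.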
Thank you for your insights.