Seems like it might be time to start a 'best practices' document for topics that are outside the spec but would be good for people to know about.
Remembered this when reading #79.

Potential ideas to include:

- What compression to use (zstd, snappy, brotli, etc.). Talk through how not all Parquet implementations support all compressions, and how to think about the compression time vs. file size tradeoff. Perhaps some discussion of what works best for geospatial / common geo use cases.
- Discussion of spatial ordering: explain how the bbox column works best when you've used an R-tree or something else to sort your data, point at what different implementations do, etc. It makes sense to keep the spec barebones and flexible, but it would be nice to provide more explanation and guidance for those who are making datasets.
- Partitioning: we need to figure out the `_metadata` files in "How should metadata be written in a partitioned dataset?" (#79), and a best practices doc likely makes sense. But also a more general discussion of when to split up Parquet files, and things to consider when splitting them up: admin boundaries vs. bbox vs. ...
- The filename extension recommendation (#212) arguably would fit in a best practice (though I think in the spec is fine).

Other suggestions here are welcome. I'm not the expert on these, but happy to take a crack at drafting something that others could improve.
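To make the spatial ordering idea a bit more concrete, here is a rough sketch of one possible "something else" besides an R-tree: sorting points by a Z-order (Morton) key, so that rows that are close on the map tend to end up close together in the file. This is only an illustration under my own assumptions, not what any particular implementation does. The function names (`interleave_bits`, `morton_key`, `sort_spatially`), the 16-bit grid resolution, and the lon/lat input in degrees are all made up for the example.

```python
from typing import List, Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y (x on even positions, y on odd)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_key(lon: float, lat: float, bits: int = 16) -> int:
    """Map a lon/lat pair (degrees) onto a 2^bits x 2^bits grid and return its Z-order key.

    Assumption for this sketch: coordinates are within [-180, 180] and [-90, 90].
    """
    max_cell = (1 << bits) - 1
    col = int((lon + 180.0) / 360.0 * max_cell)
    row = int((lat + 90.0) / 180.0 * max_cell)
    return interleave_bits(col, row, bits)


def sort_spatially(points: List[Tuple[float, float]], bits: int = 16) -> List[Tuple[float, float]]:
    """Return the points ordered by Z-order key, so nearby points tend to be adjacent."""
    return sorted(points, key=lambda p: morton_key(p[0], p[1], bits))


if __name__ == "__main__":
    pts = [(170.0, 80.0), (-170.0, -80.0), (171.0, 81.0), (-169.0, -79.0)]
    for lon, lat in sort_spatially(pts):
        print(lon, lat)
```

If rows are written in this order, each row group should cover a smaller area, which is what lets readers skip row groups using the bbox column statistics. A Hilbert curve or an R-tree packing would be alternatives worth comparing in the doc.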