Update CRS for HLS Events Collections #147
The HLS events collection assets were reprocessed and ingested into new `-reprocessed` collections. Delta-config has been updated to use the reprocessed data, and I will delete the original events collections after confirming there are no users. Here is the reprocessing config and method:

```python
import os

import boto3
import rasterio
from rasterio.io import MemoryFile
from rio_cogeo.cogeo import cog_translate, cog_validate
from rio_cogeo.profiles import cog_profiles

s3_client = boto3.client("s3")

# Config and COG profile settings
blocksize = 256
config = dict(GDAL_NUM_THREADS="ALL_CPUS", GDAL_TIFF_OVR_BLOCKSIZE="128")
output_profile = cog_profiles.get("deflate")
output_profile.update(
    dict(blockxsize=str(blocksize), blockysize=str(blocksize), predictor="2")
)


def reprocess_and_upload_cog(s3_bucket: str, s3_src_key: str, s3_out_key: str) -> None:
    """Download geotiff from s3, add datum, and upload new tif to s3"""
    src_filename = os.path.basename(s3_src_key)
    temp_filename = f"/tmp/{src_filename}"

    # Download raw
    s3_client.download_file(
        Bucket=s3_bucket,
        Key=s3_src_key,
        Filename=temp_filename,
    )
    try:
        assert os.path.exists(temp_filename)
        with rasterio.open(temp_filename, "r+") as src:
            # Append the missing explicit datum to the PROJ4 string
            src.crs = src.crs.to_proj4() + " +datum=WGS84"
            # Open a destination memory file for the COG generated by cog_translate
            with MemoryFile() as dst:
                cog_translate(
                    src,
                    dst.name,
                    output_profile,
                    in_memory=True,
                    config=config,
                    forward_band_tags=True,
                    quiet=False,
                )
                assert cog_validate(dst.name)[0]
                s3_client.upload_fileobj(
                    dst,
                    s3_bucket,
                    s3_out_key,
                )
                print(f"Uploaded {dst.name} to {s3_out_key}")
    except Exception as e:
        print(e)
    finally:
        if os.path.exists(temp_filename):
            os.remove(temp_filename)
```
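To drive `reprocess_and_upload_cog` across a whole collection, each source key needs to be mapped to its destination in the `-reprocessed` collection. A minimal sketch of that mapping (the `<collection>/<rest-of-key>` layout and the example key below are assumptions for illustration, not the actual bucket layout):

```python
def reprocessed_key(src_key: str, suffix: str = "-reprocessed") -> str:
    """Map an original-collection object key to its reprocessed-collection key.

    Assumes keys are laid out as <collection>/<rest-of-key> and that the
    reprocessed collection is named <collection><suffix>.
    """
    collection, sep, rest = src_key.partition("/")
    return f"{collection}{suffix}{sep}{rest}"


# Hypothetical key, for illustration only:
print(reprocessed_key("hls-l30-002-ej/some-item/B04.tif"))
# → hls-l30-002-ej-reprocessed/some-item/B04.tif
```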
Update: It looks like I reprocessed without the last optimization we tested (blocksize); see Slack. This is what should have been run. I may carefully reprocess (and re-verify) these after hours and will update this issue with any changes.
tl;dr: I investigated a blocksize change but ultimately did not change the format (a second time) for the assets in the HLS reprocessed collections. I re-ran the reprocessing with blocksize and overview blocksize 512 for two items, compared the network response timing against items processed with blocksize 256 and overview blocksize 128, and saw no significant improvement. So I reprocessed the two test items to match the blocksize and overview settings of all of the other items in the reprocessed collections. All items in both of the HLS reprocessed collections now have this configuration and output profile (processing matches the snippet provided in the earlier comment).
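For intuition on the blocksize comparison: assuming the standard 3660 × 3660 pixel HLS tile (an assumption here; actual raster dimensions may vary), a larger blocksize mainly trades fewer internal tiles per request against more bytes per tile, which can net out to little latency difference:

```python
import math


def tiles_for(width: int, height: int, blocksize: int) -> int:
    """Count of internal COG tiles covering a raster at a given blocksize."""
    return math.ceil(width / blocksize) * math.ceil(height / blocksize)


# Full-resolution tile counts for the two tested configurations:
print(tiles_for(3660, 3660, 256))  # 15 x 15 = 225 tiles
print(tiles_for(3660, 3660, 512))  # 8 x 8 = 64 tiles
```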
Epic
#89
Description
Update the CRS info in the UAH-hosted COG assets of the HLS EJ subset collections to improve loading time.
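The loading-time improvement comes from writing an explicit datum into each asset's PROJ4 CRS string so readers don't have to resolve it. A minimal sketch of the string change (the UTM PROJ4 string below is a hypothetical example, not copied from an actual asset):

```python
def add_wgs84_datum(proj4: str) -> str:
    """Append +datum=WGS84 to a PROJ4 string that lacks an explicit datum."""
    if "+datum=" in proj4:
        return proj4
    return proj4.strip() + " +datum=WGS84"


# Hypothetical HLS-style UTM CRS without an explicit datum:
src_proj4 = "+proj=utm +zone=15 +ellps=WGS84 +units=m +no_defs"
print(add_wgs84_datum(src_proj4))
# → +proj=utm +zone=15 +ellps=WGS84 +units=m +no_defs +datum=WGS84
```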
Collections
hls-l30-002-ej
hls-s30-002-ej
Migrate to MCP datastore
Migrating to MCP is not a requirement for this task, but if the newly produced assets are published to the MCP datastore, it allows us to preserve the 'original' COG assets in UAH during the rollout of this CRS update.
Resources
One-off ingestion script for HLS events collections in issue #146
hls_hdf_to_cog + datum=WGS84
Acceptance Criteria:
Checklist: