Secure snapshots S3 bucket behind CloudFront #1180
Comments
TIL multipart uploads to S3 are atomic: S3 doesn’t assemble the parts into a GETable object until all parts are uploaded (see https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html). Single-operation PUTs are also atomic (see https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel). So we actually don’t need to worry about item 1 (make writes atomic)!
Mr0grog
added a commit
that referenced
this issue
Jan 31, 2023
This adds a CloudFront distribution to serve data from our "data snapshots" S3 bucket at `https://archives.getmyvax.org`. The goal here is mainly to prevent people from potentially driving up S3 costs by making requests against the bucket that we can't control or cache. This is a first step for #1180.
Mr0grog
added a commit
that referenced
this issue
Feb 1, 2023
This adds a CloudFront distribution to serve data from our "data snapshots" S3 bucket at `https://archives.getmyvax.org`. The goal here is mainly to prevent people from potentially driving up S3 costs by making requests against the bucket that we can't control or cache. This is a first step for #1180.
Mr0grog
added a commit
to usdigitalresponse/appointment-data-insights
that referenced
this issue
Feb 1, 2023
UNIVAF historical data is now available from `archives.getmyvax.org`; the S3 bucket will no longer be publicly accessible. See usdigitalresponse/univaf#1180.
Mr0grog
added a commit
that referenced
this issue
Feb 1, 2023
The data in the data snapshots S3 bucket is now available via CloudFront at `https://archives.getmyvax.org`, so we are ready to revoke public read access to the bucket. Fixes #1180.
Since requests made directly to S3 buckets can get expensive quickly, we should put our “data snapshots” S3 bucket behind CloudFront and disable public access. (This recommendation comes from @TylerHendrickson’s very helpful review of our AWS configuration.)
1. Make writes to S3 atomic. Since writes can be slow (e.g. each `availability_log` file takes just under 20 minutes to write), there’s a reasonable concern that CloudFront could cache a partially written file. Write to a key with a prefix, then move the data to a key without the prefix (e.g. write to `/in_progress/availability_log/availability_log-2022-12-12.ndjson.gz`, then move to `/availability_log/availability_log-2022-12-12.ndjson.gz`). See https://github.com/usdigitalresponse/univaf/blob/main/server/scripts/availability_dump.js (Update: it turns out streamed, multipart uploads are already atomic, so we don’t have to do anything here.)
2. Add a CloudFront distribution that reads from the bucket. It’ll need a domain name, so we should probably use `archives.getmyvax.org` or `snapshots.getmyvax.org`.
3. Update code for loading data in https://github.com/usdigitalresponse/appointment-data-insights. (PR: Download UNIVAF data from new public URL appointment-data-insights#4)
4. Make the bucket itself private instead of public.
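
For anyone updating data-loading code, here is a rough sketch of how a snapshot URL might be built against the new domain. It assumes that CloudFront serves each object at the same path as its bucket key (as in the `availability_log` example path in this issue) and that `archives.getmyvax.org` is the domain chosen. The `snapshot_url` helper and `ARCHIVE_BASE_URL` constant are hypothetical names for illustration, not code from either repository.

```python
from datetime import date

# Assumption: the CloudFront distribution mirrors the bucket's key layout,
# so a key like /availability_log/availability_log-2022-12-12.ndjson.gz
# is served at the same path under this domain.
ARCHIVE_BASE_URL = "https://archives.getmyvax.org"


def snapshot_url(data_type: str, day: date) -> str:
    """Build the public URL for one day's snapshot file (hypothetical helper)."""
    file_name = f"{data_type}-{day.isoformat()}.ndjson.gz"
    return f"{ARCHIVE_BASE_URL}/{data_type}/{file_name}"


if __name__ == "__main__":
    print(snapshot_url("availability_log", date(2022, 12, 12)))
    # prints https://archives.getmyvax.org/availability_log/availability_log-2022-12-12.ndjson.gz
```

Keeping the base URL in one constant means a later change of domain (e.g. to `snapshots.getmyvax.org`) would only touch a single line.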