add checksum timestamp changes #371

Merged (4 commits) on Nov 13, 2023
5 changes: 4 additions & 1 deletion nmdc_runtime/api/core/util.py
@@ -28,10 +28,13 @@ def hash_from_str(s: str, algo="sha256") -> str:
     return getattr(hashlib, algo)(s.encode("utf-8")).hexdigest()


-def sha256hash_from_file(file_path: str):
+def sha256hash_from_file(file_path: str, timestamp: str):
     # https://stackoverflow.com/a/55542529
     h = hashlib.sha256()

+    timestamp_bytes = timestamp.encode('utf-8')
+    h.update(timestamp_bytes)
+
     with open(file_path, "rb") as file:
         while True:
             # Reading is buffered, so we can read smaller chunks.
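
A minimal, self-contained sketch of what this hunk does, not the repository's code: the timestamp bytes are fed into the SHA-256 state before the file bytes. The function name `sha256_with_timestamp`, the `chunk_size` parameter, the temporary file, and the example timestamps are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def sha256_with_timestamp(file_path: str, timestamp: str, chunk_size: int = 65536) -> str:
    """Hash the UTF-8 timestamp first, then the file bytes, in one SHA-256 pass."""
    h = hashlib.sha256()
    h.update(timestamp.encode("utf-8"))
    with open(file_path, "rb") as f:
        # Read fixed-size chunks until read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Demo on a throwaway file (hypothetical content and timestamps).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"same content")
    demo_path = tmp.name

digest_a = sha256_with_timestamp(demo_path, "2023-11-10T09:30-08:00")
digest_b = sha256_with_timestamp(demo_path, "2023-11-10T09:31-08:00")
plain = hashlib.sha256(b"same content").hexdigest()
os.remove(demo_path)

print(digest_a != digest_b)  # True: same bytes, different timestamps
print(digest_a != plain)     # True: no longer the plain content hash
```

One consequence worth keeping in mind: a checksum computed this way is not the SHA-256 of the file content alone, so it will not match the output of a tool such as `sha256sum` run on the same file.
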
1 change: 1 addition & 0 deletions nmdc_runtime/api/endpoints/util.py
@@ -442,6 +442,7 @@ def persist_content_and_get_drs_object(
                     ),
                     "access_methods": [{"access_id": drs_id}],
                 },
+                timestamp= datetime.now(tz=ZoneInfo('America/Los_Angeles')).isoformat(timespec='minutes')

Review comment (Collaborator):

This change makes sense to me.

A string is being generated that indicates the current time (in Los Angeles, CA) down to the minute.

Docs for datetime.isoformat() say it returns a string representation of the time in ISO 8601 format.
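
To illustrate the string format described here, a small sketch that uses fixed datetimes instead of `datetime.now()` so the result is reproducible. The helper name `minute_stamp`, the sample dates, and the fixed-offset fallback for systems without time zone data are assumptions, not part of the PR.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    LOS_ANGELES = ZoneInfo("America/Los_Angeles")
except ZoneInfoNotFoundError:
    # Assumption: fall back to a fixed UTC-8 offset (valid for the
    # November sample dates below) when no tz database is installed.
    LOS_ANGELES = timezone(timedelta(hours=-8))


def minute_stamp(moment: datetime) -> str:
    """ISO 8601 string truncated to minutes, with the UTC offset appended."""
    return moment.isoformat(timespec="minutes")


early = datetime(2023, 11, 10, 9, 30, 1, tzinfo=LOS_ANGELES)
late = datetime(2023, 11, 10, 9, 30, 59, tzinfo=LOS_ANGELES)

print(minute_stamp(early))                        # 2023-11-10T09:30-08:00
print(minute_stamp(early) == minute_stamp(late))  # True
```

Because seconds are dropped, two calls within the same minute produce the same string, so the timestamp only distinguishes uploads that happen in different minutes.
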

@eecavanna (Collaborator), Nov 10, 2023:

I recommend removing the space after the equals sign, since this is "keyword argument setting" as opposed to "variable assignment" (I think that's a Python convention).

- timestamp= datetime.now(...
+ timestamp=datetime.now(...

             )
         )
         self_uri = f"drs://{HOSTNAME_EXTERNAL}/{drs_id}"
4 changes: 2 additions & 2 deletions nmdc_runtime/util.py
@@ -82,7 +82,7 @@ def put_object(filepath, url, mime_type=None):
         return requests.put(url, data=f, headers={"Content-Type": mime_type})


-def drs_metadata_for(filepath, base=None):
+def drs_metadata_for(filepath, base=None, timestamp=None):
     """given file path, get drs metadata

     required: size, created_time, and at least one checksum.
@@ -96,7 +96,7 @@ def drs_metadata_for(filepath, base=None):
         )
     if "checksums" not in base:
         base["checksums"] = [
-            {"type": "sha256", "checksum": sha256hash_from_file(filepath)}
+            {"type": "sha256", "checksum": sha256hash_from_file(filepath, timestamp)}
         ]
     if "mime_type" not in base:
         base["mime_type"] = mimetypes.guess_type(filepath)[0]
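
One interaction between the hunks may deserve a second look: `drs_metadata_for` defaults `timestamp` to `None` and forwards it to `sha256hash_from_file`, whose visible body calls `timestamp.encode('utf-8')`. Calling `.encode` on `None` raises `AttributeError`, so unless code outside the shown hunks handles it, a caller that omits `timestamp` would fail. The sketch below shows one possible guard. It is an assumption about how this could be handled, not what the PR does, and `checksum_entry` is a hypothetical helper that hashes in-memory bytes rather than a file.

```python
import hashlib
from typing import Optional


def checksum_entry(content: bytes, timestamp: Optional[str] = None) -> dict:
    """Build a DRS-style checksum entry, mixing in the timestamp only if given."""
    h = hashlib.sha256()
    if timestamp is not None:
        # Only encode when a timestamp was actually supplied.
        h.update(timestamp.encode("utf-8"))
    h.update(content)
    return {"type": "sha256", "checksum": h.hexdigest()}


without_ts = checksum_entry(b"payload")
with_ts = checksum_entry(b"payload", "2023-11-10T09:30-08:00")

print(without_ts["checksum"] == hashlib.sha256(b"payload").hexdigest())  # True
print(without_ts["checksum"] != with_ts["checksum"])                     # True
```

With this guard, omitting the timestamp falls back to the plain content hash, which keeps the pre-PR behavior for existing callers.
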