improve portability of reproducible tarballs by replacing external `tar` command with `tarfile` module #4660
base: 5.0.x
…o filetools.reproducible_archive_cmd
73b39e2 to d0a55ba
`--date` argument for `touch` command used in reproducible tarballs
…eate archives of git repos
…_archive addition
@boegel This one is ready. As discussed, archives will be made with
…syBuild version 6.0
    # cleanup (repo_name dir does not exist in dry run mode)
    remove(tmpdir)

    return archive_path


def make_archive(dir_path, archive_name=None, archive_dir=None, reproducible=False):
I would rename this to `make_tar_xz`, since we're creating a very specific archive (a tarball), compressed with a hardcoded compression algorithm (XZ).
Or maybe even make_reproducible_tar_xz
    if isinstance(filename, dict):
        filename = filename['filename']
        chksum_input = filename['filename']
@lexming Should we be more careful here, and also use `filename.get('filename', None)` to avoid crashing if the `filename` key is not set?
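A minimal sketch of the defensive lookup suggested here. The helper name `resolve_filename` and the sample inputs are hypothetical and only serve as illustration; they are not part of EasyBuild.

```python
def resolve_filename(source):
    """Return the filename from a plain string or a dict-style source spec.

    Hypothetical helper for illustration only, not EasyBuild's API.
    """
    if isinstance(source, dict):
        # .get() returns None instead of raising KeyError when the key is absent
        return source.get('filename', None)
    return source


plain = resolve_filename('example.tar.xz')
from_dict = resolve_filename({'filename': 'example'})
missing = resolve_filename({'git_config': {}})
```

With this shape the caller still has to decide what a `None` result means, for example by raising a clear error instead of crashing with a `KeyError`.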
# checksums of tarballs made by EB of git repos cannot be reliably checked prior to Python 3.9
if chksum_input_git is not None:
    self.log.deprecated(
        "Reproducible tarballs of git repos are only supported in Python 3.9+. "
- "Reproducible tarballs of git repos are only supported in Python 3.9+. "
+ "Reproducible tarballs of Git repos are only possible when using Python 3.9+ to run EasyBuild. "
    raise EasyBuildError("git_config currently only supports filename ending in .tar.gz")
file_ext = find_extension(filename, required=False)
if file_ext:
    print_warning(f"Ignoring extension of filename '{filename}' set in git_config parameter")
Isn't this going to require cleaning up a whole bunch of easyconfigs?
Rather than printing a warning, we should enforce the use of `.tar.xz` as extension here (to "ease" the transition, in some way)
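One way the enforcement proposed here could look, as a sketch only. The function name `enforce_tar_xz`, the list of rejected extensions, and the choice to append `.tar.xz` to extension-less names are all assumptions made for illustration; the actual behaviour is up to the PR.

```python
REQUIRED_EXTENSION = ".tar.xz"
# illustrative subset of other archive extensions (assumption)
OTHER_ARCHIVE_EXTENSIONS = (".tar.gz", ".tgz", ".tar.bz2", ".zip")


def enforce_tar_xz(filename):
    """Return a filename ending in .tar.xz, rejecting other archive extensions.

    Hypothetical helper for illustration only, not EasyBuild's API.
    """
    if filename.endswith(REQUIRED_EXTENSION):
        return filename
    for ext in OTHER_ARCHIVE_EXTENSIONS:
        if filename.endswith(ext):
            # fail loudly instead of silently ignoring the extension
            raise ValueError(f"Unsupported extension '{ext}' in '{filename}', use {REQUIRED_EXTENSION}")
    # no archive extension given: append the required one
    return filename + REQUIRED_EXTENSION
```

Failing hard makes the required easyconfig cleanup visible immediately, at the cost of breaking easyconfigs that still specify `.tar.gz`.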
a reproducible archive. Other formats like .gz are not reproducible due to
arbitrary strings and timestamps getting added into their metadata.
Hmm, this is new to me. On what source is this based? Do we have a reference we can point to?
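One concrete piece of metadata that can be checked is the gzip header: the gzip file format (RFC 1952) defines an MTIME field and an optional original file name. The sketch below only illustrates that header field using Python's standard library; it is not the reference asked for, and the helper name `gzip_bytes` is made up for this example.

```python
import gzip
import io
import lzma


def gzip_bytes(data, mtime):
    """Compress data into an in-memory gzip stream with the given header timestamp."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as fh:
        fh.write(data)
    return buf.getvalue()


payload = b"same payload\n" * 100

# same payload, different header timestamps
gz_first = gzip_bytes(payload, mtime=1)
gz_second = gzip_bytes(payload, mtime=2)
# same payload, same header timestamp as gz_first
gz_pinned = gzip_bytes(payload, mtime=1)

# bytes 4..7 of a gzip stream hold the MTIME field (little-endian)
header_mtime = int.from_bytes(gz_first[4:8], "little")

# XZ output for the same payload, compressed twice in the same environment
xz_first = lzma.compress(payload)
xz_second = lzma.compress(payload)
```

In this sketch the two gzip streams differ only because of the timestamp, and pinning `mtime` makes them identical again, so the wording of the docstring may deserve a closer look.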
# reset file permissions by applying go+u,go-w
user_mode = tarinfo.mode & stat.S_IRWXU
group_mode = (user_mode >> 3) & ~stat.S_IWGRP  # user mode without write
other_mode = group_mode >> 3  # user mode without write
- other_mode = group_mode >> 3  # user mode without write
+ other_mode = group_mode >> 3  # same as group mode
tarinfo.mode = (tarinfo.mode & ~0o77) | group_mode | other_mode
# reset ownership numeric UID/GID 0
tarinfo.uid = tarinfo.gid = 0
tarinfo.uname = tarinfo.gname = ""
Hmm, why does empty user/group name make sense, what does this imply?
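To make the effect of the quoted lines easier to follow, here is a self-contained sketch that applies the same bit operations to a `tarfile.TarInfo` object. The function name `normalize_tarinfo` and the sample values are assumptions for illustration. One possible reading of the empty names, not confirmed in this thread, is that the archive then records only the numeric IDs and no names from the machine that created it.

```python
import stat
import tarfile


def normalize_tarinfo(tarinfo):
    """Reset permissions and ownership of one archive member (illustrative sketch)."""
    # keep the user bits, copy them to group and other, then drop group/other write
    user_mode = tarinfo.mode & stat.S_IRWXU
    group_mode = (user_mode >> 3) & ~stat.S_IWGRP
    other_mode = group_mode >> 3
    tarinfo.mode = (tarinfo.mode & ~0o77) | group_mode | other_mode
    # numeric owner 0 and no owner names
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = ""
    return tarinfo


private_file = tarfile.TarInfo(name="private.txt")
private_file.mode = 0o600
private_file.uid, private_file.uname = 1000, "alice"
normalize_tarinfo(private_file)

script = tarfile.TarInfo(name="run.sh")
script.mode = 0o750
normalize_tarinfo(script)
```

Here `0o600` becomes `0o644` and `0o750` becomes `0o755`, so members end up with the same modes regardless of the umask under which the sources were checked out.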
    return None

if sys.version_info[0] >= 3 and sys.version_info[1] < 9:
    # ignore any checksum for given filename due to changes in python/cpython#90021
- # ignore any checksum for given filename due to changes in python/cpython#90021
+ # ignore any checksum for given filename due to changes in https://github.com/python/cpython/issues/90021
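As a side note on the version check in the quoted code, comparing major and minor separately would also match a hypothetical major version 4 with a minor below 9. A tuple comparison avoids that; the function name below is made up for this sketch and is not part of the PR.

```python
import sys


def supports_reproducible_git_tarballs(version_info=sys.version_info):
    """Return True when the given Python version is 3.9 or newer (illustrative sketch)."""
    # tuples compare element by element, so (3, 8) < (3, 9) <= (3, 12) < (4, 0)
    return tuple(version_info[:2]) >= (3, 9)


old_python = supports_reproducible_git_tarballs((3, 8, 10))
new_python = supports_reproducible_git_tarballs((3, 9, 0))
future_python = supports_reproducible_git_tarballs((4, 0, 0))
```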
    return archive_path

# TODO: replace with TarFile.add(recursive=True) when support for Python 3.6 drops
# since Python v3.7 tarfile automatically orders the list of files added to the archive
Do we have a reference for this?
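For context, the Python documentation of `TarFile.add` notes that since version 3.7 recursion adds entries in sorted order. The sketch below shows the explicit alternative that also works on older versions: collect all paths, sort them, and add each one with `recursive=False`. The function name `make_sorted_tar_xz` and the `reset` filter are assumptions for illustration, not the implementation in this PR.

```python
import os
import tarfile
import tempfile


def make_sorted_tar_xz(dir_path, archive_path):
    """Create a .tar.xz of dir_path with sorted members and fixed metadata (sketch)."""
    def reset(tarinfo):
        # drop metadata that varies between machines and runs
        tarinfo.mtime = 0
        tarinfo.uid = tarinfo.gid = 0
        tarinfo.uname = tarinfo.gname = ""
        return tarinfo

    paths = []
    for root, dirs, files in os.walk(dir_path):
        for name in dirs + files:
            paths.append(os.path.join(root, name))

    with tarfile.open(archive_path, "w:xz") as archive:
        # explicit sorting makes the member order independent of the filesystem
        for path in sorted(paths):
            arcname = os.path.relpath(path, dir_path)
            archive.add(path, arcname=arcname, recursive=False, filter=reset)
    return archive_path


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("b.txt", "a.txt", os.path.join("sub", "c.txt")):
        with open(os.path.join(src, rel), "w") as fh:
            fh.write(rel)

    # build the same source directory twice and compare the raw bytes
    with open(make_sorted_tar_xz(src, os.path.join(tmp, "one.tar.xz")), "rb") as fh:
        first = fh.read()
    with open(make_sorted_tar_xz(src, os.path.join(tmp, "two.tar.xz")), "rb") as fh:
        second = fh.read()

    with tarfile.open(os.path.join(tmp, "one.tar.xz")) as archive:
        names = archive.getnames()

identical = first == second
```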
    archive.add(filepath, arcname=file_name, recursive=False, filter=archive_filter)
    _log.debug("File/folder added to archive '%s': %s", archive_filename, filepath)

_log.info("Archive '%s' created successfully", archive_filename)
- _log.info("Archive '%s' created successfully", archive_filename)
+ _log.info(f"Archive '{archive_path}' created successfully")
@lexming We need to make sure that
fixes #4657

- use more portable `--date` argument for `touch`
- catch failed commands inside the pipeline
- move generation of command to make reproducible archives into its own method
- replace hardcoded pattern in tests of reproducible archives command for call to `filetools.reproducible_archive_cmd`
- … `reproducible_archive_cmd` using the `tarfile` module
- … `filetools.make_archive()` method and related unit test
- … `filename` argument in `filetools.get_source_tarball_from_git()` to expect filenames without extension
- … `make_archive()`
- … `required` argument to `filetools.find_extensions()`