
add support for checking checksum of source files #214

Closed
boegel opened this issue Aug 29, 2012 · 6 comments

Comments

@boegel
Member

boegel commented Aug 29, 2012

Make the checksum part of the dumped easyconfig file in the repository, e.g.:

source_checksum=('sha1','2fd4e1c6 7a2d28fc ed849ee1 bb76e739 1b93eb12')

or

source_checksum='sha1:2fd4e1c6 7a2d28fc ed849ee1 bb76e739 1b93eb12'
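
To make the two proposed notations concrete, here is a minimal sketch of how either form could be normalised to an (algorithm, hex digest) pair. The helper name `parse_source_checksum` is hypothetical and not part of EasyBuild; it also assumes the spaces in the digest are only there for readability.

```python
def parse_source_checksum(value):
    """Return (algorithm, hex_digest) for either proposed notation.

    Hypothetical helper for illustration only; not an EasyBuild function.
    """
    if isinstance(value, tuple):
        # tuple notation: ('sha1', '<digest>')
        algorithm, digest = value
    else:
        # string notation: 'sha1:<digest>'
        algorithm, digest = value.split(':', 1)
    # assumption: whitespace inside the digest is purely cosmetic
    return algorithm.strip().lower(), ''.join(digest.split()).lower()


print(parse_source_checksum(('sha1', '2fd4e1c6 7a2d28fc ed849ee1 bb76e739 1b93eb12')))
print(parse_source_checksum('sha1:2fd4e1c6 7a2d28fc ed849ee1 bb76e739 1b93eb12'))
```

Both calls yield the same pair, so the choice between the two notations is mostly a matter of readability in the easyconfig file.
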
@fgeorgatos
Collaborator

Hi,

some more notes on this thread:

  1. Make the code compatible across Python versions, as described in https://code.djangoproject.com/ticket/7919 (it covers the md5 case; sha1 should be similar).

  2. Prefer to write a variable "_source_checksum" (i.e. a private, unchecked variable) to the dumped easyconfig file in the repository. This makes it easy both to generate "production" easyconfigs via a trivial sed expression and to rewire it with some Python logic for testing, along the lines of --amend "source_checksum=_source_checksum".

  3. Consider the situation where multiple files constitute the "source" (gcc is a good example), so multiple hashes (or sets of hashes) are needed. Of course, there is a limit to how far one can go (e.g. think of patch files). So I would basically hash anything that comes from external sites, given that git takes care of the integrity of the rest.

  4. I would leave the choice of hash to the user, mainly between md5 and sha1 from hashlib, while perhaps also permitting adler32 or crc32 (available from the zlib library) and maybe "size". For generality, I think it is best to call "hashlib.function" (where e.g. function=sha224) and treat md5, sha1, adler32, crc32 and size as special exceptions. To wrap up: in this issue we should focus on something easy like md5/sha1 and leave the big bang for later... (let everyone scratch his own itches!)

Fotis
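
As an illustration of point (4), the following is a rough sketch of a checksum function with a user-selectable type. The name `compute_checksum` and its parameters are hypothetical, not EasyBuild's API. It uses `hashlib.new(name)` for every hashlib algorithm rather than looking up `hashlib.<function>` directly, and special-cases only adler32, crc32 (both from zlib) and "size".

```python
import hashlib
import os
import zlib


def compute_checksum(path, checksum_type='sha1', blocksize=64 * 1024):
    """Return the checksum of the file at `path` as a string.

    Hypothetical sketch: 'size', 'adler32' and 'crc32' are handled
    explicitly, anything else is passed to hashlib.new().
    """
    if checksum_type == 'size':
        return str(os.path.getsize(path))

    if checksum_type in ('adler32', 'crc32'):
        update = getattr(zlib, checksum_type)
        # zlib's documented starting values: 1 for adler32, 0 for crc32
        value = 1 if checksum_type == 'adler32' else 0
        with open(path, 'rb') as handle:
            for block in iter(lambda: handle.read(blocksize), b''):
                value = update(block, value)
        # mask to an unsigned 32-bit value for a stable hex representation
        return '%08x' % (value & 0xffffffff)

    # raises ValueError for algorithm names hashlib does not know
    hasher = hashlib.new(checksum_type)
    with open(path, 'rb') as handle:
        for block in iter(lambda: handle.read(blocksize), b''):
            hasher.update(block)
    return hasher.hexdigest()
```

Reading the file in blocks keeps memory use flat for large source tarballs; a call such as `compute_checksum('gcc.tar.gz', 'sha224')` (file name made up) would then work without any extra special-casing.
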


@boegel
Member Author

boegel commented Sep 7, 2012

@fgeorgatos: W.r.t. (2), I don't see the advantage of storing the checksum in a private variable _source_checksum. What's the benefit compared to just using source_checksum? How is it different from buildstats?

W.r.t. (3): we'll need a map from file name to checksum instead of a simple list, to make sure we match the correct checksum with the correct file, e.g.:

source_checksums = {'file1.tar.gz': "<checksum>", 'file2.patch': "<checksum2>" }

I fully agree with (4), i.e. leaving the checksum choice open, with a sensible default (sha or md5 makes sense).
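
A minimal sketch of how such a file-name-to-checksum map could be checked, assuming all files live in one directory and share one checksum type. The function name `verify_source_checksums` and its parameters are hypothetical, not an existing EasyBuild function.

```python
import hashlib
import os


def verify_source_checksums(source_dir, source_checksums, checksum_type='sha1'):
    """Return the sorted names of files that are missing or do not match.

    Hypothetical sketch: `source_checksums` maps file name -> hex digest.
    """
    failed = []
    for filename, expected in sorted(source_checksums.items()):
        path = os.path.join(source_dir, filename)
        hasher = hashlib.new(checksum_type)
        try:
            with open(path, 'rb') as handle:
                for block in iter(lambda: handle.read(65536), b''):
                    hasher.update(block)
        except OSError:
            # an unreadable or missing file counts as a failure
            failed.append(filename)
            continue
        if hasher.hexdigest() != expected.lower():
            failed.append(filename)
    return failed
```

Keying on the file name means the order of sources and patches no longer matters, which is the point of using a map instead of a simple list.
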

@fgeorgatos
Collaborator

After seeing the following, checksum verification becomes a must-have for sysadmins who are fanatical about reproducibility:
http://developers.slashdot.org/story/12/09/26/1422218/malicious-phpmyadmin-served-from-sourceforge-mirror
(Let's claim that routine EasyBuild users easily fit that description!)

This is just FYI; I am not proposing a change in priorities: #99 is better to have for 1.0, while this can be done any time thereafter.

@itkovian
Contributor

itkovian commented Feb 5, 2013

If you are going to set up a source repo for things that have been installed in the past and which you need to retain for reproducibility purposes, you should have some way to verify that these source tarballs are indeed what you are expecting and they have not been tampered with.

@fgeorgatos
Collaborator

I am currently testing zsync, which is basically rsync with a stateless server side (over HTTP/1.1). If we store the zsync files with the hashes under git, we get a very robust, low-complexity solution without needing to trust the mirrors. So, let's see how it goes in practice. F.

@boegel
Member Author

boegel commented Oct 24, 2014

Checking checksums is supported since EasyBuild v1.10.1 (see #774, #777, #779, #801, #802).

@boegel closed this as completed Oct 24, 2014