nix-prefetch-git: fix determinism with leaveDotGit #4767
714fe22 to 58f294d
👍
58f294d to 26905cb
Oh, one more possible source of non-determinism: origin/HEAD. Will do some testing...
Don't merge yet...
a9557f5 to 81c9513
Seems fine now (fully deterministic). Unfortunately, removing the index (already done in nixpkgs master) makes the tree "dirty" (as reported by "git describe --dirty").
81c9513 to 30f938b
That's great! Happy to see I was wrong in #4752. As for the index, the non-determinism comes from timestamps. I started studying the binary format of the index (a fixed sequence that is repeated, and a checksum at the end), but it didn't seem worth going in that direction. Another option might be to run git with a fake time, but I don't know if that's doable.
Add more files to the delete list:
* .git/FETCH_HEAD
* .git/ORIG_HEAD
* .git/refs/remotes/origin/HEAD
* .git/config

Further, remove all remote branches, remove tags not reachable from the given 'rev', do a full repack and then garbage collect unreferenced objects.

According to my testing, the result is fully deterministic. As in "any change done to the upstream repo, ahead of 'rev', will not affect the hash of the resulting 'clone'". Even changing the clone URL will not change the output hash, because .git/config is removed.

A new version of git can of course change store format, but that's unavoidable.

For big repositories, the repack operation may be a bit heavy. But as far as I can see there is no cheaper way to determinism.
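The steps in this commit message can be sketched roughly as follows. This is an illustration only, not the code in nix-prefetch-git (which is a shell script): the function names, the use of `git update-ref` and `git merge-base --is-ancestor`, and the ordering of the steps are all assumptions made for the sketch. `.git/index` is included in the list because the thread says it was already being removed.

```python
import os
import subprocess

# Files named in the commit message, plus the index (removed earlier per the thread).
DELETE_LIST = (
    ".git/index",
    ".git/FETCH_HEAD",
    ".git/ORIG_HEAD",
    ".git/refs/remotes/origin/HEAD",
    ".git/config",
)


def remove_nondeterministic_files(repo):
    """Delete the listed files if present; return the relative paths removed."""
    removed = []
    for rel in DELETE_LIST:
        path = os.path.join(repo, rel)
        if os.path.lexists(path):
            os.remove(path)
            removed.append(rel)
    return removed


def git(repo, *args):
    """Run git inside `repo` and return its stdout (hypothetical helper)."""
    result = subprocess.run(["git", "-C", repo, *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def list_refs(repo, prefix):
    return git(repo, "for-each-ref", "--format=%(refname)", prefix).split()


def is_reachable(repo, ref, rev):
    """True if the commit behind `ref` is an ancestor of `rev`.

    Any non-zero exit status (including git errors) counts as unreachable here.
    """
    proc = subprocess.run(["git", "-C", repo, "merge-base", "--is-ancestor",
                           ref + "^{commit}", rev])
    return proc.returncode == 0


def make_deterministic(repo, rev):
    # Remove all remote-tracking branches (symbolic refs are deleted as-is).
    for ref in list_refs(repo, "refs/remotes"):
        git(repo, "update-ref", "--no-deref", "-d", ref)
    # Remove tags that are not reachable from rev.
    for ref in list_refs(repo, "refs/tags"):
        if not is_reachable(repo, ref, rev):
            git(repo, "update-ref", "-d", ref)
    # Full single-threaded repack, then garbage collect unreferenced objects.
    git(repo, "-c", "pack.threads=1", "repack", "-A", "-d", "-f")
    git(repo, "-c", "pack.threads=1", "gc", "--prune=all")
    # Delete the files last so that no later git command recreates them.
    remove_nondeterministic_files(repo)
```

Calling `make_deterministic("/path/to/clone", rev)` on a fresh clone would run all of the steps; whether the result hashes identically across machines and git versions still has to be checked by testing, as the commit message says.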
30f938b to d94ca33
@madjar: Yes, I was thinking of libfaketime too. I think I'll merge this and then play with fake time.
Pushed to master (53614cf).
@madjar: Using libfaketime fixes the timestamps in the index. But there are more "bad things" (for determinism) stored in the index: dev, ino (inode number), uid and gid are not deterministic. For reference, here is the index format description: https://github.com/git/git/blob/master/Documentation/technical/index-format.txt
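To make the point about the index concrete, here is a minimal sketch that reads the per-entry stat fields from an index file. It assumes a version-2 index as laid out in the format description linked above (12-byte header, then entries starting with ten big-endian 32-bit fields). The function and constant names are made up for this example, and extensions, the trailing checksum, versions 3 and 4, and paths of 4095 bytes or more are not handled.

```python
import struct

# The ten 32-bit fields that open every entry in a version-2 index.
STAT_FIELDS = ("ctime_s", "ctime_ns", "mtime_s", "mtime_ns",
               "dev", "ino", "mode", "uid", "gid", "size")

# Fields that depend on the machine or the moment of checkout, not on content.
HOST_DEPENDENT = ("ctime_s", "ctime_ns", "mtime_s", "mtime_ns",
                  "dev", "ino", "uid", "gid")


def read_index_entries(data):
    """Yield (path, stat_dict) for each entry of a version-2 index."""
    signature, version, count = struct.unpack_from(">4sII", data, 0)
    if signature != b"DIRC" or version != 2:
        raise ValueError("this sketch only handles version-2 index files")
    offset = 12
    for _ in range(count):
        values = struct.unpack_from(">10I", data, offset)
        stat = dict(zip(STAT_FIELDS, values))
        # 40 bytes of stat data, a 20-byte SHA-1, then 16 bits of flags.
        (flags,) = struct.unpack_from(">H", data, offset + 60)
        name_len = flags & 0x0FFF  # low 12 bits hold the path length
        start = offset + 62
        path = data[start:start + name_len].decode("utf-8", "replace")
        yield path, stat
        # Each entry is NUL-padded (1 to 8 bytes) to a multiple of 8 bytes.
        offset += (62 + name_len + 8) & ~7


def host_dependent(stat):
    """Return only the fields that can differ between two identical checkouts."""
    return {name: stat[name] for name in HOST_DEPENDENT}
```

Reading `.git/index` in binary mode from two clones of the same revision and comparing `host_dependent(stat)` for a given path should show where the files differ. libfaketime can pin the timestamp fields, but dev, ino, uid and gid come from the file system and the user running git.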
Okay, then death to the index. There's no use trying to tweak the values, because it will probably result in a dirty index as before (or worse).
Last fix I hope: 96cacf0 ("nix-prefetch-git: run single-threaded 'git repack'").
Version 1.1.11 (2020-03-08) Compatibility notes: When upgrading from borg 1.0.x to 1.1.x, please note: read all the compatibility notes for 1.1.0*, starting from 1.1.0b1. borg upgrade: you do not need to and you also should not run it. borg might ask some security-related questions once after upgrading. You can answer them either manually or via environment variable. One known case is if you use unencrypted repositories, then it will ask about a unknown unencrypted repository one time. your first backup with 1.1.x might be significantly slower (it might completely read, chunk, hash a lot files) - this is due to the --files-cache mode change (and happens every time you change mode). You can avoid the one-time slowdown by using the pre-1.1.0rc4-compatible mode (but that is less safe for detecting changed files than the default). See the --files-cache docs for details. 1.1.11 removes WSL autodetection (Windows 10 Subsystem for Linux). If WSL still has a problem with sync_file_range, you need to set BORG_WORKAROUNDS=basesyncfile in the borg process environment to work around the WSL issue. Fixes: fixed potential index corruption / data loss issue due to bug in hashindex_set, NixOS#4829 Please read and follow the more detailled notes close to the top of this document. upgrade bundled xxhash to 0.7.3, NixOS#4891 0.7.2 is the minimum requirement for correct operations on ARMv6 in non-fixup mode, where unaligned memory accesses cause bus errors. 0.7.3 adds some speedups and libxxhash 0.7.3 even has a pkg-config file now. 
upgrade bundled lz4 to 1.9.2 upgrade bundled zstd to 1.4.4 fix crash when upgrading erroneous hints file, NixOS#4922 extract: fix KeyError for "partial" extraction, NixOS#4607 fix "partial" extract for hardlinked contentless file types, NixOS#4725 fix preloading for old (0.xx) remote servers, NixOS#4652 fix confusing output of borg extract --list --strip-components, NixOS#4934 delete: after double-force delete, warn about necessary repair, NixOS#4704 create: give invalid repo error msg if repo config not found, NixOS#4411 mount: fix FUSE mount missing st_birthtime, NixOS#4763 NixOS#4767 check: do not stumble over invalid item key, NixOS#4845 info: if the archive doesn't exist, print a pretty message, NixOS#4793 SecurityManager.known(): check all files, NixOS#4614 Repository.open: use stat() to check for repo dir, NixOS#4695 Repository.check_can_create_repository: use stat() to check, NixOS#4695 fix invalid archive error message fix optional/non-optional location arg, NixOS#4541 commit-time free space calc: ignore bad compact map entries, NixOS#4796 ignore EACCES (errno 13) when hardlinking the old config, NixOS#4730 --prefix / -P: fix processing, avoid argparse issue, NixOS#4769 New features: enable placeholder usage in all extra archive arguments new BORG_WORKAROUNDS mechanism, basesyncfile, NixOS#4710 recreate: support --timestamp option, NixOS#4745 support platforms without os.link (e.g. Android with Termux), NixOS#4901 if we don't have os.link, we just extract another copy instead of making a hardlink. support linux platforms without sync_file_range (e.g. Android 7 with Termux), NixOS#4905 Other: ignore --stats when given with --dry-run, but continue, NixOS#4373 add some ProgressIndicator msgids to code / fix docs, NixOS#4935 elaborate on "Calculating size" message argparser: always use REPOSITORY in metavar, also use more consistent help phrasing. 
check: improve error output for matching index size, see NixOS#4829 docs: changelog: add advisory about hashindex_set bug NixOS#4829 better describe BORG_SECURITY_DIR, BORG_CACHE_DIR, NixOS#4919 infos about cache security assumptions, NixOS#4900 add FAQ describing difference between a local repo vs. repo on a server. document how to test exclusion patterns without performing an actual backup timestamps in the files cache are now usually ctime, NixOS#4583 fix bad reference to borg compact (does not exist in 1.1), NixOS#4660 create: borg 1.1 is not future any more extract: document limitation "needs empty destination", NixOS#4598 how to supply a passphrase, use crypto devices, NixOS#4549 fix osxfuse github link in installation docs add example of exclude-norecurse rule in help patterns update macOS Brew link add note about software for automating backups, NixOS#4581 AUTHORS: mention copyright+license for bundled msgpack fix various code blocks in the docs, NixOS#4708 updated docs to cover use of temp directory on remote, NixOS#4545 add restore docs, NixOS#4670 add a pull backup / push restore how-to, NixOS#1552 add FAQ how to retain original paths, NixOS#4532 explain difference between --exclude and --pattern, NixOS#4118 add FAQs for SSH connection issues, NixOS#3866 improve password FAQ, NixOS#4591 reiterate that 'file cache names are absolute' in FAQ tests: cope with ANY error when importing pytest into borg.testsuite, NixOS#4652 fix broken test that relied on improper zlib assumptions test_fuse: filter out selinux xattrs, NixOS#4574 travis / vagrant: misc python versions removed / changed (due to openssl 1.1 compatibility) or added (3.7 and 3.8, for better borg compatibility testing) binary building is on python 3.5.9 now vagrant: add new boxes: ubuntu 18.04 and 20.04, debian 10 update boxes: openindiana, darwin, netbsd remove old boxes: centos 6 darwin: updated osxfuse to 3.10.4 use debian/ubuntu pip/virtualenv packages rather use python 3.6.2 than 3.6.0, fixes 
coverage/sqlite3 issue use requirements.d/development.lock.txt to avoid compat issues travis: darwin: backport some install code / order from master remove deprecated keyword "sudo" from travis config allow osx builds to fail, NixOS#4955 this is due to travis-ci frequently being so slow that the OS X builds just fail because they exceed 50 minutes and get killed by travis.
Remove all remote branches, remove tags not reachable from the given
'rev', do a full repack and then garbage collect unreferenced objects.
According to my testing, the result is fully deterministic. As in "any
change done to the upstream repo, ahead of 'rev', will not affect the
output hash". But only time will tell.
A new version of git can of course change store format, but that's
unavoidable.
For big repositories, the repack operation may be a bit heavy. But as
far as I can see there is no cheaper way to determinism.