Switch to batched fsync by default #3492
Conversation
Are the test failures new? Would you like me to investigate?
I'm investigating, please expect an update shortly.
Yes, those test failures are new. Please note that I turned on the default for all platforms, kind of out of curiosity. (The commit message of 64204c8 is a bit misleading there, it was a spur of the moment kind of idea to simply turn it on in
That would be fantastic!
Thank you!
This change is quite significant. Can we plan to switch the default only after the 2.34 release, so we have time to do a thorough performance analysis across lots of environments? I think it would be good to take @neerajsi-msft's changes for 2.34, but off by default for a release cycle.
I agree.
I guess that's a safer approach, but let's see whether we can identify the bugs that prevent us from enabling this globally, anyway. It would be good to flesh this out before v2.34.
I submitted a fix to the git mailing list over here: https://lore.kernel.org/git/[email protected]/ I tested on Ubuntu locally and it passed. It would be great if you can merge the fix and see if the CI passes here. Thanks!
Thank you for this quick turnaround! I applied the patches and pushed the branch. Let's see what CI thinks. About the patch series, I fear that you will need to rephrase the commit messages because the original patch made it into An alternative would be to ask Junio to kick the topic out of I'll reply on the Git mailing list with the same content.
Hmm. @neerajsi-msft could you have a look? I saw a potential crash in |
@dscho: You might need to kill the CI jobs. There is some problem where we're looping and creating tons of files inside the git directory. FYI, the problem doesn't repro in the upstream git when built with the gfw sdk. So there's probably something specific to an interaction with downstream patches.
Thank you!
BTW if you cherry-pick the commit making
Yeah, I did enable batch mode by default. I've narrowed it down to a problem in tmp_objdir_migrate. For some reason we're continuously trying to create the top-level object file path at increasing levels of nesting. I'm pretty close to a root cause.
I have a root cause! In tmp_objdir_create, we do:
This calls
This breaks strbuf_addf as called by migrate_paths. |
Looks like my PR passed. |
Excellent work!
Excellent! Thank you for your hard work, it is much appreciated! I fast-forwarded to your fix. Now, I am considering the following options:
I am currently leaning toward option 2, without the experimental option (essentially because it would be a lot of work for me)... Thoughts? Did I forget any valid option?
I don't like the concept of planned changes to the default value of a config option between RCs. If it's been changed before RC0 and feedback from RC0 shows it's causing trouble, changing it back is fine, but releasing an RC with something we intend not to be in the final release seems to go against the concept of release candidates.
The tmp_objdir API provides the ability to create temporary object directories, but was designed with the goal of having subprocesses access these object stores, followed by the main process migrating objects from it to the main object store or just deleting it. The subprocesses would view it as their primary datastore and write to it.

Here we add the tmp_objdir_replace_primary_odb function that replaces the current process's writable "main" object directory with the specified one. The previous main object directory is restored in either tmp_objdir_migrate or tmp_objdir_destroy.

For the --remerge-diff use case, add a new `will_destroy` flag in `struct object_database` to mark ephemeral object databases that do not require fsync durability.

Add 'git prune' support for removing temporary object databases, and make sure that they have a name starting with tmp_ and containing an operation-specific name.

Based-on-patch-by: Elijah Newren <[email protected]>
Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
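The commit message above describes a swap-and-restore pattern: the writable primary object directory is temporarily replaced, then either migrated back or thrown away. As a rough illustration only (a hypothetical Python analogue, not Git's C implementation; all class and method names here are invented), the lifecycle could be sketched like this:

```python
import os
import shutil
import tempfile

class ObjectStore:
    """Toy stand-in for Git's object database: the first entry in
    `dirs` is the writable "primary" directory."""

    def __init__(self, primary):
        os.makedirs(primary, exist_ok=True)
        self.dirs = [primary]

    def write_object(self, name, data):
        # New objects always go to the current primary directory.
        with open(os.path.join(self.dirs[0], name), "w") as f:
            f.write(data)

class TmpObjdir:
    """Sketch of the replace-primary / migrate / destroy lifecycle."""

    def __init__(self, store):
        self.store = store
        self.path = tempfile.mkdtemp(prefix="tmp_objdir-")
        self.prev_primary = None

    def replace_primary(self):
        # Swap the writable primary for the temporary directory.
        self.prev_primary = self.store.dirs[0]
        self.store.dirs[0] = self.path

    def migrate(self):
        # Move objects into the old primary, then restore it.
        for name in os.listdir(self.path):
            shutil.move(os.path.join(self.path, name),
                        os.path.join(self.prev_primary, name))
        self.store.dirs[0] = self.prev_primary
        os.rmdir(self.path)

    def destroy(self):
        # Ephemeral use (e.g. --remerge-diff): restore and delete.
        self.store.dirs[0] = self.prev_primary
        shutil.rmtree(self.path)
```

The key property is that writes issued while the replacement is active land in the temporary directory, and the previous primary comes back in either the migrate or the destroy path.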
When creating a subprocess with a temporary ODB, we set the GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not to update refs, since the tmp-objdir may go away. Introduce a similar mechanism for in-process temporary ODBs when we call tmp_objdir_replace_primary_odb. Now both mechanisms set the disable_ref_updates flag on the odb, which is queried by the ref_transaction_prepare function.

Note: This change adds an assumption that the state of the_repository is relevant for any ref transaction that might be initiated. Unwinding this assumption should be straightforward by saving the relevant repository to query in the transaction or the ref_store.

Peff's test case was invoking ref updates via the cachetextconv setting. That particular code silently does nothing when a ref update is forbidden. See the call to notes_cache_put in fill_textconv, where errors are ignored.

Reported-by: Jeff King <[email protected]>
Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
* ns/tmp-objdir:
  tmp-objdir: disable ref updates when replacing the primary odb
  tmp-objdir: new API for creating temporary writable databases
Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.

* Rename the 'state' variable to 'bulk_checkin_state', since we will later be adding 'bulk_fsync_state'. This also makes the variable easier to find in the debugger, since the name is more unique.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate static variable. Doing this avoids resetting the variable in finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we seem to unintentionally disable the plugging functionality the first time a new packfile must be created due to packfile size limits. While disabling the plugging state only results in suboptimal behavior for the current code, it would be fatal for the bulk-fsync functionality later in this patch series.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
When adding many objects to a repo with core.fsyncObjectFiles set to true, the cost of fsync'ing each object file can become prohibitive. One major source of the cost of fsync is the implied flush of the hardware writeback cache within the disk drive. Fortunately, Windows and macOS offer mechanisms to write data from the filesystem page cache without initiating a hardware flush. Linux has the sync_file_range API, which issues a pagecache writeback request reliably after version 5.2.

This patch introduces a new 'core.fsyncObjectFiles = batch' option that batches up hardware flushes. It hooks into the bulk-checkin plugging and unplugging functionality and takes advantage of tmp-objdir.

When the new mode is enabled, do the following for each new object:
1. Create the object in a tmp-objdir.
2. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction, when unplugging bulk checkin:
1. Issue an fsync against a dummy file to flush the hardware writeback cache, which should by now have processed the tmp-objdir writes.
2. Rename all of the tmp-objdir files to their final names.
3. When updating the index and/or refs, we assume that Git will issue another fsync internal to that operation. This is not the case today, but may be a good extension to those components.

On a filesystem with a singular journal that is updated during name operations (e.g. create, link, rename, etc.), such as NTFS, HFS+, or XFS, we would expect the fsync to trigger a journal writeout, so that this sequence is enough to ensure that the user's data is durable by the time the git command returns.

This change also updates the macOS code to trigger a real hardware flush via fcntl(fd, F_FULLFSYNC) when fsync_or_die is called. Previously, on macOS there was no guarantee of durability, since a simple fsync(2) call does not flush any hardware caches.

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11. This number is from a patch later in the series.

Adding 500 files to the repo with 'git add'. Times reported in seconds.

core.fsyncObjectFiles | Linux | Mac   | Windows
----------------------|-------|-------|--------
false                 | 0.06  | 0.35  | 0.61
true                  | 1.88  | 11.18 | 2.47
batch                 | 0.15  | 0.41  | 1.53

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
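The two numbered sequences above (write each object with only a writeback request, then one real flush plus renames at unplug time) can be sketched end to end. This is an illustration of the batching idea only, not Git's C implementation: the function name is invented, and a plain `os.fsync` stands in for the platform-specific pagecache-writeback and hardware-flush primitives.

```python
import os
import tempfile

def write_loose_objects_batched(objdir, objects):
    """Sketch of the 'batch' sequence: stage all objects in a
    tmp-objdir, flush the hardware cache once via a dummy file,
    then rename everything to its final name."""
    tmpdir = tempfile.mkdtemp(prefix="tmp_objdir-", dir=objdir)
    tmp_paths = []
    for name, data in objects.items():
        path = os.path.join(tmpdir, name)
        with open(path, "wb") as f:
            f.write(data)
            # In the real mode this would be a pagecache writeback
            # request (e.g. sync_file_range on Linux), not an fsync.
        tmp_paths.append((path, os.path.join(objdir, name)))

    # One full flush for the whole batch, issued against a dummy file.
    dummy = os.path.join(tmpdir, "flush-dummy")
    fd = os.open(dummy, os.O_WRONLY | os.O_CREAT)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    os.unlink(dummy)

    # Make the objects visible under their final names.
    for tmp, final in tmp_paths:
        os.rename(tmp, final)
    os.rmdir(tmpdir)
```

The durability argument is that on journaling filesystems the renames reach the journal, and the single fsync has already pushed the staged data through the hardware cache, so per-object flushes are unnecessary.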
This commit adds a win32 implementation for fsync_no_flush that is called git_fsync. The 'NtFlushBuffersFileEx' function being called is available since Windows 8. If the function is not available, we return -1 and Git falls back to doing a full fsync.

The operating system is told to flush data only, without a hardware flush primitive. A later full fsync will cause the metadata log to be flushed and then the disk cache to be flushed on NTFS and ReFS. Other filesystems will treat this as a full flush operation.

I added a new file here for this system call so as not to conflict with downstream changes in the git-for-windows repository related to fscache.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The update-index functionality is used internally by 'git stash push' to set up the internal stashed commit.

This change enables bulk-checkin for the update-index infrastructure to speed up adding new objects to the object database by leveraging the pack functionality and the new bulk-fsync functionality.

There is some risk with this change, since under batch fsync, the object files will not be available until the update-index is entirely complete. This usage is unlikely, since any tool invoking update-index and expecting to see objects would have to synchronize with the update-index process after passing it a file path.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The unpack-objects functionality is used by fetch, push, and fast-import to turn the transferred data into object database entries when there are fewer objects than the 'unpacklimit' setting. By enabling bulk-checkin when unpacking objects, we can take advantage of batched fsyncs.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add test cases to exercise batch mode for:
* 'git add'
* 'git stash'
* 'git update-index'
* 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper lib-unique-files.sh. The goal of this library is to create a tree of files that have different oids from any other files that may have been created in the current test repo. This helps us avoid missing validation of an object being added due to it already being in the repo.

Signed-off-by: Neeraj Singh <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
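The idea behind lib-unique-files.sh is that each generated file's content must be globally unique, so its blob oid cannot already exist in the test repo. A small sketch of that idea (a hypothetical Python analogue; the actual helper is a shell library, and this function name is invented):

```python
import os
import uuid

def write_unique_files(root, ndirs, nfiles_per_dir):
    """Create a tree of files whose contents each embed a fresh UUID,
    so every file hashes to a blob oid that cannot collide with any
    object already present in the repository under test."""
    paths = []
    for d in range(ndirs):
        dirpath = os.path.join(root, f"dir{d}")
        os.makedirs(dirpath, exist_ok=True)
        for f in range(nfiles_per_dir):
            path = os.path.join(dirpath, f"file{f}")
            with open(path, "w") as fh:
                # Unique content => unique blob oid, so a later check
                # that the object was added cannot be a false positive.
                fh.write(f"{uuid.uuid4()}\n")
            paths.append(path)
    return paths
```

Without this uniqueness guarantee, a test could "pass" merely because an identical blob was created by an earlier test in the same repo.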
For years now, Git for Windows has turned on the mode where loose object files are synchronized to disk immediately. This was done because several users reported problems after Windows crashed: there were object files on disk, in the correct location and with the correct file size, but instead of the zlib-compressed Git objects, the files consisted of NUL bytes.
However, contrary to my initial findings, the performance impact turned out to be quite measurable once users started working with larger monorepos.
Happily, @neerajsi-msft implemented a batched fsync mode. Instead of synchronizing immediately after writing the file, we can now choose to synchronize once, after all the loose object files are written by the current Git process.
This strikes a much better balance between safety and speed than the mode that Git for Windows uses.
So let's integrate the patches implementing that mode, and switch to that mode as the new default.