git_add() slow #242
Comments
Does the overhead grow when adding many files at once, or is it a fixed overhead of 0.1 sec?
There's a substantial cost per file. With 100 files, gert needs about 5.5 seconds:
N <- 100
dir <- withr::local_tempdir()
setwd(dir)
gert::git_init()
for (i in 1:N) {
writeLines(character(), paste0(i, ".txt"))
}
system.time(gert::git_add("."))
#> user system elapsed
#> 0.013 0.047 5.499
nrow(gert::git_status())
#> [1] 100
gert::git_commit(message = "Test")
#> [1] "05e7dd6d0077f78dada74864b44866ae8fb8e976"
dir <- withr::local_tempdir()
setwd(dir)
gert::git_init()
for (i in 1:N) {
writeLines(character(), paste0(i, ".txt"))
}
system.time(system2("git", args = c("add", ".")))
#> user system elapsed
#> 0.003 0.007 0.025
nrow(gert::git_status())
#> [1] 100
gert::git_commit(message = "Test")
#> [1] "05e7dd6d0077f78dada74864b44866ae8fb8e976"
Created on 2024-11-16 with reprex v2.1.1
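To separate a fixed overhead from a per-file cost, the same measurement can be repeated for several file counts. The following is a minimal sketch, not part of the original reprex: `time_add()`, `scaling()`, `add_cli()` and `add_gert()` are hypothetical helper names, and it assumes a `git` executable on the PATH (plus the gert package for `add_gert()`).

```r
# Time one "add everything" call in a fresh repository holding `n` empty files.
# `add_fun` is a zero-argument function that stages the files (hypothetical helper).
time_add <- function(n, add_fun) {
  repo <- tempfile("repo")
  dir.create(repo)
  old_wd <- setwd(repo)
  on.exit(setwd(old_wd), add = TRUE)

  system2("git", c("init", "-q"))
  for (i in seq_len(n)) {
    writeLines(character(), paste0(i, ".txt"))
  }
  system.time(add_fun())[["elapsed"]]
}

# Run the timing for several repository sizes and collect the results.
scaling <- function(sizes, add_fun) {
  data.frame(
    n = sizes,
    elapsed = vapply(sizes, time_add, numeric(1), add_fun = add_fun)
  )
}

# The two ways of staging compared in this thread.
add_cli <- function() system2("git", c("add", "."))
add_gert <- function() gert::git_add(".")  # requires the gert package

# Example (not run here): in a linear fit, the intercept approximates the
# fixed overhead and the slope approximates the cost per file.
# res <- scaling(c(1, 10, 50, 100), add_gert)
# coef(lm(elapsed ~ n, data = res))
```

A slope well above zero for `add_gert` and near zero for `add_cli` would point to a per-file cost inside gert or libgit2 rather than a fixed startup overhead.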
I truly wonder why the commit hash comes out the same in the first example. Does the commit hash use the system time with accuracy to the second only? Same timing (but different commit hashes) if I swap the order of the two runs:
N <- 100
dir <- withr::local_tempdir()
setwd(dir)
gert::git_init()
for (i in 1:N) {
writeLines(character(), paste0(i, ".txt"))
}
system.time(system2("git", args = c("add", ".")))
#> user system elapsed
#> 0.004 0.007 0.039
nrow(gert::git_status())
#> [1] 100
gert::git_commit(message = "Test")
#> [1] "5eb64df18ee4bb3cc5371faf644833a23e04216c"
dir <- withr::local_tempdir()
setwd(dir)
gert::git_init()
for (i in 1:N) {
writeLines(character(), paste0(i, ".txt"))
}
system.time(gert::git_add("."))
#> user system elapsed
#> 0.011 0.043 5.014
nrow(gert::git_status())
#> [1] 100
gert::git_commit(message = "Test")
#> [1] "933c72ecc0ce1bd962fddbeb55b916e94abc1717"
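On the identical commit hashes: a Git commit ID is a hash of the commit object, which records the tree, any parents, the author and committer identity with timestamps at one-second resolution, and the message. Two root commits with the same files, identity, message, and second therefore share a hash, which would be consistent with the results above. The sketch below illustrates this with the `git` command line; `commit_at()` is a hypothetical helper, the identity and timestamps are arbitrary placeholders, and it assumes a `git` executable on the PATH.

```r
# Create a fresh repository with one empty file and commit it with the
# author/committer identity and timestamp pinned via Git's environment
# variables. Returns the commit hash. (Hypothetical helper for illustration.)
commit_at <- function(timestamp) {
  repo <- tempfile("repo")
  dir.create(repo)
  old_wd <- setwd(repo)
  on.exit(setwd(old_wd), add = TRUE)

  # Placeholder identity; note that these variables are cleared again on exit.
  env <- c(
    GIT_AUTHOR_NAME = "Test", GIT_AUTHOR_EMAIL = "test@example.com",
    GIT_COMMITTER_NAME = "Test", GIT_COMMITTER_EMAIL = "test@example.com",
    GIT_AUTHOR_DATE = timestamp, GIT_COMMITTER_DATE = timestamp
  )
  do.call(Sys.setenv, as.list(env))
  on.exit(Sys.unsetenv(names(env)), add = TRUE)

  git <- function(...) system2("git", c(...), stdout = TRUE, stderr = TRUE)
  git("init", "-q")
  writeLines(character(), "1.txt")
  git("add", ".")
  # Disable signing so the commit object depends only on the pinned fields.
  git("-c", "commit.gpgsign=false", "commit", "-q", "-m", "Test")
  git("rev-parse", "HEAD")
}

# Timestamps in Git's internal format: "<unix seconds> <utc offset>".
a <- commit_at("1731758400 +0000")
b <- commit_at("1731758400 +0000")  # same second as `a`
d <- commit_at("1731758401 +0000")  # one second later

identical(a, b)  # expected to be TRUE: every input to the hash matches
identical(a, d)  # expected to be FALSE: only the timestamp differs
```

In the swapped run, the slow `gert::git_add()` call comes between the two commits, so they most likely fall in different seconds and the hashes differ.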
These measurements surprised me:
Running git as an external process is much faster:
git2r seems to have the same problems. What is libgit2 doing there?
The implementation does not seem to use the results of normalizePath(), and 20 of the 100 milliseconds are spent in git_status(), regardless of whether the caller cares about the result. Need to understand why R_git_repository_add is slow.