Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shallow cloning on travis #1372

Merged
merged 1 commit into from
Aug 16, 2021
Merged

Shallow cloning on travis #1372

merged 1 commit into from
Aug 16, 2021

Conversation

chenzhang22
Copy link
Contributor

According to https://docs.travis-ci.com/user/customizing-the-build#git-clone-depth, Travis CI provide a way to shallow clone a repository. This has the obvious benefit of speed, since you only need to download a small number of commits.

@dbwiddis
Copy link
Contributor

Thanks for your suggestion.

A few notes:

  • For context, we've moved all our CI off Travis onto Github Actions with the exception of a single job on ARM hardware. And our long term average Travis PR build rate is on the order of 1 build per day. We're not really optimizing that much.
    • Our builds take a bit over 2 minutes on average
    • Our Github Actions builds typically take longer. We're never waiting on the Travis result
    • This commit shows a potential improvement of about 10 seconds (14.4s to 3.7s)
  • A truly shallow clone (depth 1) is indeed the fastest but you've suggested a depth of 3
    • While this may be faster from Travis's perspective, it causes more work on Github's end. There's typically already an optimized pack file for a full clone ready to send. But setting any custom history value greater than 1 requires the server to recompute that history to generate a new pack file.
    • Benchmarks suggest that while clone is 4x faster (matching our experience) that's typically due to network/filesystem, but the actual shallow+fetch cloning work doubles the CPU usage, adding on a dozen or two milliseconds of CPU to the build.
      • Which one is friendlier to Travis resources?
      • I'd argue that slower builds (that still finish before other CI) that use prepackaged pack files and consume less CPU to extract are better use of cloud CPU resources at the cost of a little extra network IO. CI Build systems like Travis are more often CPU-limited than they are bandwidth-limited.
  • A depth of 3 is probably insufficient. Our build script relies on Travis's $TRAVIS_COMMIT_RANGE variable to skip builds that are just documentation. If a user submits a PR that internally contains dozen commits, this variable wouldn't work.
    • It's not uncommon to have dozens of commits on a PR.
    • To be fair, it's probably not a great loss if the last 3 commits to a PR didn't change anything but docs, so maybe this is a superfluous argument.

So, I don't really care that much either way.

@chenzhang22
Copy link
Contributor Author

Thank you, Mr. Widdis. I learned a lot from your reply.
I'm doing a research on Travis CI performance problems and hope to get some feedback from the developers. So I open this pull request.
I suggested a depth of 3 because I thought that depth 1 may cause problems when push too fast. And the example in official document also uses depth 3, I thought it may be a good choice.

@dbwiddis
Copy link
Contributor

Looking at git blame I see I was the one who suggested disabling shallow cloning (it was the original default 50):
#1199 (comment)

My reasoning then had been an experience I had getting CI to run on a commit to a branch that was several hundred commits behind master. Disabling shallow clone fixed it. I don't know enough about the technical details to know if that was a coincidence or not.

@chenzhang22
Copy link
Contributor Author

@dbwiddis
Copy link
Contributor

You've got a similar PR on my project so I'm replying twice. I looked at my own history and saw my use of full clone was intentional to support blame information on SonarQube, which was at the time triggered from my Travis build. See https://docs.sonarqube.org/latest/analysis/scm-integration/

We don't trigger any external CI from Travis so this isn't a concern and it's probably reasonable to change it. The time I've spent thinking about this will far exceed the multiples of 10 seconds we'll save, though. :)

@dbwiddis
Copy link
Contributor

Summarizing my thoughts:

  • This PR doesn't hurt, can help, and I'm inclined to approve
  • I'd probably rather remove the entire configuration which goes to the Travis default (50). There's probably not much difference; the main benefit is only pulling a single branch, and more would be robust to situations like late 2020 with a long build queue and multiple commits building up before they executed.

@dbwiddis
Copy link
Contributor

@YunLemon please provide a real name and email address contact on the commit, it currently shows as Author: Your Name <[email protected]>.

@chenzhang22
Copy link
Contributor Author

CI builds will still run multiple times...and will save multiples of 10 seconds. ^_^
So, I should remove entire git configuration now?

@dbwiddis
Copy link
Contributor

Yes. Just remove the config so it uses the default; use a real name and email in your git config; and squash your commits. I'll merge tomorrow if nobody else has any reason not to.

@dbwiddis dbwiddis merged commit b7f9708 into java-native-access:master Aug 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants