
Add ccache to base-builder. #12675

Merged: 1 commit merged into master from ccache on Nov 1, 2024
Conversation

@oliverchang (Collaborator)


This installs clang wrappers at /ccache/bin, and sets up a build cache
at /ccache/cache. To use this, inside the project container we just need
to do:

```
export PATH=/ccache/bin:$PATH
```
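
For context, this is roughly the kind of masquerading setup ccache uses for such wrappers (a sketch only; the exact commands in the PR's Dockerfile may differ):

```
# Sketch: install ccache masquerading wrappers. When invoked as "clang",
# ccache looks up the real clang later in PATH and caches its output.
mkdir -p /ccache/bin /ccache/cache
ln -s "$(command -v ccache)" /ccache/bin/clang
ln -s "$(command -v ccache)" /ccache/bin/clang++

# Point ccache at the shared cache directory.
export CCACHE_DIR=/ccache/cache
```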

In another PR, we can store the /ccache/cache somewhere we can pull down
at runtime.

Some results:

Fresh compile:

real	0m49.249s
user	10m41.818s
sys	1m2.097s

With ccache cache:

real	0m9.877s
user	0m6.278s
sys	0m19.966s

Fresh compile:

real	1m17.214s
user	0m49.454s
sys	0m27.963s

With ccache:

real	0m34.962s
user	0m18.092s
sys	0m17.083s
@oliverchang (Collaborator, Author)

Note: this is likely still complementary to #12608, as a baseline fallback that should always work.

The downside here is that a lot of the bigger projects spend a fair bit of time downloading/configuring dependencies (or doing other weird things), which can't be cached by this mechanism.

An example is poppler, which takes 16 minutes for a clean build. With ccache, this is only reduced to 11 minutes.

@jonathanmetzman (Contributor) left a comment

LGTM

Nice! I suppose the savings will be larger for projects that take longer to compile (which is probably where most of our time is spent).

@DavidKorczynski (Collaborator) commented Oct 31, 2024

This is neat! So if I understand correctly, in this case we wouldn't want to use any cached containers, but rather rely on the normal OSS-Fuzz approach where the existing build.sh runs?

If that's the case, what's the higher-level architecture we'd deploy this in? For example, I think at this point we have multiple viable solutions to the problem:

1. relying on manually generated rebuilder scripts, which need a cached container (https://github.com/google/oss-fuzz-gen/tree/main/fuzzer_build_script);
2. using the existing Chronos approach with a cached container;
3. using auto-generated scripts based on #12608 with a cached container;
4. using ccache with the original base images and no cached containers.

We need some way to reason about which process to take, or at least about the ordering among them. The current caching from OFG (1 above) works by using a cached container, overwriting build.sh, and then simply running compile. Chronos relies on overwriting compile itself, and 3 above relies on overwriting build.sh and running compile.

The logic for deciding which technique to use could perhaps go in infra/build/functions/target_experiment.py, but I'm not sure it's smart to make that more complex.

An alternative is to not have the selection of which technique to use happen during an actual OFG run, but perhaps asynchronously in some manner. In either case, we need to address how to automatically evaluate whether a given solution correctly builds the updated harness, while still preserving our ability to show failed build errors in the OFG experiments (though not failures due to issues in cache rebuilding, I guess -- or at least those aren't the most interesting thing to debug when doing an OFG run).

Considering that this technique uses just the base images + a cache, why not just go ahead and use it by default in all OSS-Fuzz builds?

@DavidKorczynski (Collaborator)

To answer my own questions above: I think I'd prefer to use the ccache approach as the general approach. I don't think cases like poppler, where a lot of time goes to downloading, are a major issue tbh -- they're such a small part of the whole OSS-Fuzz set of projects. I'm more concerned with whether ccache works well in the general case, but assuming it does, then using ccache as perhaps the only solution for faster rebuilding is likely the right decision for now.

@oliverchang (Collaborator, Author) commented Nov 1, 2024

Agreed, this approach will serve as a very useful baseline (and I think it will always work, in that it shouldn't cause breakages or make anything worse). We should just start with this.

I think it will still be extremely beneficial to have an approach that would enable rebuilds on the order of seconds for most projects. This will enable LLM applications/agents that need a tighter feedback loop.

I think ultimately, we should have a combination of this as the baseline + a saved container approach where we have an autogenerated recompile script we can optionally run. We shouldn't need to overwrite any existing scripts.

Is there some way we can determine if the auto-generated recompile script will work ahead of time? Perhaps this is something we can precompute.

i.e., for every OSS-Fuzz project:

- Generate a recompile script using #12608 (infra: add script to capture replayable commands).
- Build (with the ccache cache generated).
- Check if the recompile script works:
  - Yes: throw away the ccache cache since it's unnecessary, and keep the recompile script (overriding compile or build.sh).
  - No: throw away the recompile script, keep the ccache cache, and just inject PATH=/ccache/bin:$PATH into compile.
- Push this container.

Then, from the user's perspective, they just need to pull the saved container and run "compile". Under the hood, this could either use the recompile script or the ccache cache; a sketch of this precompute loop follows below.
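
A rough sketch of what that precompute loop could look like (all helper names and the registry path here are hypothetical; infra/helper.py build_fuzzers is the existing OSS-Fuzz build entry point):

```
#!/bin/bash -eu
# Hypothetical precompute loop over all OSS-Fuzz projects.
for path in projects/*/; do
  project=$(basename "$path")
  # One full build: generates the #12608 recompile script and populates
  # the ccache cache inside the project container.
  python3 infra/helper.py build_fuzzers "$project"
  if check_recompile_script "$project"; then    # hypothetical check
    # Recompile script works: drop the ccache cache, keep the script.
    drop_ccache_cache "$project"                # hypothetical helper
  else
    # Fall back to ccache: drop the script, keep the cache, and make
    # compile pick up the wrappers via PATH=/ccache/bin:$PATH.
    drop_recompile_script "$project"            # hypothetical helper
    enable_ccache_path "$project"               # hypothetical helper
  fi
  docker push "gcr.io/oss-fuzz-cached/$project" # hypothetical registry
done
```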

@DavidKorczynski (Collaborator) commented Nov 1, 2024

> Is there some way we can determine if the auto-generated recompile script will work ahead of time? Perhaps this is something we can precompute.

So this is what I was getting at with "an alternative is to not have the selection of which technique to use happen during an actual OFG run, but perhaps asynchronously in some manner".

The main thing we need to ensure is that changes to the source of a harness are applied in the actual build afterwards. I'm not sure we can get a definite yes on whether a rebuild works, since that would require knowing the location of the harnesses, which may not be something we want to pull into this approach.

Alternatively, we could validate whether the contents of "OUT" are similar, and if so declare the rebuild successful. I'm not sure whether checksums are too restrictive here; if they aren't, that would be great. Otherwise, perhaps simply size-check all executables and ensure the same set of executable names is present in OUT.

Alternatively, we could simply say it's successful if the rebuild script didn't crash.

Should we then do this on a regular basis, or as part of the existing build infra?

The rest of the approach sounds good though.

@oliverchang (Collaborator, Author)

> Alternatively, we could validate whether the contents of "OUT" are similar, and if so declare the rebuild successful.
>
> Alternatively, we could simply say it's successful if the rebuild script didn't crash.

+1. We can likely get by with a very simple heuristic -- clear the binaries in $OUT and check whether those filenames come back after calling recompile.
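
A minimal sketch of that heuristic (assuming the standard $OUT layout; recompile stands in for the hypothetical auto-generated script from #12608):

```
#!/bin/bash -eu
# Record the fuzz target names in $OUT, wipe them, rerun the recompile
# script, and check that the same names come back.
before=$(find "$OUT" -maxdepth 1 -type f -executable -printf '%f\n' | sort)
rm -rf "${OUT:?}"/*
recompile  # hypothetical auto-generated script from #12608
after=$(find "$OUT" -maxdepth 1 -type f -executable -printf '%f\n' | sort)
if [ "$before" = "$after" ]; then
  echo "recompile script reproduced all targets"
else
  echo "recompile script failed validation" >&2
  exit 1
fi
```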

> Should we then do this on a regular basis, or as part of the existing build infra?

+1. I think we can just do this as part of OSS-Fuzz infra.

@oliverchang merged commit dd978a4 into master on Nov 1, 2024 (18 of 19 checks passed)
@oliverchang deleted the ccache branch on Nov 1, 2024 02:42
@jonathanmetzman (Contributor)

Careful, I think we need to be smart about how we use this with jcc: https://ccache.dev/manual/3.2.5.html#_using_ccache_with_other_compiler_wrappers

@jonathanmetzman (Contributor)

I don't think we need it, nor would it probably work with our auth situation, but Mozilla has a version of ccache that can save to cloud storage: https://github.com/mozilla/sccache

@oliverchang (Collaborator, Author)

> Careful, I think we need to be smart about how we use this with jcc: https://ccache.dev/manual/3.2.5.html#_using_ccache_with_other_compiler_wrappers

Yep, this is captured here: google/oss-fuzz-gen#682. I think the best way is for ccache to wrap jcc?
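
For illustration, that could look something like the following wrapper script (a sketch only; the jcc path and wrapper location are assumptions, not something this PR implements):

```
#!/bin/bash
# Hypothetical /ccache/bin/clang: ccache on the outside, jcc (path
# illustrative) as the real compiler underneath, so ccache caches the
# output of the full jcc+clang invocation.
exec ccache /usr/local/bin/jcc "$@"
```

Putting ccache on the outside avoids the pitfall the ccache manual describes for stacked compiler wrappers, where masquerading both tools can make ccache hash or cache the wrong thing.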

> I don't think we need it, nor would it probably work with our auth situation, but Mozilla has a version of ccache that can save to cloud storage: https://github.com/mozilla/sccache

Yeah, I think we can just save the cache in the image and push it to the registry to avoid any additional syncing.
