Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a flag to remain persistent. #1389

Closed
wants to merge 1 commit into from
Closed

Conversation

jmgao
Copy link
Contributor

@jmgao jmgao commented Feb 20, 2018

Large projects (e.g. Android, with its 1GB+ ninjafiles) can take
multiple seconds to parse ninjafiles, leading to a very painful
edit-compile-test cycle. Add a flag that causes ninja to remain
persistent and run a build upon receiving input to stdin, speeding up
a no-op incremental build of a directory in Android from 10 seconds to
under 200 milliseconds.

Large projects (e.g. Android, with its 1GB+ ninjafiles) can take
multiple seconds to parse ninjafiles, leading to a very painful
edit-compile-test cycle. Add a flag that causes ninja to remain
persistent and run a build upon receiving input to stdin, speeding up
a no-op incremental build of a directory in Android from 10 seconds to
under 200 milliseconds.
@jmgao
Copy link
Contributor Author

jmgao commented Feb 21, 2018

(in the off chance this is actually acceptable for upstream and not just something that should be done in a local hack, there should be an assertion that the line read in by getline is empty, to allow for future extensibility without breaking backward compat)

@evmar
Copy link
Collaborator

evmar commented Feb 21, 2018

How did you get 1gb ninja files? At even 1kb of text per file (which is 10-20 lines of text per file) that's still a million build steps?

@jmgao
Copy link
Contributor Author

jmgao commented Feb 21, 2018

@colincross and @danw can provide more context, but some cursory inspection says that 1kB per file is probably conservative. grepping for an arbitrary file finds a rule/build combo that's 3kB.

We seem to have some especially huge lines, there's one that's 5 MB, four that are 2.8 MB, and four more that are around 1 MB. They seem to mostly be giant conglomeration rules like "build all of the javadocs" or "build all of the native sources" or "build the dialer apk".

@jmgao
Copy link
Contributor Author

jmgao commented Feb 21, 2018

(Also, it looks like we have about half a million build steps when targeting an arbitrarily selected device. There's a bit of combinatorial explosion with resources, but paging through the list, most of them seem legit. We just have way too much source...)

@evmar
Copy link
Collaborator

evmar commented Feb 21, 2018 via email

@danw
Copy link
Contributor

danw commented Feb 21, 2018

For our internal "master" tree, building for the Pixel 2 (taimen) device, our ninja files now add up to 1.5GB. Some stats on that configuration:

    603k     build statements
    135k 22% phony build statements
    849k     nodes
    625k 73% nodes that are outputs of build statements
    223k 26% input-only nodes

    105k actions run in a "normal" local build
    196k actions run in a build server build (building more tests and other artifacts)

We're seeing 10+ seconds of ninja-related startup time, last I checked this was largely due to:

  • Path -> Node* lookups (lots and lots of them)
  • Parsing/allocation of EvalStrings
  • Loading of the ninja_deps/ninja_log files. These can reach ~100MB, and we're running ninja with the same ninja directory 3 times (bootstrapping the generator).

The first two are fairly optimized code paths already for what they do, and my only significantly successful experiment has been to defer some of the work until we know that we really need it: https://android-review.googlesource.com/c/platform/external/ninja/+/461005

Realistically, for this scale, I've been looking at a binary file format -- precomputing as much as possible and using gzip'd string tables instead of variable substitution. That's actually where some of the stats above came from -- it brings the file size down to 104MB, and early testing with a multithreaded go implementation is showing <1/4 the time to do a similar set of tasks to ninja.

On the extreme side, I've gotten our generator to produce an 8.5GB ninja file when trying to produce a single build that builds many configurations at once, but that build isn't particularly realistic on most machines anymore.

@evmar
Copy link
Collaborator

evmar commented Feb 21, 2018

(I don't mean to derail your larger thrust here, just wanted to point out some smaller options that might help a bit.)

  1. One of the reasons we have the $variable support is to allow for some refactoring of input files -- like if a bunch of targets all have the same -I flags you could share them. I appreciate that maybe that isn't convenient for you depending on how your generator code is laid out, but the snippet you showed (where it was a separate rule per compile command) would significantly bloat the file.

A more common pattern is something like:

cflags = cflags used by most code
rule cc
  command = clang++ $cflags etc
build exceptionaltarget: cc inputs
  cflags = special cflags used by this target

to factor the long flags lines together.

Ultimately of course all of these strings must be expanded into full command lines in memory, but if the bottleneck is textual input -> in-memory representation then that would probably help.

  1. A rule per target is surely constructing a lot of extra unneeded data structures. You could even maybe do something like
rule arbitrarycommand
  deps = gcc
  command = $cmd

build whatever.o: arbitrarycommand morestuff
  desc = ...
  cmd = lots of stuff here

to factor out the construction of a lot of intermediate Rule objects and share the "deps = gcc" line repeatedly.

@colincross
Copy link
Contributor

We mix the results of two generators, blueprint and kati. Blueprint was designed around ninja, and creates sensible ninja files with only a few rules and repeated cflags in variables. Kati is converting makefiles to ninja, and produces a much cruder ninja file with a rule per build statement and no use of variables (not even $in and $out), which results in all the input files being listed in both the rule and build statements.

@danw
Copy link
Contributor

danw commented Feb 22, 2018

While variables reduce the file size, that's not necessarily a win for loading time, expanding the variables isn't particularly cheap. Keeping them unique & human-understandable also leads to some of our variable names being longer than the value that they represent :(

From the 1.3GB -> 1.5GB jump in the last week, we did notice that we're duplicating some dependencies that weren't strictly necessary. So once my patches land, we'll be back to "only" 1GB :)

@jhasse
Copy link
Collaborator

jhasse commented Apr 10, 2019

Closing in favor of #1438.

@jhasse jhasse closed this Apr 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants