[performance] Avoid linear walk of graph children. #2959
@corwin-of-amber when this is ready it should also resolve the bad build latency you were suffering in jscoq. |
@ejgallego is there more to do on this PR, apart from updating the tests?
I am not sure the current fix is the right one; it was just done as a POC. Note that we are casting from the map to a list often so this PR as-is could introduce performance regressions in other areas. IMHO these castings should be avoided, I can take care of that but it will take me some time as I need to understand well that part of the code; feel free to take over. |
I thought that maybe this doesn't affect asymptotic time complexity if the call sites are already linear-time. Well, it's not clear that they're linear time: the list returned by An easy way to get rid of this potential complexity increase is to keep both the list and the map. |
That's what I was thinking as a first fix; but I didn't go that route as maybe it would be worth to look into this with more detail and avoid such duplication if possible. |
What about implementing |
@diml, yeah, that should work. |
BTW, |
Hi @Armael , after some discussion we were thinking of changing the type of dune/vendor/incremental-cycles/src/incremental_cycles.ml Lines 156 to 175 in 79a27e5
@Armael , do you think that would matter for |
Ok folks, I've reworked the patch; this could be closer to something we could merge. Main point is maybe what to do with I've reproduced the perf numbers on my own setup [building jsCoq] , would be interesting to understand better the possible performance implications if anyone has a large project. |
Thanks for pinging me. This discussion is in fact related to a shortcoming of the interface provided by More concretely, the current specification for the algorithm requires the Would What I want to implement, and which I believe is the best solution in the long run, is to require I'll have a look, but it will probably take me a bit of time to get through the proofs. In the short term, I can see 3 options:
|
A variation on point 3: you could submit the implementation using |
Hi @Armael , thanks a lot for your quick turnaround,
I think this is (fortunately) the case in dune master and in this PR, we already have the list of successors at hand, the performance problem happens in a different path due to the way we store the successors.
It seems the main client is indeed |
Ah, you do? Currently, the PR seems to be using
Ah, I see, we could indeed simply represent the sequence by its elimination principle (!), i.e.
|
(if you're reluctant using incremental cycles' repo because it's on inria's gitlab, then we could also discuss moving it to github. I don't mind doing that, it's just a bit annoying.) |
Not in
I think the specification should say that the traversal order is not specified; thus side effects are not safe there.
I have scenarios on the order of 100.000 children [and will grow more] so indeed I'd like to avoid reifying into See #3081 for a first try; indeed INRIA gitlab is hard to use [I used not to be able to submit a pull request there], the change seems pretty simple to me tho, maybe we can have the discussion here and you cherry pick the commit once we converge? I'm OK to submit a pull request there tho. |
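To make the "avoid reifying into a list" idea above concrete, here is a minimal sketch of exposing successors through a fold rather than a materialised list. All names (`Int_map`, `successors`, `successors_of_map`, `count`, `exists`) are hypothetical and are not the interface of the vendored incremental-cycles library; the traversal order is whatever the underlying map provides, so callers should not rely on it.

```ocaml
(* Hypothetical sketch; this is NOT the incremental-cycles interface. *)
module Int_map = Map.Make (Int)

(* Children stored in a map keyed by an integer id. *)
type 'a children = 'a Int_map.t

(* A sequence represented by its fold: consumers never see a list. *)
type 'a successors = { fold : 'acc. ('a -> 'acc -> 'acc) -> 'acc -> 'acc }

(* Wrap a map of children without copying it into a list. *)
let successors_of_map (m : 'a children) : 'a successors =
  { fold = (fun f init -> Int_map.fold (fun _key v acc -> f v acc) m init) }

(* Two example consumers written only in terms of the fold. *)
let count s = s.fold (fun _ n -> n + 1) 0
let exists p s = s.fold (fun x found -> found || p x) false
```

With this shape the client keeps whatever container it likes (map, set, list) and only has to provide a fold over it.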
PR updated, note that now we pay a bit more of cost when adding a vertex [insertion in the tree vs list cons] we should indeed measure this patch; maybe it is not worth to modify |
I would also be interested in benchmarks of client-provided-
Well, I can't ;). That sounds like a micro-optimization at the cost of an inconsistent API tbh (note that you can keep using lists for rev deps while using a more generic API) In any case I think it would be good to have some benchmarks. |
I need help w.r.t. dune benchmarking, I'm not sure what the current situation is. |
Oh yes I misread, thanks for the clarification. I can go this route indeed, I didn't do in the first place just to avoid the memory overhead. |
After some discussion with @aalekseyev I have reverted the interface change and added an extra dep set. Will also add some documentation summarizing the discussion and the invariants of the vendored lib. Note that this PR does modify the complexity of @rgrinberg , maybe this should go into 2.2.1 ? |
I'm not seeing the changes, did you forget to push, or are they somewhere else? |
Internet is pretty sketchy here, sorry @Armael |
When calling Dune in scenarios where targets have a large number of deps, Dune will take a long time to start. A common case is when depending on `(package coq)`, which brings into the DAG a few thousand files. `perf` data show this is due to the linear walk in `Dag.is_child`; indeed, doing a naive replacement of the list for a more efficient access structure solves the problem:

```
with this PR:
real 0m1,684s
user 0m1,552s
sys 0m0,128s

with master:
real 0m11,450s
user 0m10,587s
sys 0m0,264s
```

We fix this by adding an efficient representation of `deps` that allows checking if an edge is already in the graph in `log n` time, so the complexity of `is_child` goes from O(n²) to O(n log(n)). Note that `raw_add_edge` has also changed complexity from O(1) to O(log n) due to extra map insertion.

Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
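As a rough illustration of the "list plus extra set" approach described in this commit message, the sketch below keeps the existing list for traversal and adds a set used only for membership checks. The record fields, `Int_set`, `create_node` and `add_edge_if_absent` are assumptions made for the example; only the names `is_child` and `raw_add_edge` come from the discussion, and their real definitions in Dune may differ.

```ocaml
(* Hypothetical sketch; not Dune's actual [Dag] implementation. *)
module Int_set = Set.Make (Int)

type node = {
  id : int;
  mutable deps : node list;     (* children as a list, cheap to traverse *)
  mutable deps_set : Int_set.t; (* same children by id, for fast lookup *)
}

let create_node id = { id; deps = []; deps_set = Int_set.empty }

(* O(log n) set lookup instead of a linear walk over [deps]. *)
let is_child parent child = Int_set.mem child.id parent.deps_set

(* O(log n) because of the set insertion; the list cons alone is O(1). *)
let raw_add_edge parent child =
  parent.deps <- child :: parent.deps;
  parent.deps_set <- Int_set.add child.id parent.deps_set

(* Only add the edge when it is not already present. *)
let add_edge_if_absent parent child =
  if not (is_child parent child) then raw_add_edge parent child
```

The cost of this design is the memory overhead of storing each child twice, which is the trade-off raised earlier in the thread.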
Signed-off-by: Emilio Jesus Gallego Arias <[email protected]>
Co-authored-by: Arseniy Alekseyev <[email protected]>
Co-authored-by: Armaël Guéneau <[email protected]>
Thanks, the new version looks good to me. Wrt the complexity of |
Thanks a lot for all the help @Armael , I propose @aalekseyev gives this a last review and merges if he thinks we have converged. @aalekseyev , I tried to summarize some of the discussion in the README, please amend / let me know if that's enough. |
Signed-off-by: Arseniy Alekseyev <[email protected]>
Looks good to me. Thanks for the detailed readme!
Signed-off-by: Arseniy Alekseyev <[email protected]>
I moved the code that checks the presence of the edge into a new function |
Looks good, thanks @aalekseyev ; I wonder if we should also rename
|
Signed-off-by: Arseniy Alekseyev <[email protected]>
I might just delete |
Sounds pretty reasonable I'd say. |
Signed-off-by: Arseniy Alekseyev <[email protected]>
Signed-off-by: Arseniy Alekseyev <[email protected]>
Did you run the Coq benchmark on the version with the list+set? I'm happy to merge without that because it's clearly going to be similar, but I'd like to reword the commit message if we didn't measure this particular version. |
Looks great, thank you!
Similar numbers [on battery now]
|
…lugin, dune-private-libs and dune-glob (2.3.0) CHANGES: - Improve validation and error handling of arguments to `dune init` (ocaml/dune#3103, fixes ocaml/dune#3046, @shonfeder) - `dune init exec NAME` now uses the `NAME` argument for private modules (ocaml/dune#3103, fixes ocaml/dune#3088, @shonfeder) - Avoid linear walk to detect children, this should greatly improve performance when a target has a large number of dependencies (ocaml/dune#2959, @ejgallego, @aalekseyev, @Armael) - [coq] Add `(boot)` option to `(coq.theories)` to enable bootstrap of Coq's stdlib (ocaml/dune#3096, @ejgallego) - [coq] Deprecate `public_name` field in favour of `package` (ocaml/dune#2087, @ejgallego) - Better error reporting for "data only" and "vendored" dirs. Using these with anything else than a strict subdirectory or `*` will raise an error. The previous behavior was to just do nothing (ocaml/dune#3056, fixes ocaml/dune#3019, @voodoos) - Fix bootstrap on bytecode only switches on windows or where `-j1` is set. (ocaml/dune#3112, @xclerc, @rgrinberg) - Allow `enabled_if` fields in `executable(s)` stanzas (ocaml/dune#3137, fixes ocaml/dune#1690 @voodoos) - Do not fail if `ocamldep`, `ocamlmklib`, or `ocaml` are absent. Wait for them to be used to fail (ocaml/dune#3138, @rgrinberg) - Introduce a `strict_package_deps` mode that verifies that dependencies between packages in the workspace are specified correctly. (@rgrinberg, ocaml/dune#3117)
…lugin, dune-private-libs and dune-glob (2.3.0) CHANGES: - Improve validation and error handling of arguments to `dune init` (ocaml/dune#3103, fixes ocaml/dune#3046, @shonfeder) - `dune init exec NAME` now uses the `NAME` argument for private modules (ocaml/dune#3103, fixes ocaml/dune#3088, @shonfeder) - Avoid linear walk to detect children, this should greatly improve performance when a target has a large number of dependencies (ocaml/dune#2959, @ejgallego, @aalekseyev, @Armael) - [coq] Add `(boot)` option to `(coq.theories)` to enable bootstrap of Coq's stdlib (ocaml/dune#3096, @ejgallego) - [coq] Deprecate `public_name` field in favour of `package` (ocaml/dune#2087, @ejgallego) - Better error reporting for "data only" and "vendored" dirs. Using these with anything else than a strict subdirectory or `*` will raise an error. The previous behavior was to just do nothing (ocaml/dune#3056, fixes ocaml/dune#3019, @voodoos) - Fix bootstrap on bytecode only switches on windows or where `-j1` is set. (ocaml/dune#3112, @xclerc, @rgrinberg) - Allow `enabled_if` fields in `executable(s)` stanzas (ocaml/dune#3137, fixes ocaml/dune#1690 @voodoos) - Do not fail if `ocamldep`, `ocamlmklib`, or `ocaml` are absent. Wait for them to be used to fail (ocaml/dune#3138, @rgrinberg) - Introduce a `strict_package_deps` mode that verifies that dependencies between packages in the workspace are specified correctly. (@rgrinberg, ocaml/dune#3117) - Make sure the `@all` alias is defined when no `dune` file is present in a directory (ocaml/dune#2946, fix ocaml/dune#2927, @diml)
When calling Dune in scenarios where targets have a large number of deps, Dune will take a long time to start. A common case is when depending on `(package coq)`, which brings into the DAG a few thousand files as children of a single node.

`perf` data show this is due to the linear walk in `Dag.is_child`; indeed, doing a naive replacement of the list for a more efficient access structure solves the problem. For example a dummy target that depends on `(package coq)` does:

This PR is just a proof-of-concept and it must not be merged yet; I've opened it to track the issue and to devise what the best solution would be.

As far as I can see there are several options; I am not familiar enough with the cycle detection algo to propose something now.
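To see why a linear membership walk hurts when one node has thousands of children, the following self-contained sketch counts the comparisons made when `n` distinct children are added one by one with a list-based duplicate check. It is illustrative only and does not reproduce Dune's code; `mem_counting` and `add_children` are made-up names.

```ocaml
(* Illustrative only: counts comparisons made by a list-based membership test. *)
let comparisons = ref 0

(* Like [List.mem], but increments [comparisons] for each element inspected. *)
let rec mem_counting x = function
  | [] -> false
  | y :: rest ->
    incr comparisons;
    x = y || mem_counting x rest

(* Add [n] distinct children to one node, checking for duplicates each time,
   and return the total number of comparisons performed. *)
let add_children n =
  comparisons := 0;
  let children = ref [] in
  for i = 0 to n - 1 do
    if not (mem_counting i !children) then children := i :: !children
  done;
  !comparisons
```

Each insertion walks the whole list built so far, so the total is n(n-1)/2 comparisons, which grows quadratically with the number of children.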