Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

serialize bits values and QuoteNodes in IR encoding #45173

Merged
merged 1 commit into from
Aug 17, 2022

Conversation

JeffBezanson
Copy link
Member

Using this instead of the roots array avoids scaling problems in functions with many constants. This might also be better for precompiling. Grows the system image by 1.6%, but is probably worth it. Copying small bits values in code just seems like the right thing.

@JeffBezanson JeffBezanson added the compiler:latency Compiler latency label May 3, 2022
@JeffBezanson
Copy link
Member Author

JeffBezanson commented May 4, 2022

This seems to be a bit crashy? I can reproduce it sometimes but not always. E.g.

LinearAlgebra/addmul  (1) |        started at 2022-05-04T14:47:32.609

signal (11): Segmentation fault
in expression starting at /home/jeff/src/julia/usr/share/julia/stdlib/v1.9/LinearAlgebra/test/addmul.jl:133
symtab_lookup at /home/jeff/src/julia/src/symbol.c:59
_jl_symbol at /home/jeff/src/julia/src/symbol.c:89
ijl_uncompress_argnames at /home/jeff/src/julia/src/ircode.c:980
ijl_uncompress_ir at /home/jeff/src/julia/src/ircode.c:849
InliningTodo at ./compiler/ssair/inlining.jl:921 [inlined]
resolve_todo at ./compiler/ssair/inlining.jl:855
analyze_method! at ./compiler/ssair/inlining.jl:912

or

LinearAlgebra/addmul    (1) |        started at 2022-05-04T14:32:09.368

signal (11): Segmentation fault
in expression starting at /home/jeff/src/julia/usr/share/julia/stdlib/v1.9/LinearAlgebra/test/addmul.jl:133
jl_gc_wb at /home/jeff/src/julia/src/julia.h:911
Allocations: 1114935379 (Pool: 1112683407; Big: 2251972); GC: 1619
Segmentation fault (core dumped)

@JeffBezanson
Copy link
Member Author

JeffBezanson commented Jul 21, 2022

Ok, what happened was this: if a value is not in a method roots list, but is later needed by codegen, codegen will add it to the roots list, but de-duplicating values only by address instead of using ===. That can lead to codegen inserting huge numbers of copies of the same value as method roots. There were also cases where method roots contained QuoteNode(x), but codegen of course wants to emit x, so it would add a root for that as well leading to more duplication.

This PR now fully fixes that situation. Codegen doesn't need integer indices for the values it roots; it just needs them rooted somewhere. So I added a global IdDict of values that might be needed by generated code, so they are de-duplicated by === and there can be lots of sharing. That also avoids adding more entries to the method roots list, which is good because they use O(n) lookup. Plus the logic for getting a globally-rooted version of a value is now centralized. As a result the PR more than pays for itself as the system image is slightly smaller 🎉 .

Some possible future improvements to constant handling:

  • Codegen adds a global root as soon as it sees a value, but it should do it more lazily, only if it really needs a boxed version of the value.
  • CodeInfo could have its own list of constants, since if a constant is generated by compile-time evaluation it is probably specific to a certain specialization.
  • Add special representations for Int8, Int16, UInt16, UInt32, UInt64, and Char to staticdata.c so we can take advantage of those caches as global roots (currently Int32, Int64, and UInt8 are handled).

@JeffBezanson JeffBezanson requested a review from vtjnash July 21, 2022 16:18
src/julia_internal.h Outdated Show resolved Hide resolved
src/staticdata.c Outdated Show resolved Hide resolved
Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

src/ircode.c Outdated Show resolved Hide resolved
- serialize bits values and QuoteNodes in IR encoding
  Using this instead of the `roots` array avoids scaling problems in
  functions with many constants.
- move roots created by codegen to a single global table, allowing
  reuse of egal values and keeping method roots lists shorter
@JeffBezanson JeffBezanson merged commit 425f6ff into master Aug 17, 2022
@JeffBezanson JeffBezanson deleted the jb/irencodebits branch August 17, 2022 14:49
@lassepe
Copy link
Contributor

lassepe commented Aug 17, 2022

Is there any chance that this gets backported to 1.8.x given that it also fixes some bugs we observed in Symbolics.jl/RuntimeGeneratedFunctions or is this considered a latency improvement only and thus waits until 1.9?

@giordano giordano removed the merge me PR is reviewed. Merge when all tests are passing label Aug 22, 2022
@JeffBezanson
Copy link
Member Author

I think this can be backported.

@JeffBezanson JeffBezanson added the backport 1.8 Change should be backported to release-1.8 label Sep 7, 2022
@KristofferC KristofferC mentioned this pull request Sep 16, 2022
28 tasks
KristofferC pushed a commit that referenced this pull request Sep 16, 2022
- serialize bits values and QuoteNodes in IR encoding
  Using this instead of the `roots` array avoids scaling problems in
  functions with many constants.
- move roots created by codegen to a single global table, allowing
  reuse of egal values and keeping method roots lists shorter

(cherry picked from commit 425f6ff)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:latency Compiler latency
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants