Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better GMP integration with the GC #11202

Open
carnaval opened this issue May 8, 2015 · 16 comments
Open

Better GMP integration with the GC #11202

carnaval opened this issue May 8, 2015 · 16 comments
Labels
bignums BigInt and BigFloat GC Garbage collector performance Must go faster

Comments

@carnaval
Copy link
Contributor

carnaval commented May 8, 2015

We discussed this a few times so better file an issue in case anyone wants to go ahead.

For now our GMP integration allocates the dynamic memory holding the integer using the libc allocator. It then adds a finalizer for cleanup when the parent object gets GCd.

This is inefficient, mainly because finalizers themselves are, and because the GC can't know of much memory will be freed by a collection before running the finalizers, which makes some heuristics bad. The best I can come up with for doing this would be to allow julia datatypes to have fields of type buff_t, like the hidden data pointer in an Array. Those fields points to gc memory which is not tagged but can have variable length. The BigInt type could then declare the _mp_d field (https://gmplib.org/manual/Integer-Internals.html) as a buff_t and allocate it using allocb. This should not require modification inside GMP. It would however need to special case fields of buffer type in the DataType marking code in the gc.

I don't know much about GMP so maybe there is a better solution. Also this would probably work for MPFR and maybe other C libraries we are wrapping using finalizers.

@simonster
Copy link
Member

Ref #4176

@ScottPJones
Copy link
Contributor

This sounds like a very interesting issue... esp. since I can see it being an issue with decimal float support (which I very much need)... so is #4176

@carnaval
Copy link
Contributor Author

carnaval commented May 8, 2015

@ScottPJones If the C type you are wrapping is fixed size the best way is likely to embed it directly inside a julia DataType, maybe as the struct itself or as a bitstype field. That way you avoid pointers and some gc overhead (especially if the type is immutable).

@carnaval
Copy link
Contributor Author

carnaval commented May 8, 2015

(our DataTypes respect the C struct layout so you should be able to copy the definition over)

@timholy
Copy link
Member

timholy commented May 8, 2015

Also worth linking #10084.

@carnaval
Copy link
Contributor Author

carnaval commented May 8, 2015

Oh god. Forgot about this. I need a snooze button for github issues.

@StefanKarpinski
Copy link
Member

A related idea would be making BigFloat a parametric type where the number of bits is part of the type, and dictates its representation, so that they could be stack allocated. @JeffBezanson doesn't care for this since it would potentially lead to a ton of code generation, but I suspect that in practice people don't use that many different sizes of BigFloat, so it might actually be ok.

@JeffBezanson
Copy link
Member

Maybe we could use a tuple of Limbs instead of a buff_t? But do we need a write barrier when GMP assigns the _mp_d field to the new space?

@carnaval
Copy link
Contributor Author

carnaval commented May 9, 2015

Oh of course Tuple{Vararg{UInt8}} is a much better type for the limb vector now ! I like this, no need to touch gc.c. We do need a barrier in any approach of this but we can just call it after every gmp calls it's quite cheap (and no need to patch gmp this way). So the roadmap for this now looks like :

  • change the gmp_alloc functions to something allocating a tuple and returning a pointer to the data
  • add a builtin/ccall for explicit barriers from julia code + 3 lines of custom codegen calling emit_write_barrier
  • add explicit barrier calls after every potentially reallocating gmp ccalls

Does that seem right ?

@JeffBezanson
Copy link
Member

Sounds right to me. Putting the barriers after every gmp call is kind of annoying; I guess we can wrap it in a macro.

@tkelman
Copy link
Contributor

tkelman commented May 10, 2015

I like the sound of that because, and correct me if I'm wrong, packages could also plug into this approach without having to special-case certain types in base, right? So BigDecimal or what have you could be worked on independently without needing code changes in the gc.

@ScottPJones
Copy link
Contributor

👍 @tkelman (just what I need!)

@carnaval
Copy link
Contributor Author

@tkelman yup. It does need some changes in the C code the first time however.

@JeffBezanson JeffBezanson added the performance Must go faster label May 25, 2015
@JeffBezanson
Copy link
Member

An idle thought: we could define

immutable BigInt <: Integer
    data::Tuple{Vararg{Limb}}
    neg::Bool
end

so BigInts are compared correctly by ===. Then when calling GMP, we can stack allocate structs with the right representation. GMP alloc functions can allocate tuples as discussed above. Then to return, we pick the tuple out of the struct and wrap it in a new BigInt.

@stevenhao
Copy link
Contributor

stevenhao commented Apr 14, 2016

An issue with write barriers: We cannot add a write barrier every time GMP makes a memory call, because jl_gc_wb takes a parent pointer, whereas GMP's memory functions do not pass a parent pointer.

One solution is to put the barrier on the parent before/after every ccall to GMP, but this is dangerous because the GC may run multiple times before the ccall terminates.

Gist (implements using Tuple objects in the GMP memory functions), if anyone wants to pick up off this: https://gist.github.com/stevenhao/5438681b883e36d15e9565c9f51f74e1

@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Aug 26, 2016
@JeffBezanson JeffBezanson modified the milestones: 1.x, 1.0 May 10, 2017
@vtjnash
Copy link
Member

vtjnash commented Jan 18, 2022

Yes, they are related. There has not been any activity on it.

@stevengj stevengj added the bignums BigInt and BigFloat label Mar 8, 2022
@ViralBShah ViralBShah added the GC Garbage collector label Mar 12, 2022
@DilumAluthge DilumAluthge removed this from the 1.x milestone Mar 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bignums BigInt and BigFloat GC Garbage collector performance Must go faster
Projects
None yet
Development

No branches or pull requests