Skip to content
This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

Another precise GC implementation #2418

Merged
merged 2 commits into from
Jan 17, 2019

Conversation

rainers
Copy link
Member

@rainers rainers commented Dec 27, 2018

This is a variation of #1603, but with less changes to the GC interface.

It works without adding an emplace function and new BlkAttr flags. Instead, it assumes knowledge about the implementation of dynamic arrays. The APPENDABLE attribute tells the GC to repeat the pointer bitmap given by the elements type info. In addition, a memory block in the large object pool with APPENDABLE set is assumed to contain data starting at offset 16.

There is no longer a bit reserved for each possible memory location for data in the large object pool. Instead the rtInfo of its type is saved for every page, so allocation has almost no overhead in that case. This also allows restricting scanning to the actual size of the type.

In case you are running the conservative GC, the overhead of having the option to switch to the precise GC is hardly noticable:

  • a bool check when creating an AA (it has to build the (key,value)-pair type info at runtime for precise information (this already happens if either has a destructor)
  • a bool check per allocation
  • a bool check during collection
  • some additional bytes for each type info containing the pointer bitmap (identical bitmaps merged).

The checks in the GC could be omitted by making the GC a template with preciseness as a parameter. That duplicates a lot of code, though. I'd reconsider that after #1924 is available so you can easily opt-in/out at link time.

In general, the precise version still has druntime/benchmark results 0-10% slower than the conservative GC on Win64, the difference is smaller for Win32. The tests don't produce many false pointers, though.

@rainers rainers added the GC garbage collector label Dec 27, 2018
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @rainers!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + druntime#2418"

src/core/bitop.d Outdated Show resolved Hide resolved
src/gc/config.d Outdated
@@ -16,7 +16,7 @@ struct Config
{
bool disable; // start disabled
ubyte profile; // enable profiling with summary when terminating program
string gc = "conservative"; // select gc implementation conservative|manual
string gc = "precise"; // select gc implementation conservative|precise|manual
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want to change that default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but I want the tests to run with the precise version. It should be reverted to "conservative" before merging.

src/gc/bits.d Outdated Show resolved Hide resolved
src/rt/typeinfo/ti_cdouble.d Outdated Show resolved Hide resolved
src/rt/dmain2.d Outdated
@@ -65,11 +65,12 @@ extern (C) void _d_monitor_staticctor();
extern (C) void _d_monitor_staticdtor();
extern (C) void _d_critical_init();
extern (C) void _d_critical_term();
extern (C) void gc_init();
extern (C) void gc_config() @nogc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AA initialization reads the configuration once, but as the GC is initialized lazily, its configuration needs to be updated early. I don't like checking for "precise" in aaa_init(), though. It should be something other implementations can also configure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative we could also opt to generate the RTInfo of the key/value pair unconditionally (maybe with some optimization for no pointers), in the hope that the compiler will generate the proper typeinfo eventually.

@rainers
Copy link
Member Author

rainers commented Dec 28, 2018

With respect to splitting it up: seperate PRs don't make much sense without the GC, so I'm a bit hesitant. I can make them seperate commits, though.

@schveiguy
Copy link
Member

seperate PRs don't make much sense without the GC

The bitrange changes can stand alone.

@rainers
Copy link
Member Author

rainers commented Dec 28, 2018

The bitrange changes can stand alone.

Actually I thought the rtInfo changes could be made seperate as they might have an influence on other stuff, but the bit range functions are never called without the GC. I'll move both to seperate PRs to reduce the size of the changes.

@rainers
Copy link
Member Author

rainers commented Dec 30, 2018

rebased and added a test case to actually check skipping false pointers. I hope it's not going to be a cause for spurious failures as there is no guarantee no other false pointers exist.

@@ -243,6 +247,39 @@ debug (LOGGING)

/* ============================ GC =============================== */


debug(PRINTF)
void printGCBits(GCBits* bits)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these and other debug functions (LOGGING/PRINTF) should go to the bottom of the file, but that's for another PR.

@rainers rainers force-pushed the gc_precise_appendable branch 5 times, most recently from d893479 to a0fb48d Compare January 3, 2019 08:21
@@ -36,7 +36,7 @@ struct Config
string s = "GC options are specified as whitespace separated assignments:
disable:0|1 - start disabled (%d)
profile:0|1|2 - enable profiling with summary when terminating program (%d)
gc:conservative|manual - select gc implementation (default = conservative)
gc:conservative|precise|manual - select gc implementation (default = conservative)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder to update the spec too (or at least open up a PR, s.t. it doesn't get lost once this PR has been merged): https://github.com/dlang/dlang.org/blob/master/spec/garbage.dd#L386

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcioapm
Copy link
Contributor

What is the impact? Would be nice to get some impact estimates on the changelog.

@wilzbach
Copy link
Member

wilzbach commented Jan 7, 2019

What is the impact? Would be nice to get some impact estimates on the changelog.

From @rainers's initial comment:

In general, the precise version still has druntime/benchmark results 0-10% slower than the conservative GC on Win64, the difference is smaller for Win32. The tests don't produce many false pointers, though.

@rainers
Copy link
Member Author

rainers commented Jan 12, 2019

BTW: the failure on buildkite is that the ocean library needs a tiny patch sociomantic-tsunami/ocean#663 that is already merged, but not yet available in a tagged version. As the default GC will be reverted to 'conservative' that shouldn't be a blocker.

The precisegc test can be a source of spurious errors though due to false pointers on the stack or in registers. Maybe it should be changed to just not crash?

@rainers
Copy link
Member Author

rainers commented Jan 14, 2019

The precisegc test can be a source of spurious errors though due to false pointers on the stack or in registers. Maybe it should be changed to just not crash?

The test now only produces some output instead of failing. It still verifies that not more objects are collected than expected.

I've also restored the conservative GC as the default now.

@thewilsonator
Copy link
Contributor

Let me know when this is good to go.

@rainers
Copy link
Member Author

rainers commented Jan 16, 2019

I've made a (hopefully) last change: the dependency of aaa.d on gc.config.gc == "precise" was a bit ugly, so I changed fakeEntryTI to always generate the proper RTInfo for the key/value pair struct:

  • it is now available for other GC implementations, too
  • there is a lot happening when Impl is being created, the TypeInfo entry only adds a bit extra. It might be created a bit more often with the conservative GC, but is now optimized to be created less often for the precise GC
  • it is no longer limited to small key/value pairs (512 bytes for 64-bit process, 128 bytes for 32-bit process), but generates proper RTInfo for any size.
  • it has some direct unittests for proper RTInfo generation
  • long term, the TypeInfo for the key/value pair struct should be generated by the compiler and added to TypeInfoAssocArray, making all this runtime generation obsolete.

For testing purposes, the default GC is back to "precise", I'll revert it once tests have passed.

@rainers
Copy link
Member Author

rainers commented Jan 17, 2019

Let me know when this is good to go.

Tests passed (but ocean as expected), so I reverted to "conservative". I squashed commits to remove temporary changes. Good to go IMO.

@dlang-bot dlang-bot merged commit c86e4a0 into dlang:master Jan 17, 2019
@rainers
Copy link
Member Author

rainers commented Jan 17, 2019

Hooray! It's just been about 6 years since its first incarnation... ;-) The first PR was a bit later, though: #1022

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
GC garbage collector
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants