Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some GASMAN resp. "weak pointer object" cleanup #2485

Merged
merged 7 commits into from
May 25, 2018

Conversation

fingolfin
Copy link
Member

These are a byproduct of work on #2092

Ultimately, I'd like to remove any reference to GASMAN, Boehm GC etc. from gasman.h, rename it to bags.h, and then rename gasman_internal.h to gasman.h.

@fingolfin fingolfin added topic: kernel release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes labels May 22, 2018
@codecov
Copy link

codecov bot commented May 22, 2018

Codecov Report

Merging #2485 into master will decrease coverage by <.01%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##           master    #2485      +/-   ##
==========================================
- Coverage    74.2%   74.19%   -0.01%     
==========================================
  Files         484      484              
  Lines      245311   245320       +9     
==========================================
- Hits       182021   182018       -3     
- Misses      63290    63302      +12
Impacted Files Coverage Δ
src/opers.c 94.01% <ø> (-0.54%) ⬇️
src/calls.c 94.67% <ø> (ø) ⬆️
src/saveload.c 65.44% <ø> (ø) ⬆️
src/gasman.h 93.1% <ø> (ø) ⬆️
src/plist.h 98.61% <ø> (ø) ⬆️
src/gap.c 85.56% <ø> (+0.06%) ⬆️
src/gasman.c 85.86% <ø> (ø) ⬆️
src/plist.c 93.92% <100%> (-0.01%) ⬇️
src/weakptr.c 95.07% <94.11%> (-0.04%) ⬇️
hpcgap/lib/hpc/stdtasks.g 64.19% <0%> (-0.52%) ⬇️
... and 3 more


#ifdef USE_GASMAN

#define IS_WEAK_DEAD_BAG(bag) ( (((UInt)bag & (sizeof(Bag)-1)) == 0) && \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a very low-level gasman thing to have in weakptr.h?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure we are on the same page: It's in weakptr.c, not in weakptr.h -- though before, it was indeed in a header file, namely in gasman.h.

tl;dr: yes, this is not ideal, but it's only meant to be temporary, so I hope we can let it slide for now...

Long version: Yes, this is rather low level GASMAN specific code -- but then so is a lot of other code in weakptr.c, you'll see lots of code specific to Boehm resp. GASMAN -- and with Julia GC, also code specific to that. IMHO this shows that the abstraction originally chosen to allow implementing weak pointer objects "without assuming too much about the GC internals" just does not work. (To be fair, I don't actually think that was a major concern when weak pointer objects were implemented, so it's of course natural that this non-goal was not achieved ;-).

So I think on the long run, we should consider refactoring this. But for now, it was simply convenient to hide the implementation specific parts somewhere, and here it seemed useful (for people reading the code, and trying to understand it) to have it in weakptr.c. But I could also put it into gasman_intern.h . The main reason I did not do this (yet!) is that there are also Boehm version of this macro; so then one might have to add a boehm_gc.h (which I think we'll do on the long run anyway). Anyway, this seemed to be more work to me, and perhaps a bit premature: I'd rather see how the Julia GC integration works out, and then see if we can come up with a better abstraction...

@@ -234,22 +235,16 @@ Int LengthWPObj(Obj wp)
return len;
#endif

#ifndef USE_BOEHM_GC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this? The logic here has been changed quite confusingly. Before I try tracing it out, have you changed anything, or just moved things around?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code was changed in a non-trivial way, though it should be 100% identical in behavior to the original.

The reason I did this was to make the code easier to understand -- at least to me, it was very difficult to grok the original. Case in point: it turns out that the GASMAN and Boehm code are really doing the same thing, it just was difficult to see.

To elaborate, before the change, this code looked like this:

#ifndef USE_BOEHM_GC
  Obj elm;
  while (len > 0 && 
         (!(elm = ELM_WPOBJ(wp,len)) ||
          IS_WEAK_DEAD_BAG(elm))) {
    changed = 1;
    if (elm)
      ELM_WPOBJ(wp,len) = 0;
    len--;
  }
#else
  while (len > 0 && !ELM_WPOBJ(wp, len)) {
    changed = 1;
    len--;
  }
#endif

Can you see that the Boehm and GASMAN code are doing exactly the same thing? I couldn't...

Now, I am already wary of simple assignments in loop conditionals, but if they look like while ((elm = ELM_PLIST(list, n) { ... I'd accept them, as it's reasonable easy to discern the meaning. But here the assignment was nested into a complex logical expression. Tracing out what it did was IMHO non-trivial. So I refactored it: Essentially, change the loop to while (1), then move the condition into the loop body as if (!cond) break;, then apply boolean logic to transform and the split the condition... The result is the new loop:

  while (len > 0) {
    elm = ELM_WPOBJ(wp, len);
    if (elm && !IS_WEAK_DEAD_BAG(elm))
        break;
    changed = 1;
    if (elm)
      ELM_WPOBJ(wp, len) = 0;
    len--;
  }

Now you can compare this to the Boehm GC code, and realize the following: there is an additional IS_WEAK_DEAD_BAG, but for Boehm GC, that does nothing. So we abort the loop via break as soon as elm != 0. But in that case, the second if never is reached; or in other words, whenever it is reached, we'll have elm ==0. So the Boehm version is really equivalent. And in fact, the compiler notices that and optimize the second if statement away.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight correction: with Boehm GC, IS_WEAK_DEAD_BAG still does something, but it's this:

#define IS_WEAK_DEAD_BAG(bag) (!(bag))

so if (elm && !IS_WEAK_DEAD_BAG(elm)) becomes if (elm && !!elm) but again, the compiler is smart enough to figure out this is the same as if (elm)

@ChrisJefferson
Copy link
Contributor

So, I'm happy for this to be merged because I have convinced myself it won't hurt non-HPC GAP.

I'm concerned we are driving HPC-GAP off a cliff, because (as I understand) we continue to have a non-functional ward so we don't have any guards, so can't really test HPC-GAP is behaving in parallel, but this patch isn't doing any more damage than all the other bits of code making changes to HPC-GAP and GC-relaeted things.

@fingolfin
Copy link
Member Author

I really don't understand the concern for HPC-GAP, or how this is "driving it off a cliff", though.

Also, If any of these changes had a potential for causing harm to HPC-GAP, I'd foremost blame this on the almost complete lack of documentation of many important internals, such as memory barriers -- it is completely unclear why some code was changed to use memory barriers, while other seemingly similar code was not. It leaves the impression that problems were fixed as they were encountered, as opposed to systematically looking through the code for potential sources of problems, and addressing them. In other words: I have serious doubts in the reliability of HPC-GAP code, with or without ward running...

But again: I have a hard time seeing how these localized changes, written in a deliberately incremental, small-step manner, could have an impact on HPC-GAP. After all, the absence of ward means only that we don't check guards, it does not cause e.g. memory barriers to stop working (nor does not using ward suddenly make them non-required). So if there was something deeply wrong here, I am confident it'd show up now, with our without a 100% working ward. If there are HPC-GAP issues with this PR that we don't notice now, ward won't help, and it all boils down to lack of comprehensive testing for HPC-GAP.

tl;dr: if HPC-GAP is screwed, then because of lack of documentation, comprehensive testing, and systematic work on hardening the library and kernel. Not because of this PR...

@fingolfin
Copy link
Member Author

All that said, of course I don't want to break HPC-GAP deliberately, like you, I invested lots of time to keep it working and merged into GAP. But realistically, right now, things seem rather bleak for it, given the total lack of anybody working on it (e.g to resolve the ward problem, or the issues visible in #2484).

@fingolfin fingolfin merged commit 6220a79 into gap-system:master May 25, 2018
@fingolfin fingolfin deleted the mh/gasman-cleanup branch May 25, 2018 13:58
@fingolfin
Copy link
Member Author

@ChrisJefferson upon re-reading your comment, I now realize I misread it the first time; you wrote "but this patch isn't doing any more damage than all the other [...]" but clearly my reading comprehensions skills are not up to the task. My apologies for the rant above, guess it just proves what kind of fool I am sigh

@ChrisJefferson
Copy link
Contributor

Don't worry, as usual, despite being a natural English speaker, writing clear English is not my strong point :)

I actually wrote that reply because I tried writing a multi-threaded torture test for weak pointers in HPC-GAP as a way of testing this patch. It crashed, but I realised the crash was because the locking in my code was wrong. I decided I'm not going to try writing multi-threaded tests in HPC-GAP while it remains guardless.

I agree with your frustration, about the work put into HPC-GAP, and how it would be a shame for it to go to waste, but also not wanting to put in the large amount of work to "finish" it without more people joining in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants