Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check exact equality for worker state #3483

Merged
merged 2 commits into from
Feb 17, 2020
Merged

Conversation

bnaul
Copy link
Contributor

@bnaul bnaul commented Feb 15, 2020

Per #3321 (comment), I agree it makes more sense to check for exact equality instead of relying on the hash (just in case 🙂).

I could see an argument for wrapping this tuple in a property so it can be reused, do we think that's preferable? I'd originally wanted to use identity for hash and eq but that has a bit too much going on IMO.

@bnaul
Copy link
Contributor Author

bnaul commented Feb 15, 2020

cc @StephanErb

@mrocklin
Copy link
Member

Thanks for picking this up @bnaul

This looks good to me. I'll merge after either a +1 from @StephanErb or a couple days.

@mrocklin
Copy link
Member

My guess is that all of the test failures are intermittent. When we changed to spawn by default it unearthed a bunch of things. Squashing down a couple now.

@mrocklin
Copy link
Member

Well, this one is interesting

    @gen_cluster(timeout=1000, client=True)
    def test_recompute_released_key(c, s, a, b):
        x = c.submit(inc, 100)
        result1 = yield x
        xkey = x.key
        del x
        import gc
    
        gc.collect()
        yield gen.moment
>       assert c.refcount[xkey] == 0
E       assert 1 == 0
E         -1
E         +0

Copy link
Contributor

@StephanErb StephanErb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing the follow up. As you have requested my review I have looked into the code a bit deeper. Please see my questions below.

Comment on lines 282 to 285
return type(self) == type(other) and (self.name, self.host) == (
other.name,
other.host,
)
Copy link
Contributor

@StephanErb StephanErb Feb 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if using name and host is correct in all cases:

  • name is optional and could be None
  • host is not necessary unique as multiple workers could be running on the same machine

I am lacking necessary Dask context, so I am wondering why you have not picked address instead? It is used for indexing the WorkerState and the documentation says it is [t]his worker's unique key..

So I would assume, this here should work as expected:

def __hash__(self):
    return hash(self.address)	       

def __eq__(self, other):
    return type(self) == type(other) and self.address == other.address	

Copy link
Contributor Author

@bnaul bnaul Feb 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just pulling items from the identity() dictionary, no personal reason to prefer one to the other. @TomAugspurger @mrocklin would you agree address makes more sense here?

@mrocklin
Copy link
Member

mrocklin commented Feb 16, 2020 via email

@bnaul
Copy link
Contributor Author

bnaul commented Feb 16, 2020

thanks @StephanErb @mrocklin , updated w/ your suggestions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants