Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for <QuerySet>.as_manager() #1025

Merged
merged 3 commits into from
Sep 29, 2022

Conversation

flaeppe
Copy link
Member

@flaeppe flaeppe commented Jun 27, 2022

I've reorganised and extended dynamic manager class generation to support <QuerySet>.as_manager().

This also renames the dynamically generated manager types to align with what those class names will be during runtime, set by Django. See here: https://github.com/django/django/blob/b2eff16806057095c7dd3daa9402ad615e51627f/django/db/models/manager.py#L109-L110
That means that reveal_type(<Model>.<manager>) should output the same type name as type(<Model>.<manager>) (unless there's collisions etc etc.)

Related issues

Closes: #324

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Awesome work.
But, I want to have another look from someone :)

README.md Outdated Show resolved Hide resolved
@flaeppe
Copy link
Member Author

flaeppe commented Jun 28, 2022

Did you have anyone in mind for a second pair of eyes? Or did you mean for me to ask someone?

@sobolevn
Copy link
Member

@mkurnikov or @intgr or @syastrov

@intgr
Copy link
Collaborator

intgr commented Jun 29, 2022

Sorry, mypy plugin code is entirely foreign to me and I don't have time right now to delve into it.

@ljodal
Copy link
Contributor

ljodal commented Jun 29, 2022

Seems like you managed to implement support for creating these managers inline, does that mean that it's technically possible to support Manager.from_queryset(QuerySet) inline as well?

Oh and I'd be happy to review this, but I'm kinda swamped for the next few days so I might not get to it until Friday.

@flaeppe
Copy link
Member Author

flaeppe commented Jun 29, 2022

Seems like you managed to implement support for creating these managers inline, does that mean that it's technically possible to support Manager.from_queryset(QuerySet) inline as well?

Oh and I'd be happy to review this, but I'm kinda swamped for the next few days so I might not get to it until Friday.

Yes, it should be technically possible (I'm assuming class level/definition, when you say inline) as long as we're not talking about the "full" call Manager.from_queryset(QuerySet)(). But I opted out of supporting it here, since I thought it felt a bit strange. Reasoning being that unwrapping the non-dynamic version would look like something below:

class MyModel(models.Model):
    class ManagerFromMyQuerySet(...):
        ...
    
    objects = ManagerFromMyQuerySet()

Felt more natural to keep force it outside of the model class definition.

@ljodal
Copy link
Contributor

ljodal commented Jun 29, 2022

Seems like you managed to implement support for creating these managers inline, does that mean that it's technically possible to support Manager.from_queryset(QuerySet) inline as well?
Oh and I'd be happy to review this, but I'm kinda swamped for the next few days so I might not get to it until Friday.

[...] as long as we're not talking about the "full" call Manager.from_queryset(QuerySet)(). [...]

This is what I wanted. All our models currently do objects = MyManager.from_queryset(MyQuerySet)(), so I'm having to go through all of them and move the manager definitions outside of the models, which feels a bit pointless, but oh well.

I'd forgotten about that extra call to initialise the model. I assume that causes mypy to not call get_dynamic_class_hook on the plugin and that's the problem with supporting that? Wonder if it would be possible to still support this by using different hooks. Something to dive into when I have time I guess

@flaeppe
Copy link
Member Author

flaeppe commented Jun 29, 2022

... I assume that causes mypy to not call get_dynamic_class_hook on the plugin and that's the problem with supporting that? ...

Yes, exactly, for clarity, the reason it wont work with inlining the full call, seen below, is that it wont trigger the get_dynamic_class_hook

class MyModel(models.Model):
    objects = Manager.from_queryset(MyQuerySet)()

... This plugin hook is called for every assignment to a simple name where right hand side is a function call ...

Ref: https://mypy.readthedocs.io/en/stable/extending_mypy.html#current-list-of-plugin-hooks (docs for get_dynamic_class_hook)

The more fortunate case for .as_manager is that it returns a manager instance, while .from_queryset returns a manager type.

@sobolevn
Copy link
Member

sobolevn commented Jun 30, 2022

There are conflicts right now 😉

@flaeppe flaeppe force-pushed the fix/type-queryset-as-manager branch 2 times, most recently from ad71834 to 086301f Compare June 30, 2022 07:53
@flaeppe
Copy link
Member Author

flaeppe commented Jun 30, 2022

There are conflicts right now 😉

Hehe, yeah. Think it's resolved now

Copy link
Contributor

@ljodal ljodal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look very nice, I've just added some minor nits here and there. I am however seeing this when running this on our largest Django project at work:

project/users/forms/normal.py: error: INTERNAL ERROR: maximum semantic analysis iteration count reached

I'm not sure what's causing this (we're subclassing some Django builtin forms etc), but I assume it's a defer() call somewhere. I can dig into that tomorrow if you need help reproducing it.

mypy_django_plugin/main.py Outdated Show resolved Hide resolved
mypy_django_plugin/main.py Outdated Show resolved Hide resolved
# type won't be defined on variable level.
var = Var(
name=ctx.name,
type=TypeType(Instance(new_manager_info, [AnyType(TypeOfAny.from_omitted_generics)])),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this Any? Shouldn't we take the model from either the queryset or the manager (and potentially validate that they match)?

Copy link
Member Author

@flaeppe flaeppe Jul 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I'm kinda split on if we should display a model here. Since this is when we're outside of a model class definition, e.g.

MyManager = Manager.from_queryset(MyQuerySet)

What I'm thinking about here is what the manager type is during runtime, model won't be set until the model metaclass is called. Which is basically inside model class definition. And, while I have never used it like that, same manager could be used on multiple models..

But perhaps you're right, maybe we should require model arguments for both queryset and manager to match. Since we're basically merging those two arguments into one and the model argument exists for manager and queryset types.

Also, I think AddManagers will change the model argument if called on a different model? But perhaps that should also change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking it would make sense to inspect the generics of the manager and querysets (if possible). If they are different an error message should probably be emitted and in other case use the class from either?

And I'd think that if AddManagers sees a model argument that's not compatible with the one on the manager then an error should be emitted?

I'm okay with not adding this in this PR, but that's at least how I would expect this to behave. I see that it's really an edge case and probably not a big problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've been looking in to exactly that. And it seems possible if we for .from_queryset traverse .bases (as those are types.Instance, we'll get access to Instance.args) and compare model arguments, then on mismatch emit an error.

This should probably happen for both .from_queryset and AddManagers, as you say.

Copy link
Member Author

@flaeppe flaeppe Jul 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a case left to decide on, when model arguments aren't populated for Queryset or Manager. Either:

  • We emit an error and enforce population of the model arg or,
  • We then allow it to be any model, thus just populate it with whatever we got

I'm leaning towards the latter, since one can switch on enforcing via mypy config: disallow_any_generics = true.

For the first case, we could then establish some sort of "auto-fill". Depending on if 1 of the model arguments are populated, if both are missing .from_queryset results in ManagerFromQueryset[Any] while AddManagers just populates it with the model it's processing.

This is basically the table:

  1. Manager and queryset model argument populated -> error if mismatch for both .from_queryset and AddManagers
  2. Only manager model argument populated -> Populate queryset model argument with manager model argument
  3. Only queryset model argument populated -> Populate manager model argument with queryset model argument
  4. No argument populated -> .from_queryset sets Any and AddManagers populates with the model it's processing.

I think this aligns with when using Django builtins e.g. BaseManager.from_queryset(CustomQuerySet), since otherwise we'll enforce people to write custom managers to populate the argument, even though they only have/need a queryset defined. Like this:

class SuperFluousManager(BaseManager[MyModel]):
    ...

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds exactly like what I'd expect 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been looking in to this for a bit and think we should push for it to a future PR, as it's quite an isolated addition but generates quite a bit of lines. And the changes we have here are already large enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed progress to a PR here: flaeppe#29

mypy_django_plugin/transformers/managers.py Outdated Show resolved Hide resolved
mypy_django_plugin/transformers/managers.py Outdated Show resolved Hide resolved
@flaeppe
Copy link
Member Author

flaeppe commented Jul 1, 2022

I'm not sure what's causing this (we're subclassing some Django builtin forms etc), but I assume it's a defer() call somewhere. I can dig into that tomorrow if you need help reproducing it.

Thanks for the review. It'd be very helpful if you could find a repro, I have no idea where to start looking for it really.

@ljodal
Copy link
Contributor

ljodal commented Jul 1, 2022

I dug a bit into the mypy code yesterday and from what I could see the problem isn't guaranteed to be in the file it reports, so I think this is related to something else. The curious thing is that we don't have any .as_manager() calls in our codebase anymore, so either it's from a dependency or it's related to the other changes in this PR.

I'll try to extract just the "refactoring" bit of this PR and see if it reproduces with just that at least, but I'm struggling to understand what causes it. It does happen during processing of the majority of our models though.

Running mypy -v shows Processing SCC of size 748, where SCC stands for strongly connected cluster or something along those lines and 748 is the number of modules it's processing. So it might be a symptom of our codebase being highly entangled (which it is 😅), but the cluster size is the same before this patch so it hasn't changed that.

@ljodal
Copy link
Contributor

ljodal commented Jul 1, 2022

I commented out the new create_new_manager_class_from_as_manager_method method in main.py and it still crashes, so the problem is somehow related to the refactoring of the from_queryset logic 🤔 Guess I'll have to try to step through that code and see where it goes wrong

@flaeppe
Copy link
Member Author

flaeppe commented Jul 2, 2022

Interesting, could it be related to the new bailing out when .from_queryset isn't called at module level?

elif semanal_api.is_class_scope():
# We force `.from_queryset` to be called _outside_ of a class body. So we'll
# skip doing any work if we're inside of one..
return

@ljodal
Copy link
Contributor

ljodal commented Jul 2, 2022

I'm pretty sure it's not related to that. I've had that line there unrelated to this and that worked fine. I'll try to pick the PR apart and test it bit by bit next week to see if I can figure it out.

@ljodal
Copy link
Contributor

ljodal commented Jul 3, 2022

I think it might be related to the name changes for generated managers. I tried to do something similar while working on #1045 and quickly got the same problem in a test repo. Should probably be solvable, but maybe we could try without those changes and do it in a separate PR?

@flaeppe
Copy link
Member Author

flaeppe commented Jul 3, 2022

I think it might be related to the name changes for generated managers. I tried to do something similar while working on #1045 and quickly got the same problem in a test repo. Should probably be solvable, but maybe we could try without those changes and do it in a separate PR?

Sure, we could drop the name change.

I ran in to it too when fiddling with flaeppe#29

Got stuck in AddManagers with a metadata key that didn't match the runtime manager name for a dynamic manager.

I think you should get a trace to the model definition it gets stuck on? Anything special happening with it?

@ljodal
Copy link
Contributor

ljodal commented Jul 3, 2022

I think you should get a trace to the model definition it gets stuck on? Anything special happening with it?

Yeah, I get a trace, but it ends in a random form class (for our user model, which I guess subclasses a few django.contrib.auth forms). It's literally 100s of lines long, if not 1000s, and I couldn't find anything useful there. I tried to check if any single model was processed more than 20 times (which appears to be the iteration limit) in a row and that wasn't the case. Max was 10

@ljodal
Copy link
Contributor

ljodal commented Jul 3, 2022

Yeah, it's related to the manager name. Changing that logic back fixes the issue. I wonder if it might be because the name already exists in the module?

diff --git a/mypy_django_plugin/transformers/managers.py b/mypy_django_plugin/transformers/managers.py
index 6aa00c8..3e9b8fe 100644
--- a/mypy_django_plugin/transformers/managers.py
+++ b/mypy_django_plugin/transformers/managers.py
@@ -318,20 +318,20 @@ def create_new_manager_class_from_from_queryset_method(ctx: DynamicClassDefConte
     symbol_kind = semanal_api.current_symbol_kind()
     # Annotate the module variable as `<Variable>: Type[<NewManager[Any]>]` as the model
     # type won't be defined on variable level.
-    var = Var(
-        name=ctx.name,
-        type=TypeType(Instance(new_manager_info, [AnyType(TypeOfAny.from_omitted_generics)])),
-    )
-    var.info = new_manager_info
-    var._fullname = f"{semanal_api.cur_mod_id}.{ctx.name}"
-    var.is_inferred = True
+    # var = Var(
+    #     name=ctx.name,
+    #     type=TypeType(Instance(new_manager_info, [AnyType(TypeOfAny.from_omitted_generics)])),
+    # )
+    # var.info = new_manager_info
+    # var._fullname = f"{semanal_api.cur_mod_id}.{ctx.name}"
+    # var.is_inferred = True
     # Note: Order of `add_symbol_table_node` calls matter. Case being if
     # `ctx.name == new_manager_info.name`, then we'd like the type to override the
     # `Var`, so the `Var` won't exist in the end.
-    assert semanal_api.add_symbol_table_node(ctx.name, SymbolTableNode(symbol_kind, var, plugin_generated=True))
+    # assert semanal_api.add_symbol_table_node(ctx.name, SymbolTableNode(symbol_kind, var, plugin_generated=True))
     # Insert the new manager dynamic class
     assert semanal_api.add_symbol_table_node(
-        new_manager_info.name, SymbolTableNode(symbol_kind, new_manager_info, plugin_generated=True)
+        ctx.name, SymbolTableNode(symbol_kind, new_manager_info, plugin_generated=True)
     )

@flaeppe
Copy link
Member Author

flaeppe commented Jul 3, 2022

... I wonder if it might be because the name already exists in the module?

Yep, that might be it. I added a test which didn't resolve that properly.

@flaeppe
Copy link
Member Author

flaeppe commented Jul 3, 2022

@flaeppe
Copy link
Member Author

flaeppe commented Jul 3, 2022

@ljodal I pushed an update resolving name collisions with .from_queryset. Do you have a possibility to test with that? See if it helps?

@ljodal
Copy link
Contributor

ljodal commented Jul 3, 2022

Still crashing :(

@sobolevn
Copy link
Member

Can you please rebase this? I am open to get it merged

@flaeppe
Copy link
Member Author

flaeppe commented Aug 29, 2022

I think there's some rework to get done here before merging is possible. Gonna have to get to that first, not sure when I'll be able to though. (I think it's #1045 we have to align changes here with)

@orsinium
Copy link

Is there anything I can help with to have it merged? About 30% of mypy violations we have in our code base are caused by this issue 🙃 We use custom QuerySet manager a lot.

@orsinium
Copy link

@flaeppe the PR seems quite clean and seemingly solving the issue. I can try the patch on our code base if you want to be double sure, though. What kind of rework do you see as blocking? Maybe, I can help with that. And if it's just a matter of rebasing, I can do that for you too.

@flaeppe
Copy link
Member Author

flaeppe commented Sep 27, 2022

@orsinium the changes here depends the version previous to #1045. Basically .from_queryset(...) and .as_manager() shares(d) some helpers/functionality to annotate types. What I did here was to extract the common bits so that it could be reused to annotate both methods.

Although #1045 had to adjust the same logic. So the pieces I extracted conflicts with #1045, basically they look different now. So one kind of have to go find them again and see if/how it can be extracted/shared.

I'll have a look now to see if I'll figure it out

@orsinium
Copy link

Thank you! Lemmino if you need any help.

@flaeppe flaeppe force-pushed the fix/type-queryset-as-manager branch from d927b96 to 29b5c5b Compare September 28, 2022 20:09
@flaeppe
Copy link
Member Author

flaeppe commented Sep 28, 2022

@flaeppe flaeppe requested review from ljodal and sobolevn and removed request for ljodal and sobolevn September 28, 2022 20:19
Copy link
Contributor

@ljodal ljodal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 I've tested it on our codebase and it introduces no new errors. Also converted a few Manager.from_queryset(MyQuerySet) calls and saw no new errors

- case: handles_call_outside_of_model_class_definition
main: |
from myapp.models import MyModel, MyModelManager
reveal_type(MyModelManager) # N: Revealed type is "myapp.models.ManagerFromModelQuerySet[Any]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this really should produce a Manager[MyModel], but if I remember correctly you were thinking of fixing that in a separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly. I opened a branch for that in my fork, it's quite a bit of code to get it up and running I think. Probably more suitable for a subsequent PR I'd say

Comment on lines +218 to +223
manager_base.metadata.setdefault("from_queryset_managers", {})
# The `__module__` value of the manager type created by Django's
# `.from_queryset` is `django.db.models.manager`. But we put new type(s) in the
# module currently being processed, so we'll map those together through metadata.
runtime_fullname = ".".join(["django.db.models.manager", manager_name])
manager_base.metadata["from_queryset_managers"][runtime_fullname] = fullname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be under the django key?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, probably. They weren't previously so I kept them the same. Will probably enforce cache clear to work properly. But that might be completely fine too.

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much!
I propose to merge this PR and adjust any parts later on.

@sobolevn sobolevn merged commit 54d5835 into typeddjango:master Sep 29, 2022
@orsinium
Copy link

It would be very helpful if you try this branch out on your project(s)

Tested the new version, and it works. For posterity, I faced a few issues along the way, but I'm not sure if they are related to this PR:

  1. The minimal version for mypy in constraints might be wrong, I had to upgrade to the latest version.
    pyproject.toml:1: error: Error importing plugin "mypy_django_plugin.main": cannot import name 'has_placeholder' from 'mypy.semanal_shared'
    
  2. There are bunch of new errors related to settings:
    'Settings' object has no attribute 'PAYMENTS_SERVICE_BASE_URL'  [misc]
    Import cycle from Django settings module prevents type inference for 'POD'  [misc]
    

But again, this can be unrelated, I'll investigate it later.

@sobolevn
Copy link
Member

The minimal version for mypy in constraints might be wrong

I've bumped it recently, please check if that's now correct.

@orsinium
Copy link

I've bumped it recently, please check if that's now correct.

You're right, the constraint is correct. I think I've messed up with environments somehow. It failed with mypy 0.961 but the constraint is already mypy>=0.980. So, that's on my side.

@ljodal
Copy link
Contributor

ljodal commented Sep 29, 2022

  1. There are bunch of new errors related to settings:
    'Settings' object has no attribute 'PAYMENTS_SERVICE_BASE_URL'  [misc]
    Import cycle from Django settings module prevents type inference for 'POD'  [misc]
    

This one is due to this: #1163. Basically you most likely import something in the settings file that imports django.conf.settings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Make as_manager() in queryset generic instead of using Any
6 participants