Flag for inline typechecking #208
Comments
I'm curious as to how you would implement this without the import hook. |
My idea is to get the source of the function, parse it using |
That still sounds like you would need to do it in the import hook. Otherwise how would you change the code after the fact? |
I'm not sure I understand what you mean. Here's a minimal prototype of my implementation. |
Hm, this actually looks pretty clever. You're basically recompiling the function on import. Nested functions decorated this way could be problematic, but right now I don't see a problem for top level functions and methods. Surely this only works from py3.9 onwards (I didn't know about |
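For readers following along, the recompile-the-function idea can be sketched roughly as below. This is not the prototype referenced in the thread (which is not reproduced here), only a minimal illustration under stated assumptions: `SOURCE`, `_check` and `AnnAssignChecker` are hypothetical names, the check is a plain `isinstance` test rather than typeguard's own logic, and the source is given as a string so the sketch runs anywhere, whereas a real decorator would presumably obtain it with `inspect.getsource`.

```python
import ast
import copy
import textwrap

# Hypothetical function source; a decorator would fetch this from the real function.
SOURCE = textwrap.dedent('''
    def scale(values, factor):
        total: int = sum(values) * factor
        return total
''')


def _check(value, expected):
    # Hypothetical stand-in for a real type checker: plain isinstance test.
    if not isinstance(value, expected):
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}"
        )
    return value


class AnnAssignChecker(ast.NodeTransformer):
    """Rewrite `name: T = expr` into `name: T = _check(expr, T)`."""

    def visit_AnnAssign(self, node):
        self.generic_visit(node)
        if node.value is not None and isinstance(node.target, ast.Name):
            node.value = ast.Call(
                func=ast.Name(id="_check", ctx=ast.Load()),
                args=[node.value, copy.deepcopy(node.annotation)],
                keywords=[],
            )
        return node


# Parse, transform, and recompile the module containing the function.
tree = AnnAssignChecker().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)

namespace = {"_check": _check}
exec(compile(tree, "<instrumented>", "exec"), namespace)
scale = namespace["scale"]

print(scale([1, 2], 3))  # the int result passes the injected check
```

Calling `scale([1.5], 2)` produces a float, so the injected check raises `TypeError` at the assignment rather than somewhere further down the line.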
Great! Do you have a timeline for v3.0 release? |
Hi, it's been a while. I'm preparing the 3.0 release (at beta 2 as of now), so if you'd still like to get this into typeguard, you could make a PR as a starting point. |
Sure, I'm happy to contribute it. Do you think the above approach is reasonable and would work with the rest of the library? |
I think it would. The parameter name could possibly be named differently, like |
It looks like the |
Ah, this is because it's supported natively in the |
But |
Oh wait, I meant from 3.9. Where did you find that |
Oh, I got stuck in the first paragraph of the README which states: |
I would actually like to get rid of the |
Hmmm, that's interesting. I think if you only care about typechecking certain classes or functions, the decorator is still the best way of doing so, no? Additionally, from what I can tell, quite a few library users aren't so keen on monkey patching since a small statement can have such global and potentially unpredictable effects, as well as it being hard to predict how several monkey patches will interact with each other. As I see it I think offering both would still be useful. Maybe instead we could try to maximize the amount of shared code between these approaches? |
Yes, if |
I think I'm missing something here. So first, we get the source file with |
Yeah, I agree with getting rid of So |
Do we really need to |
I'm pretty far along. The only problem I have left is that the memo creation in the injected code is raising |
How do you do that? Can you compile the AST directly to bytecode? Awesome! Could you share a draft PR or a branch showing your progress? |
As it says in the documentation of the Compile the source into a code or AST object. Code objects can be executed by exec() or eval(). |
So to be clear: I am compiling the AST to a module, |
The juicy bits:

class Instrumentor(ast.NodeVisitor):
    def __init__(self, target_name: str):
        target_name = target_name.replace(".<locals>", "")
        self.target_name = target_name
        self.path: list[str] = []
        self.transformer = TypeguardTransformer("_call_memo")
        self.modified_function: ast.FunctionDef | ast.AsyncFunctionDef | None = None

    def visit_Module(self, node: ast.Module) -> Any:
        self.generic_visit(node)
        if self.modified_function:
            node.body[:] = [self.modified_function]
            ast.fix_missing_locations(node)

    def visit_ClassDef(self, node: ast.ClassDef) -> None:
        self.path.append(node.name)
        self.generic_visit(node)
        del self.path[-1]

    def visit_FunctionDef(self, node: ast.FunctionDef | ast.AsyncFunctionDef) -> None:
        self.path.append(node.name)
        full_path = ".".join(self.path)
        self.generic_visit(node)
        if full_path == self.target_name:
            self.modified_function: ast.FunctionDef = self.transformer.visit(node)
            self.modified_function.decorator_list.clear()
            self.modified_function.body.insert(0, ast.Import([ast.alias("typeguard")]))
        del self.path[-1]

    def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None:
        return self.visit_FunctionDef(node)


def instrument(f: T_CallableOrType) -> Callable | str:
    if not getattr(f, "__annotations__", None):
        return "no type annotations present"
    elif not getattr(f, "__code__", None):
        return "no code associated"
    elif not getattr(f, "__module__", None):
        return "__module__ attribute is not set"

    module = sys.modules[f.__module__]
    module_source = inspect.getsource(sys.modules[f.__module__])
    module_ast = TypeguardTransformer("_call_memo").visit(ast.parse(module_source))
    instrumentor = Instrumentor(f.__qualname__)
    instrumentor.visit(module_ast)
    if instrumentor.modified_function is None:
        return "cannot find function in AST"

    code = compile(module_ast, module.__file__, "exec", dont_inherit=True)
    new_globals = {}
    exec(code, new_globals)
    new_function = new_globals[f.__name__]
    update_wrapper(new_function, f)
    new_function.__globals__.clear()
    new_function.__globals__.update(f.__globals__)
    return new_function
|
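The path matching in the snippet above relies on the relationship between a function's `__qualname__` and the nesting of definitions in the AST. A small sketch of that relationship, using only the standard library (`PathCollector` and `SOURCE` are hypothetical names for illustration, not part of typeguard):

```python
import ast
import textwrap

# Hypothetical module source with a method nested inside a function.
SOURCE = textwrap.dedent('''
    def container():
        class Foo:
            def mymethod(self):
                pass
        return Foo
''')


class PathCollector(ast.NodeVisitor):
    """Record the dotted path of every class and function definition."""

    def __init__(self):
        self.path = []
        self.found = []

    def _visit_scope(self, node):
        self.path.append(node.name)
        self.found.append(".".join(self.path))
        self.generic_visit(node)
        self.path.pop()

    visit_ClassDef = _visit_scope
    visit_FunctionDef = _visit_scope
    visit_AsyncFunctionDef = _visit_scope


namespace = {}
exec(SOURCE, namespace)
Foo = namespace["container"]()

# The runtime qualified name contains "<locals>" markers for function scopes.
qualname = Foo.mymethod.__qualname__
# Stripping them yields the same dotted path the visitor builds from the AST.
target_name = qualname.replace(".<locals>", "")

collector = PathCollector()
collector.visit(ast.parse(SOURCE))
print(qualname, target_name, target_name in collector.found)
```

One consequence of matching on dotted names alone is that two definitions sharing the same path (for example, a function redefined later in the same scope) cannot be told apart by this approach.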
I'm down to 20 failures. Most of them are due to either the code not yet supporting |
I'm having trouble with closures. Like here:

def test_classmethod_return_valid(self):
    class Foo:
        @classmethod
        @typechecked
        def method(cls) -> Self:
            return Foo()

    Foo.method()

The injected code in |
Ok, looks like the problem is that the newly generated code object doesn't have the free variables the original one did, and that makes the closure incompatible with the new code. I have to trick the compiler to give it the same free variables. |
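To illustrate the free-variable mismatch described here, the following sketch shows CPython rejecting a code object whose free variables do not match the function's closure, and one way to coax the compiler into producing a matching code object: compiling the new body inside a dummy outer function that defines the same names. This is an assumption about what such a trick could look like, not necessarily what the PR does; `make_greeter`, `flat`, `NEW_SOURCE` and `_outer` are hypothetical names.

```python
import types


def make_greeter():
    greeting = "hello"

    def greet(name):
        return f"{greeting}, {name}"

    return greet


def flat(name):
    return name


greet = make_greeter()
print(greet.__code__.co_freevars)  # names the closure must supply

# A code object without free variables cannot replace one that has them.
try:
    greet.__code__ = flat.__code__
except ValueError:
    mismatch_rejected = True
else:
    mismatch_rejected = False

# Compile the replacement inside a dummy outer function so that the
# compiler treats `greeting` as a free variable of the inner function.
NEW_SOURCE = '''
def _outer():
    greeting = None
    def greet(name):
        return f"{greeting.upper()}, {name}"
    return greet
'''
module_code = compile(NEW_SOURCE, "<regenerated>", "exec")
outer_code = next(
    c for c in module_code.co_consts if isinstance(c, types.CodeType)
)
inner_code = next(
    c
    for c in outer_code.co_consts
    if isinstance(c, types.CodeType) and c.co_name == "greet"
)

# Same free variables, so the original closure cells remain compatible.
greet.__code__ = inner_code
print(greet("world"))
```

The original closure cell (still holding `"hello"`) is reused by the new code, which is why the replaced function sees the original value rather than the dummy `None`.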
I managed to fix the test I was focusing on, but a bunch of other tests broke in the process 😓 |
Agh, you can construct a |
Ok, so turns out |
Sweet! Good progress! Are the issues with Self a problem in 3.11, or just in older versions with |
The problems with |
Ok, the problem with |
Managed to fix the |
I fixed all the test failures, except on PyPy where I still get 3 regex mismatches on the error messages, like:
|
I've pushed my changes so far and opened a draft PR as #289. |
I didn't mark the PR as fixing this issue since the actual feature you asked for is still missing. This new PR is just the foundation for that work. |
Strangely, all the failing tests seem to work just fine when run individually. Looks like a test isolation problem. |
Oh, interesting. Is this a known problem with pytest in PyPy? |
I don't think the problem has anything to do with pytest, but is, rather, an artifact of PyPy's garbage collection. Curiously, if I run |
Looks like my initial diagnosis was incorrect. The problem seems to lie in the way |
I managed to fix the test flakiness. The PR still needs quite a bit of polishing before I'm ready to merge it to master, but I think it's safe to say that the hardest part is over now. |
A bit of rubber ducking here 😃 As I continue to refactor the PR, I encountered a tricky issue. If there's a static method in a class within a function, like:

def container():
    class Foo:
        @typechecked
        @staticmethod
        def mymethod(x: int) -> None:
            pass

This would be rewritten as:

def container():
    class Foo:
        @staticmethod
        def mymethod(x: int) -> None:
            from typeguard import CallMemo, check_argument_types
            _call_memo = CallMemo(Foo.mymethod)
            check_argument_types(_call_memo)

In its current state, the instrumented code fails because it creates a new cell for

Now I have two options: either I create a non-conflicting local variable that holds a reference to the function object, or I create a chain of SimpleNamespace objects that would let me keep the reference as it is. Then again, this would be invisible to the end user and would just incur an extra cost in the form of attribute lookups, so maybe I should stick to the method I accidentally discovered and keep adding the function directly to the closure. I just need to tell the AST transformer to change the reference to match if the target function is nested. |
Is this completely solved now? I have some more time this week. |
No, it's not solved yet. This work exhausted me and another project required my attention. I will be able to continue work next Saturday, at the earliest. |
The basic idea is to find a non-conflicting name and assign the function object to the closure under that name so it can be used to create a |
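As a rough illustration of the "non-conflicting name" part of this idea, the sketch below collects every identifier used in a function's source and picks a name that does not collide with any of them. The helper `find_free_name` and the base name `_target_func` are hypothetical and chosen only for this example; the actual PR may select names differently.

```python
import ast


def find_free_name(func_source: str, base: str = "_target_func") -> str:
    """Return a name based on `base` that is unused in `func_source`."""
    tree = ast.parse(func_source)
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            used.add(node.id)
        elif isinstance(node, ast.arg):
            used.add(node.arg)
        elif isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        ):
            used.add(node.name)

    # Append an increasing suffix until the candidate no longer collides.
    candidate = base
    counter = 0
    while candidate in used:
        counter += 1
        candidate = f"{base}_{counter}"
    return candidate


print(find_free_name("def f(x):\n    return x"))
print(find_free_name("def f(_target_func):\n    return _target_func"))
```

The chosen name could then be used as the closure variable that holds the function object, so the injected code can refer to it without shadowing anything the user wrote.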
I managed to fix the problem now, and pushed the changes. I'm still not quite done with the PR though, but soon. |
@marcelroed I marked the PR as ready. There's room for optimization but I'm pretty satisfied with the results thus far. Would you like to review it before I merge? I realize it's a large, complicated changeset so I don't expect you to do any in-depth inspection, just maybe playing around with it a bit. |
Awesome, I'll have a look! Are there any aspects you feel don't have enough tests? |
If there are bugs, they would most likely be triggered by a pathological combination of nested functions, classes in functions and other abominations like that. I'm not 100% confident about generators (vanilla or async) but there are some tests trying to ensure that they work right. |
BTW, I've already started work on the actual feature you requested. AST transformation checks are now working in 4 cases:
What's still missing:
|
I've created a new draft PR that builds on the previous one. I resolved the issues I wrote about in my previous comment, and all forms of assignments to local variables and arguments should be guarded by type checks now. |
I need to improve the type checking error message emitted by bad variable assignments. Something to be done later! |
@marcelroed could I get some comments on the PR? Does it do what you want? |
Yep, working on it now! |
Is your feature request related to a problem? Please describe.
I often need to typecheck calls to functions that are outside of my control, since they cause bugs further down the line.
For example, when calling external functions with multiple return types, I sometimes need to force one of them.
My specific issue relates to using torchtyping with PyTorch. In this context, the type matters at every step, and also considerably helps with documenting the code, as currently I need to add comments for the shape of the tensors everywhere.
Describe the solution you'd like
I would like to add a flag to the typechecked decorator that adds support for inline typechecking. It could look like this.
Describe alternatives you've considered
I've considered the check_type function. It results in a lot of boilerplate and still requires me to add an inline type hint for my IDE to understand the type.
I'm willing to submit an implementation if this is approved. Thanks!