Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for overloaded functions in stubgenc generated by pybind11 #5975

Merged
merged 19 commits into from
Jan 29, 2019

Conversation

wiktorn
Copy link
Contributor

@wiktorn wiktorn commented Nov 29, 2018

For overloaded methods pybind generates following docstring:

__init__(*args, **kwargs)
Overloaded function.

1. __init__(self: TestClass, arg0: str) -> None

2. __init__(self: TestClass, arg0: str, arg1: str) -> None

For such docstrings stubgenc currently produces following stub:

def __init__(self, *args, **kwargs) -> Any: ...

With this change stubgenc will generate following stub:

@overload
def __init__(self, arg0: str) -> None: ...
@overload
def __init__(self, arg0: str, arg1: str) -> None: ...

This pull request introduces:

  • class - FunctionSig - representing signature of function/method (function name, arguments, return value)
  • class - ArgSig - representing information about function argument (argument name, type, default value)

infer_sig_from_docstring now returns list of FunctionSig . List contains one object when function is not overloaded, more objects - when it's overloaded. Each object represents signature of function.

infer_arg_sig_from_docstring is a new function that parses argument list for function in docstring. As the type information may contain commas, the code checks, if the comma is outside brackets. I tried to approach that with regex and failed, also tried to use ast module, but its not straightforward to move from ast form back to source code form that's why I settled with naive approach, as this is not performance critical path. Tests for edge cases included in test_infer_sig_from_docstring

@gvanrossum
Copy link
Member

gvanrossum commented Dec 1, 2018

For such docstrings stubgenc will produce following stub:

I presume you mean that stubgenc should produce such a stub, and this PR implements that behavior? (We prefer our commit messages to describe the change rather than the new state of affairs.)

I will try to review soon, but I think this has missed the train for the upcoming 0.650 release (see #5960).

@wiktorn
Copy link
Contributor Author

wiktorn commented Dec 1, 2018

Ok, updated description.

@gvanrossum
Copy link
Member

Hm... Having looked at this a bit more, it really feels like string partitioning is a pretty poor way of parsing something of this complexity. I realize that all of stubgen looks like a hack, but since you are doing your best to make it better, perhaps you can improve the approach to parsing a bit more? Maybe write a tiny recursive-descent parser, using the tokenize module for tokenization?

@wiktorn
Copy link
Contributor Author

wiktorn commented Dec 4, 2018

Thank you for pointing me to tokenize module. Looks like this is something I was looking for. Looks like it is time for me to refresh my knowledge and re-read library reference :-)

I'll start with replacing parsing docstring function declarations. Though quick scan of the code suggests, that some regex usage will remain in the stubgenc/stubutil.

@gvanrossum
Copy link
Member

gvanrossum commented Dec 4, 2018 via email

@gvanrossum
Copy link
Member

Hey @wiktorn, just to let you know, we're planning fairly extensive refactorings of part of stubgen, however, we don't expect the docstring parsing to be affected, so don't worry!

@wiktorn
Copy link
Contributor Author

wiktorn commented Dec 15, 2018

Thank you for the heads up @gvanrossum . I'll finally have more time next week to polish this PR.

@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 12, 2019

@gvanrossum Can you trigger another AppVeyor build? My guess that this failure was totally unrelated to my changes (some exceptions in IPC and ast3 module that are not mentioned in this changes)

@emmatyping
Copy link
Collaborator

@wiktorn if you rebase on master those errors should go away.

@wiktorn wiktorn force-pushed the stubgenc_pybind_arg_type_object branch from 1530871 to ae08bd3 Compare January 12, 2019 23:19
@gvanrossum
Copy link
Member

Sorry I haven't got to review this yet! I missed that you had updated it because you used git push -f. In the future please leave the old revisions alone and just push new changes, it's easier to review.

Note that when #6256 lands this will become one big merge conflict, but according to Ivan it's easy to resolve -- he just moved all the docstring parsing logic to a new file, stubdoc.c.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a few review comments anyway. Sorry again!

mypy/stubgenc.py Outdated Show resolved Hide resolved
mypy/stubgenc.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 26, 2019

I hope that's ok right now. Just let me know how you want to move forward. Shall I till #6256 lands in master and merge master back to this branch?

Sorry for the force push, that's how I understood @ethanhs and since there was no comments on code yet, I've decided to rebase instead of merging master

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG! This looks ready to merge.

I'm leaving it up to @ilevkivskyi which one to merge first: this one (then he has to merge your code back into #6256) or the latter (leaving the merge up to you).

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be glad to merge this before my PR, but I have a bunch of additional comments.

mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/stubutil.py Outdated Show resolved Hide resolved
mypy/test/teststubgen.py Outdated Show resolved Hide resolved
mypy/test/teststubgen.py Show resolved Hide resolved
mypy/test/teststubgen.py Outdated Show resolved Hide resolved
mypy/test/teststubgen.py Show resolved Hide resolved
mypy/test/teststubgen.py Show resolved Hide resolved
Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there are bunch of comments from my previous review that are not implemented. I tried to explain some of them more, in case you didn't understand what to do. Are you going to implement them?

If no, I would probably merge this anyway and fix them myself, but it would be better if you can do this.

mypy/stubgenc.py Outdated
if name == 'getitem':
return '(index)'
return [TypedArgSig(name='index', type=None, default=False)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you would use normal classes instead of named tuples you could define __init__ as:

    def __init__(self, name: str, type: Optional[str] = None, default: bool = False) -> None: ...

and then this an d a dozen others will be just TypedArgSig('index').

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also with normal classes you can define __repr__(), to simplify test cases significantly. You can just check that str(<generated signature>) matches an expected string.

mypy/stubgenc.py Outdated
return [
TypedArgSig(name='name', type=None, default=False),
TypedArgSig(name='value', type=None, default=False)
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For multiline we use this style:

return [TypedArgSig(name='name', type=None, default=False),
        TypedArgSig(name='value', type=None, default=False)]

(also many of these will not need to be multiline if you implement the suggestion above).

mypy/test/teststubgen.py Show resolved Hide resolved
TypedArgSig -> ArgSig
TypedFunctionSig -> FunctionSig
@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 29, 2019

I've missed your comments on conversation tab on GitHub. As I reviewed changes I found your comments and started to address them.

@ilevkivskyi
Copy link
Member

@wiktorn Currently there is one more big change that I proposed: switch from named tuples to normal classes, see #5975 (comment) and #5975 (comment)

After that you could also remove redundant type=None etc. For the tests just use:

class ArgSig:
    ...
    def __repr__(self) -> str:
        r = self.name
        if self.type:
            r += ': ' + self.type
        if self.default:
            r += ' = ...' if self.type else '=...'
        return r

class FunctionSig:
    ...
    def __repr__(self) -> str:
        args = ','.join([str(arg) for arg in self.args])
        return '{}({}) -> {}: ...'

Then many tests can be simplified to just check the string representation of the result. You can still define __eq__() if you want to use direct equality checks in some tests.

@ilevkivskyi
Copy link
Member

(also there is a lint failure now)

@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 29, 2019

And what do you think about this approach:

ArgSig = NamedTuple('ArgSig', [
    ('name', str),
    ('type', Optional[str]),
    ('default', bool)
])

ArgSig.__new__.__defaults__ = (None, False)

As far as I see, it works in Python3.4, and once oldest supported version will be 3.7, we can move this declaration to NamedTuple call itself.

I'm reluctant to use repr strings in tests though.

@ilevkivskyi
Copy link
Member

ArgSig.__new__.__defaults__ = (None, False)

With this call sites will not type-check by mypy.

I'm reluctant to use repr strings in tests though.

Why? If you just want to be sure it is clear from the string form it is a custom class, use something like "FunctionSig('method(self, arg: int) -> Any')" it is still quicker to grasp that the current form. I however don't have strong opinion about this, unlike about the default arguments.

@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 29, 2019

ArgSig.__new__.__defaults__ = (None, False)

With this call sites will not type-check by mypy.

As I'm testing it right now, it rather doesn't pass self-check:
error: "Callable[[Type[NT], str, Optional[str], bool], NT]" has no attribute "__defaults__"

Though it properly detects problems in call sites

error: Argument "type" to "ArgSig" has incompatible type "int"; expected "Optional[str]"
error: Argument "default" to "ArgSig" has incompatible type "str"; expected "bool"

So I'll convert them to normal classes.

I'm reluctant to use repr strings in tests though.

Why? If you just want to be sure it is clear from the string form it is a custom class, use something like "FunctionSig('method(self, arg: int) -> Any')" it is still quicker to grasp that the current form. I however don't have strong opinion about this, unlike about the default arguments.

We could actually have __str__ method which would provide stub signature. The only problem I see is that to cover whole API contract, we would need still to check if types in ArgSig's are properly set, as they are used to add necessary imports

@ilevkivskyi
Copy link
Member

ilevkivskyi commented Jan 29, 2019

Though it properly detects problems in call sites

I meant that mypy will flag as errors the calls with less arguments like ArgSig('index') if you would use the __defaults__ pattern.

So I'll convert them to normal classes.

OK

The only problem I see is that to cover whole API contract, we would need still to check if types in ArgSig's are properly set, as they are used to add necessary imports

Yes, this could be useful only for testing. Anyway, I don't insist on this. You can just define __eq__() instead and keep your current tests.

Create ArgList class instead of NamedTuple and provide defaults for type and
default
@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 29, 2019

I did not add defaults to FunctionSig as it doesn't shorten that much and forces to give a thought, if arguments are properly set.

mypy/stubgenc.py Outdated Show resolved Hide resolved
Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I am going to merge this now. There are some minor formatting changes, but I will take care of those.

@ilevkivskyi ilevkivskyi merged commit d6aef70 into python:master Jan 29, 2019
@ilevkivskyi
Copy link
Member

@wiktorn Thanks for contributing! I have successfully merged my PR on top for yours.

@wiktorn
Copy link
Contributor Author

wiktorn commented Jan 29, 2019

Thank you for your patience @ilevkivskyi and help with getting this merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants