Design and build a plugin architecture #1240

Closed
gvanrossum opened this issue Feb 24, 2016 · 26 comments
Labels: needs discussion, priority-0-high, topic-plugins

Comments

@gvanrossum
Member

One area where this comes up is SQLAlchemy table definitions.

@jstasiak
Contributor

I was just gonna create a similar ticket, code using Injector[1] would be another case here.

[1] https://github.com/alecthomas/injector

@JukkaL
Collaborator

JukkaL commented Jun 7, 2016

I'm enumerating here a bunch of things that this might be good for. These fall roughly under two categories -- metaclasses (or more generally runtime configuration of classes) and callables with atypical signatures.

Metaclasses:

  • SQLAlchemy
  • Django models
  • (and ORMs in general)

Functions with special signatures (some of these are of marginal utility):

  • open() (and similar functions elsewhere; return type depends on mode string argument)
  • re.findall (return type depends on structure of a regular expression)
  • sys.exc_info (return value has no None items within except: block)
  • universal_newlines in subprocess (boolean toggles between str / bytes)
  • str.format and '...' % x (the latter is currently special cased in the type checker)
  • dict(key=value) and dict.update with keyword arguments (the prior is currently special cased)
  • sum (works for anything that supports +)
  • Parts of itertools and functools
  • struct (could infer type from format string)
  • SQLAlchemy queries

For some of these I can also imagine more general type system support instead of a plugin.
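
As one illustration of the "more general type system support" route, a boolean flag that toggles between str and bytes can be described with overloads on a literal argument. This is only a sketch: `run_command` is a hypothetical stand-in rather than anything in subprocess, and `typing.Literal` (Python 3.8+) did not exist when this comment was written.

```python
from typing import Literal, Union, overload


@overload
def run_command(data: bytes, universal_newlines: Literal[True]) -> str: ...
@overload
def run_command(data: bytes, universal_newlines: Literal[False] = ...) -> bytes: ...


def run_command(data: bytes, universal_newlines: bool = False) -> Union[str, bytes]:
    # Hypothetical stand-in for a subprocess-style call: the flag decides
    # whether the caller gets decoded text or raw bytes back.
    return data.decode("utf-8") if universal_newlines else data


text = run_command(b"hello\n", True)  # a checker can pick the `str` overload
raw = run_command(b"hello\n")         # ... and the `bytes` overload here
```

Overloads cover the cases where the result depends on a literal flag; cases such as re.findall, where the result depends on the contents of a string, are not reachable this way.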

@jstasiak
Contributor

jstasiak commented Jun 7, 2016

Two examples of libraries that generate modules and classes based on some form of data schema:

@JukkaL
Collaborator

JukkaL commented Jun 14, 2016

Other functions with tricky signatures:

  • pow (return type varies based on the sign of the exponent)
  • __pow__ (similar)
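
A quick runtime check of the pow behaviour described above (built-ins only, nothing assumed):

```python
# Same function, same argument types, different result types.
positive = pow(2, 3)   # int raised to a non-negative int stays an int
negative = pow(2, -3)  # int raised to a negative int produces a float

print(type(positive).__name__, positive)
print(type(negative).__name__, negative)
```

A signature of the form `(int, int) -> int` is therefore wrong for negative exponents, and `(int, int) -> float` is needlessly imprecise for the common case.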

@dmoisset
Contributor

dmoisset commented Jul 7, 2016

Things like numpy also fall here; some functions accept a "dtype" argument that controls the type of the output. Some functions return values or arrays depending on the arity of the "axis" argument.

@mmerickel

zope.interface is a perfect candidate to add to this discussion as it's often used for this exact purpose in many frameworks including twisted and pyramid.

@miohtama

I have not followed the latest MyPy developments actively, but here's my two cents (Euro). There are some existing type systems out there that can be looked upon as an example. One is TypeScript where one can supply the type definitions as a separate declaration file: https://www.typescriptlang.org/docs/handbook/declaration-files/introduction.html This is done e.g. for jQuery that is a very popular legacy project and cannot be fitted with type declarations in the source code itself.

Taking inspiration from TypeScript, one approach to a type declaration architecture for the framework/metaclass use case could be:

  • There is a standard for a type declaration file

  • Each project that uses its own flexible metaclass- and property-based typing system (SQLAlchemy, zope.interface) can supply a tool that walks the source hierarchy, either as a batch process or at run time, scans all the classes, and creates type definitions for them. With this approach, existing framework runtimes such as SQLAlchemy don't need to be upgraded with new type hinting directives, and the tool can work on old code bases as well.

  • Generated type definitions are saved in a type declaration file

  • MyPy, IDEs, and other tools can consume type declaration files and match type declarations and classes e.g. by the dotted name

  • If the source code changes, one simply has to run the framework-specific type declaration generation tool again

  • This approach does not solve the dynamic return type issues like re.findall and sqlalchemy.Query(MyModel).all() -> Iterable[MyModel] and this question is still left open
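
The run-time variant of the generator idea above can be sketched roughly as follows. Everything here is hypothetical: `Column` is a toy descriptor standing in for a framework's field type, and `generate_stub` is not an existing tool.

```python
from typing import List


class Column:
    """Hypothetical ORM-style field; `python_type` is what attribute reads return."""

    def __init__(self, python_type: type) -> None:
        self.python_type = python_type


class User:
    id = Column(int)
    name = Column(str)


def generate_stub(cls: type) -> str:
    """Render a .pyi-style class body from the Column attributes found at run time."""
    lines: List[str] = [f"class {cls.__name__}:"]
    for attr, value in vars(cls).items():
        if isinstance(value, Column):
            lines.append(f"    {attr}: {value.python_type.__name__}")
    if len(lines) == 1:
        lines.append("    ...")  # no fields found; keep the stub syntactically valid
    return "\n".join(lines)


stub_text = generate_stub(User)
print(stub_text)
```

The generated text would then be written to a declaration file and regenerated whenever the models change, which is the synchronization cost the replies below pick up on.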

@dmoisset
Contributor

@miohtama mypy already takes a similar approach by allowing separate annotations for third-party files that cannot be annotated inline, so the idea of a "generator" can be useful in some cases. But I think caching the generated result might be a premature optimization. A first step should probably be to provide an API for plugins to walk "flexible" modules statically and programmatically return the data types. With that API you could integrate a plugin directly into mypy (and ensure you're always type-checking an up-to-date definition), and also use it in a separate "stub generator" tool if you want more performance (the first option would be nicer when developing, and the second for things like CI or IDEs).

@JukkaL
Collaborator

JukkaL commented Jan 26, 2017

@miohtama It seems that stubs are already (sometimes) suitable for what you are thinking about. I agree with @dmoisset that it would be better if the plugins were deeply integrated into mypy so that there would be no need to run a separate generator tool. In order to always keep the generated stubs in sync with changes in user code, we'd probably have to run the generator tool before each mypy run, and this could actually slow down type checking significantly, as a large project may want to run multiple generators. (A large project probably depends on many third-party libraries, many of which may benefit from a stub generator.) If the plugins are integrated into mypy, there doesn't have to be significant performance overhead, assuming that the plugins only work at the AST/static checking level, i.e. they don't actually try to import user code into the Python runtime.
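
To make "work at the AST/static checking level" concrete, here is a minimal sketch that reads field declarations out of source text without importing or executing it. mypy operates on its own AST; the standard library's `ast` module is swapped in here only to keep the example self-contained, and the `Column(...)` convention is an assumption.

```python
import ast
from typing import Dict

# Base, Column, Integer and String are never defined or imported here:
# the source below is only parsed, not run.
SOURCE = '''
class User(Base):
    id = Column(Integer)
    name = Column(String)
'''


def column_types(source: str) -> Dict[str, str]:
    """Map attribute name -> name of the first argument passed to Column(...)."""
    found: Dict[str, str] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        call = node.value
        if (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == "Column"
            and call.args
            and isinstance(call.args[0], ast.Name)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        ):
            found[node.targets[0].id] = call.args[0].id
    return found


print(column_types(SOURCE))
```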

@JukkaL JukkaL self-assigned this Jun 5, 2017
JukkaL added a commit that referenced this issue Jun 7, 2017
…#3501)

Implement a general-purpose way of extending type inference of
methods. Also special case TypedDict get and `int.__pow__`.
Implement a new plugin system that can handle both module-level 
functions and methods.

This is an alternative to #2620 by @rowillia. I borrowed some test
cases from that PR. This PR has a few major differences:

* Use the plugin system instead of full special casing.
* Don't support `d.get('x', {})` as it's not type safe. Once we
  have #2632 we can add support for this idiom safely.
* Code like `f = foo.get` loses the special casing for get.

Fixes #2612. Work towards #1240.
@chadrik
Contributor

chadrik commented Jun 7, 2017

Here are some thoughts on the user-plugin aspect of this PR.

Plugin discovery options

  • A. by file path. e.g. /path/to/myplugin.py. could also extend this with a MYPY_PLUGIN_PATH
    • pro: easier to write test cases (I discovered that placing a file on the PYTHONPATH within the tests was difficult, likely by design)
    • con: can't use pip to install plugins
  • B. by dotted path: e.g. package.module
    • pro: easy for users to create pip-installable plugins
    • con: adding plugin modules and their requirements to the PYTHONPATH could interfere with type checking?
  • C: setuptools entry points. e.g.:
    setup(
        entry_points={
            'mypy.userplugin': ['my_plugin = my_module.plugin:register_plugin']
        }
    )

Other questions

  • shall we always look for an object within the module with a designated name, e.g. Plugin, or make this configurable as well? e.g. package.module.MyPlugin or /path/to/myplugin.py:MyPlugin (Note: I've already written some functionality for the latter in my hooks branch)
  • do user plugins need to inherit from mypy.plugin.Plugin?
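
Discovery options A and B, together with a configurable object name, could be resolved along these lines. This is a sketch under assumptions: the `location:Name` syntax is applied to both file paths and dotted paths, `Plugin` is used as the default name, and both helper functions are hypothetical.

```python
import importlib
import importlib.util
import os
from typing import Any, Tuple

DEFAULT_NAME = "Plugin"  # assumed default; the thread leaves the name open


def split_spec(spec: str) -> Tuple[str, str]:
    """Split 'location:ObjectName' into its parts, defaulting the object name."""
    location, sep, name = spec.rpartition(":")
    # A colon that is not followed by an identifier (e.g. a drive letter in a
    # Windows path) is treated as part of the location.
    if not sep or not name.isidentifier():
        return spec, DEFAULT_NAME
    return location, name


def load_plugin_object(spec: str) -> Any:
    """Load the named object from a .py file path or an importable module."""
    location, name = split_spec(spec)
    if location.endswith(".py"):
        # Option A: load from a file path.
        module_name = os.path.splitext(os.path.basename(location))[0]
        module_spec = importlib.util.spec_from_file_location(module_name, location)
        if module_spec is None or module_spec.loader is None:
            raise ImportError(f"cannot load plugin file {location!r}")
        module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(module)
    else:
        # Option B: import by dotted path from sys.path.
        module = importlib.import_module(location)
    return getattr(module, name)


# A standard-library module stands in for an installed plugin package.
dumps = load_plugin_object("json:dumps")
```

Option B runs the plugin's imports in the same interpreter as the checker, which is where the "could interfere with type checking?" concern above comes from.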

Plugin chainability options

  • A. aggregate all user plugins into a single uber-plugin instance.
    • each method on this aggregate plugin would cycle through its children in order until one returns a non-None result. we could then cache the mapping from feature (e.g. 'typing.Mapping.get') to user-plugin instance to speed up future lookups.
    • this is compatible with the current design which passes a single Plugin instance around.
  • B. register a plugin per feature. this allows you to replace the search with a fast dictionary lookup, as well as detect up-front when two plugins contend for the same feature.
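
The two chainability options can be compared in a small sketch. The classes below are simplified stand-ins, not mypy's Plugin API: a "hook" here is just a function from a string to a string.

```python
from typing import Callable, Dict, List, Optional

Hook = Callable[[str], str]  # simplified stand-in for a real hook signature


class ChildPlugin:
    """Hypothetical user plugin that handles a fixed set of feature names."""

    def __init__(self, name: str, features: List[str]) -> None:
        self.name = name
        self.features = set(features)

    def get_hook(self, fullname: str) -> Optional[Hook]:
        if fullname in self.features:
            return lambda call: f"{self.name} handled {call}"
        return None


class AggregatePlugin:
    """Option A: ask each child in order and remember which one answered."""

    def __init__(self, children: List[ChildPlugin]) -> None:
        self.children = children
        self._cache: Dict[str, Optional[ChildPlugin]] = {}

    def get_hook(self, fullname: str) -> Optional[Hook]:
        if fullname not in self._cache:
            self._cache[fullname] = next(
                (c for c in self.children if c.get_hook(fullname) is not None), None
            )
        owner = self._cache[fullname]
        return owner.get_hook(fullname) if owner is not None else None


def build_registry(children: List[ChildPlugin]) -> Dict[str, ChildPlugin]:
    """Option B: one plugin per feature, with conflicts detected up front."""
    registry: Dict[str, ChildPlugin] = {}
    for child in children:
        for feature in sorted(child.features):
            if feature in registry:
                raise ValueError(
                    f"{feature!r} claimed by both {registry[feature].name} and {child.name}"
                )
            registry[feature] = child
    return registry


first = ChildPlugin("first", ["typing.Mapping.get"])
second = ChildPlugin("second", ["typing.Mapping.get", "builtins.open"])
aggregate = AggregatePlugin([first, second])
```

With option A the earlier plugin silently wins for `typing.Mapping.get`; with option B, `build_registry([first, second])` raises because both plugins claim it.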

@gvanrossum gvanrossum added the topic-plugins label Jun 13, 2017
@gvanrossum
Member Author

How much more before we can close this?

@JukkaL
Collaborator

JukkaL commented Jun 14, 2017

I'll create separate issues for the various things we could use plugins for and close this issue.

@JukkaL
Collaborator

JukkaL commented Jun 14, 2017

I added various issues about how we could use the plugin system. Feel free to create new issues for additional things the plugin system could be useful for. Closing this issue -- future discussions will happen elsewhere.

@JukkaL JukkaL closed this as completed Jun 14, 2017
@jstasiak
Contributor

Hi, I hope I'm not too late for the party here. After watching "Idris: Type safe printf" (https://www.youtube.com/watch?v=fVBck2Zngjo) I had some ideas for "atypical signature" kind of plugins which I think I can combine in a code sample below.

Let's say we want to model something like functools.partial (assuming it's a regular function):

def partial_type(arguments: Arguments) -> Callable:
    callable = arguments.by_name('callable')
    
    # Below we walk the provided arguments and keyword arguments types, make sure
    # they match the types of the callable parameters and amend the callable type
    # to return to express the fact that some arguments are already provided

    for a in arguments.args:  # arguments.args is a list of types
        corresponding_parameter = callable.parameters.by_index(0)
        assert a == corresponding_parameter.type
        callable.parameters.pop_by_index(0)

    for name, value in arguments.kwargs:  # arguments.kwargs is a list of tuples of (str, type)
        corresponding_parameter = callable.parameters.by_name(name)
        assert value == corresponding_parameter.type
        callable.parameters.pop_by_name(name)

    return callable
    
@dynamic_return_type  # or something
def partial(callable: Callable, *args, **kwargs) -> partial_type:
    ...  # body elided

What I believe is nice about an approach like this:

  • no plugin registration needed
  • code that generates return type can be declared almost inline
  • no classes or inheritance, only plain functions

The hard part here is that mypy (and anything using such annotations) would have to actually execute part of the source code being analyzed (possibly restricted to some pure subset).

Food for thought.

@JukkaL
Collaborator

JukkaL commented Jun 14, 2017

@jstasiak A proposal similar to yours has been discussed before, and we decided to have the plugins live outside user code. Here are some of the primary reasons why we didn't go with 'inline plugins':

  • Mypy is a static checker that doesn't execute the code being checked, and running some parts of the checked application code would confuse the mental model users have.
  • The plugins may have complex dependencies on mypy internals, and we don't want user code to depend on mypy internals.
  • The plugin and mypy internal APIs are not going to be stable, so we will actually discourage people from writing their own plugins that live outside the mypy repo unless they have a very pressing need. Making the programming model as simple as possible is thus not very high priority -- making the plugin system easy to implement, easy to evolve and flexible are more important, and the current design seems to fit those priorities pretty well.
  • For library modules mypy uses typeshed, and we don't want to add mypy-specific plugin functionality to typeshed, since the stubs are used by other tools as well. We also don't want to standardize a cross-tool plugin API (at least right now) since it would be too hard to do and restrict our flexibility to evolve the plugin system.

@SirEdvin

SirEdvin commented Jul 10, 2018

@JukkaL, hmm... So what should I do if I want to provide mypy type hinting for my custom dataclasses library? Write the plugin directly in the mypy source code? And should everyone who uses some Python magic do the same?

@gvanrossum
Member Author

gvanrossum commented Jul 10, 2018 via email

@lubieowoce

@JukkaL Given the team's stance on user plugins ("we will actually discourage people from writing their own plugins that live outside the mypy repo"), I wanted to ask: would it be okay if I linked my notes for developing one here? (prefixed with huge disclaimers about API instability etc.) Since the API isn't really documented¹, it took me quite a while to figure out how to get anything going, and I thought I might save someone that effort. Of course, if you'd prefer to keep a higher barrier to entry, I'll respect that.

¹ I couldn't find anything except for this autogenerated docpage, GVR's comment above, and some bits of information scattered around the issue tracker.

@ilevkivskyi
Member

we will actually discourage people from writing their own plugins that live outside the mypy repo

Where do you read this? I don't think this is true.

@SirEdvin

@ilevkivskyi Here:
Comment screenshot

From this comment.

@ilevkivskyi
Member

OK, thanks! I think this statement is outdated (and maybe was more in response to a concrete proposal). Although the plugin API is still unstable (i.e. no guarantees about backwards compatibility), we now support user-installed plugins. One can just install a mypy plugin using pip install and activate it in mypy.ini by writing plugins = plugin_a, plugin_b.

Maybe @JukkaL can add more.

@lubieowoce

lubieowoce commented Sep 19, 2018

@ilevkivskyi I'm glad to hear that! From reading the code, I also figured out that you can do
plugins = user_plugin_a:some_function
to use user_plugin_a.some_function(...) as the hook mypy will call to obtain the actual UserPluginA(mypy.plugin.Plugin) class-object. (user_plugin_a.plugin(...) will be used by default)
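
Put together, a `user_plugin_a.py` module matching the description above might look like the sketch below. It does nothing useful yet. The single `version` argument is, as far as I know, what mypy passes to the entry point, but the API is unstable, so treat the details as assumptions and check them against the mypy version in use. The `ImportError` fallback exists only so the sketch can be imported without mypy installed.

```python
from typing import Callable, Optional, Type

try:
    from mypy.plugin import Plugin
except ImportError:
    # Stand-in base class so this sketch still imports without mypy.
    class Plugin:  # type: ignore[no-redef]
        def __init__(self, options: object) -> None:
            self.options = options


class UserPluginA(Plugin):
    def get_function_hook(self, fullname: str) -> Optional[Callable]:
        # Return a callback for the functions this plugin cares about.
        # Returning None means "not mine" and keeps mypy's default behaviour.
        return None


def plugin(version: str) -> Type[Plugin]:
    """Default entry point, used for `plugins = user_plugin_a`."""
    return UserPluginA


def some_function(version: str) -> Type[Plugin]:
    """Alternative entry point, used for `plugins = user_plugin_a:some_function`."""
    return UserPluginA
```

Note that the entry point returns the class itself, not an instance; mypy does the instantiation.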

@JukkaL
Collaborator

JukkaL commented Sep 19, 2018

Yeah, you can now have plugins that are installed separately from mypy. I still think that it may be worth having plugins for some 3rd party libraries in the mypy repo, but that really only makes sense if the APIs are relatively stable. There are at least three reasonable ways to maintain a plugin:

  1. The plugin lives completely separately from mypy. The main tradeoff here is that it's possible that mypy changes break your plugin, forcing you to update your plugin every once in a while. There is also a risk for users -- at some point in the future the plugin maintainers may lose interest and the plugin will stop working with recent mypy versions.
  2. The plugin lives in the mypy repository. Here the main benefit is that the mypy team will ensure that the plugin keeps working even if the plugin API changes. On the other hand, if the library API is changing rapidly or in incompatible ways, there may be incompatibility issues that are hard to work around, and the plugin may become out of date with respect to the library API. This obviously requires that the mypy team is ready to maintain the plugin.
  3. Start with a plugin that is separate from mypy and move it to the mypy repository once things are stable enough.

@lubieowoce

lubieowoce commented Sep 19, 2018

@JukkaL since you're here, is there a way to safely inject additional definitions into a file from a hook? If my plugin has a class decorator hook, I can access the class's MypyFile by poking around in the ClassDefContext my hook receives, but adding anything to the file's defs list sounds like a recipe for trouble since the list is being iterated over when the hook is called.

(I need to turn this:

@triggers_hook
class A:
    ...

into this:

class A:
    ...

class B(A):
    ...

for typechecking purposes. But since you can't normally do

class A:
    class B(A):
        ...
    ...

, I need to add B at the module level. Although maybe mypy can handle an inner class like that?)
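
For reference, this is why the nested form fails at run time in plain Python, along with one way to get an equivalent `A.B` after the fact. It only illustrates the language rule; it says nothing about what a mypy plugin can or cannot synthesize.

```python
error = None
try:
    class A:
        class B(A):  # `A` is not bound until the outer class body has finished
            ...
except NameError as exc:
    error = exc


# Defining both at module level and attaching afterwards works at run time.
class A:
    ...


class B(A):
    ...


A.B = B
```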

@JukkaL
Collaborator

JukkaL commented Sep 20, 2018

@lubieowoce Adding definitions to MypyFile in a plugin is not supported at the moment.

@lubieowoce

lubieowoce commented Sep 24, 2018

[edit: I feel like I shouldn't spam this thread with questions, is there a good place for that?]
@JukkaL I got around it by putting the subclass into the parent class, works okay for my purposes 👍

Another question: A method_hook receives a MethodContext object with a context: Context attribute – usually the expression that triggered the hook. Is the hook allowed to modify context in any way, e.g. to transform the AST into a form that's more palatable to mypy?

(in particular, I'd like to turn certain method calls like x.is_Y() into isinstance(x, Z). I'm trying to do that at the AST level, because a literal isinstance(...) seems like the only recognized way to branch on type.)
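
The difference between the two spellings, from the checker's point of view, can be sketched like this. `Circle`, `Square` and `is_circle` are hypothetical names standing in for the `x.is_Y()` pattern above.

```python
from typing import Union


class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def is_circle(self) -> bool:
        return True


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def is_circle(self) -> bool:
        return False


Shape = Union[Circle, Square]


def size_with_isinstance(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return shape.radius  # narrowed to Circle by the isinstance check
    return shape.side        # narrowed to Square in the remaining branch


def size_with_method(shape: Shape) -> float:
    if shape.is_circle():
        # A plain bool-returning method gives the checker no reason to narrow,
        # so `shape.radius` alone would be reported; the assert narrows it.
        assert isinstance(shape, Circle)
        return shape.radius
    assert isinstance(shape, Square)
    return shape.side
```

Both functions behave the same at run time; only the first type-checks without the extra asserts.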
