Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement dynamic prefix for subapps #2756

Open
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

achimnol
Copy link
Member

@achimnol achimnol commented Feb 24, 2018

What do these changes do?

This allows use of variables (regular expressions) in the subapp prefixes.

Are there changes in behavior for the user?

Existing codes are not affected.

Related issue number

I am making this PR after discussion with @asvetlov in gitter.

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • I'm already in CONTRIBUTORS.txt
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@achimnol
Copy link
Member Author

achimnol commented Feb 24, 2018

My implementation approach:

  • Instead of calling add_prefix() to all resources in the subapp at the constructor of DynamicSubAppResource class, I strip the matched prefix from the request to resolve the subapp's routes, cloning the request with a new URL.
    • This would incur some performance overheads, but I tried to implement it "correctly" first.
    • Existing codes that use plain subapp prefixes won't be affected.
    • User-defined web handlers still receive the original request object, so I belive that user codes should not be affected in any way.
  • Overlapped variable names are overrided by subapps. (e.g., a subapp with a /{name}.txt resource with a dynamic prefix /{name} will get "def" for the "name" key of match_info if resolved with "/abc/def.txt")

@codecov-io
Copy link

codecov-io commented Feb 24, 2018

Codecov Report

Merging #2756 into master will decrease coverage by 0.08%.
The diff coverage is 92.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2756      +/-   ##
==========================================
- Coverage   97.97%   97.88%   -0.09%     
==========================================
  Files          39       39              
  Lines        7464     7528      +64     
  Branches     1310     1322      +12     
==========================================
+ Hits         7313     7369      +56     
- Misses         48       52       +4     
- Partials      103      107       +4
Impacted Files Coverage Δ
aiohttp/web_app.py 99.09% <100%> (ø) ⬆️
aiohttp/abc.py 100% <100%> (ø) ⬆️
aiohttp/web_urldispatcher.py 98.14% <92%> (-1.08%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 39831ca...245be63. Read the comment docs.

@achimnol
Copy link
Member Author

achimnol commented Feb 24, 2018

To allow access to all overriden variables, we could use ChainMap / MultiDict instead of dict for building UrlMappingMatchInfo class, but I'm not sure if just enumerating multi-values in the nesting order (like its _apps property) would be sufficient or not.
@asvetlov How about your ideas?

@achimnol
Copy link
Member Author

Okay, I implemented using ChainMap. 😉

achimnol added a commit to lablup/backend.ai-manager that referenced this pull request Feb 24, 2018
* Utilize aiohttp's nested application architecture to modularize.

  - Requires aio-libs/aiohttp#2756 to be merged.

* Each extension module must provide:

  - "create_app()" module-level function.
    It should return a pair of an aiohttp.web.Application instance and a
    list of global middlewares.
    If the app instance has "prefix" property key, this will be used as
    the URL prefix: "/v{version:\d+}/PREFIX".  If not set, the module
    name will be used instead.

    > The request.app in global middleware will be the root app instead
      of the extension app.

      To pass the extension app to a global middleware, you should wrap
      it with "web.middleware(functools.partial(mwfunc, app))" in the
      "create_app()" function.
      See example: ai/backend/gateway/ratelimit.py

* Each extension module have access to several global objects such as
  AgentRegistry and the database connection pool.
  The exact list of available "public APIs" is at "server_main()"
  function of ai/backend/gateway/server.py: property set from app to
  subapp.

* Now admin, auth, etcd, events, kernel, server, and vfolder became
  "intrinsic" extension modules that are loaded always and
  automatically.

  - You can add arbitrary app modules which are importable Python
    package module names specified in the BACKEND_EXTENSIONS environment
    variable as comma-separated strings.
@@ -1003,7 +1005,7 @@ def test_url_for_in_resource_route(router):

def test_subapp_get_info(app, loop):
subapp = web.Application()
resource = subapp.add_subapp('/pre', subapp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@asvetlov
Copy link
Member

@achimnol thanks for PR.
It covers 99% but I see the only one problem: what if {name} is present in both dynamic rules: once in subapp prefix itself and other one in routed sub application?

I see the solution:

  1. Add .parent property to UrlMappingMatchInfo. For nested apps it will point on subapp prefix route.
  2. Don't derive UrlMappingMatchInfo from dict but implement collections.abc.Mapping interface and __getitem__ method. The method can look into .parent if the key is absent.

Right now I don't care about performance, clean API is the first target. Later we can add Cythonized version of UrlMappingMatchInfo if needed.

What do you think about?

@achimnol
Copy link
Member Author

achimnol commented Feb 25, 2018

  1. My PR handles such cases using ChainMap (see the test case test_nested_dynamic_subapp_resolution) -- .maps can serve the purpose of .parent you suggest, like the _apps internal property. I think .parent may not be very useful because from the subapp's perspective it is anyway "unknown" how many times it should repeat traversal recursively to get the "right" parent. I thought it is better to expose all maps as-is, like ChainMap already does. However, if you prefer .parent, I can rewrite it along with 2.

  2. The original code was deriving UrlMappingMatchInfo from dict so I followed that style. If you'd like to change this, I can rewrite it.

@achimnol
Copy link
Member Author

achimnol commented Feb 26, 2018

To change UrlMappingMatchInfo to be a self-nested data structure (e.g., to use match_info.parent.parent), I think we need to rewrite its apps property handling as well because it also treats the list of nested apps as its own property rather than having a single reference to the current app -- so that the user code becomes match_info.current_app / match_info.parent.current_app.

Also I still don't get the benefit of having multiple per-app-level match_info objects for a single resolution operation with duplicate references to routes for each of them since sub-app resources are not treated as final destinations.

I think it is sufficient to have nested maps via ChainMap just like apps property. 😕

@asvetlov
Copy link
Member

I've convinced. Let's use chained map approach.
But please wait for my careful review.

Copy link
Member

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for delay and thank you for patience.

I have some comments.
Let's discuss my proposals.

@@ -172,12 +172,13 @@ def url_for(self, *args, **kwargs):
return await self._expect_handler(request)


class UrlMappingMatchInfo(dict, AbstractMatchInfo):
class UrlMappingMatchInfo(AbstractMatchInfo, Mapping):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the fact that UrlMappingMatchInfo is immutable now.


def __init__(self, match_dict, route):
super().__init__(match_dict)
super().__init__()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super() call is not needed, Python machinery will work fine without it.

return tuple(reversed(self._variables.maps))

def add_variables(self, match_dict):
self._variables.maps.append(match_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you should insert a new dict in front of maps -- like we do it for add_app().
Moreover maybe merging add_app() and add_variables() functionality makes sense.
What do you think about _add_app(app, match_dict) method as replacement for add_app.
Adding underscore emphasizes that the method is private. add_app() was never a part of documented public API, I think the change is safe in backward compatibility perspective.

Copy link
Member Author

@achimnol achimnol Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ChainMap resolves from the beginning of its maps. The point here is that add_variables() is called after resolving the subapp's routes, which means that subapp's match variables are added to the chain map first. Since we want to give priority to the innermost subapp's variables in the merged view of this chain map, this append is natural.

I think merging this method into add_app(app, match_dict) is a good idea. 👍

Copy link
Member Author

@achimnol achimnol Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.... merging into a single method causes a lot of problem:

  • When instantiating UrlMappingMatchInfo we already have match_dict (so the chain map begins with a single layer) but _app is None and _apps is an empty tuple. This makes the number of nested maps and apps inconsistent.
  • Changing UrlMappingMatchInfo's constructor to accept an explicit app argument requires Route and Resource classes to keep track of their owner app because resolve() method should be able to indicate the exact app instance to create the match info object. request.app is a self null reference there.
  • add_app() is called at many other places such as test_utils.py and web_app.py where match_dict is out-of-scope.

The root cause of this mess is that app and match_dict are added to UrlMappingMatchInfo at different places and timings. 😕

def __repr__(self):
return "<MatchInfo {}: {}>".format(super().__repr__(), self._route)
return "<MatchInfo {}: {}>".format(dict(self).__repr__(), self._route)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repr(self._variables) instead of repr(dict(self)) is better I believe. There is no reason to hide chained variables.

Copy link
Member Author

@achimnol achimnol Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to make existing test cases that compares the repr result with a constant string to pass. Then I will change those test cases to match with ChainMap's repr result.

@@ -230,8 +231,24 @@ def set_current_app(self, app):
def freeze(self):
self._frozen = True

@property
def variable_maps(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's use just maps name?
maps is shorted, in documentation we could refer to ChainMap.maps. maps is mentally more familiar to people if they know about ChainMap already.

Copy link
Member Author

@achimnol achimnol Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also related to the fact that resolution order of the chain map and its build order are reversed. The use of reversed() here is to be consistent with the order of the return value of apps property.
Also I wanted to make it read-only since ChainMap's maps property is mutable -- so I've used a different name because they are not the same. Still, I'd like to change the name to maps if you still want it. :) If so, I will update the docs as well.

@@ -2181,8 +2189,8 @@ In general the result may be any object derived from

.. class:: UrlMappingMatchInfo

Inherited from :class:`dict` and :class:`AbstractMatchInfo`. Dict
items are filled by matching info and is :term:`resource`\-specific.
Inherited from :class:`collections.ChainMap` and :class:`AbstractMatchInfo`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc is not relevant anymore, Mapping should replace ChainMap.
Also .variable_maps or .maps attribute should be documented.

def add_map(self, match_dict):
if self._frozen:
raise RuntimeError("Cannot change maps stack after .freeze() call")
self._variables.maps.append(match_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consider the following:

async def handler(request):
    return web.Response(text=request.math_info['name'])

subapp = web.Application()
subapp.add_routes([web.get('/{name}', handler)])

app = web.Application()
app.add_subapp('/{name}', subapp)

What name should be returned by the handler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, request.match_info['name'] in the handler will return the matched value of the innermost {name} definition: the route one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect!

self._app = app
self._prefix = prefix

def url_for(self, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see more complex problem.
.url_for() for all resources from sub-application are broken for dynamic prefixed resource.

Sorry, but the problem is blocker for merging.

Copy link
Member Author

@achimnol achimnol Mar 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I mentioned in the gitter channel.
I think rewriting url_for() to have a recursive architecture so that route resources of different types in different subapp levels could be chained requires a lot of work, so I'd like to suggest it as a separate PR.
Maybe we could mark this PR as "provisional" if merged before it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently very busy due to my business work -- once I get some time after a few weeks, I'll give a look on the extension plan for url_for().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that without url_for fix not only url_for() for subapp itself doesn't work but all url_for() calls for subapp resources are broken.
That's why I think everything should be done in single PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree with you if possible it's better to make the changes in a single PR.
While I have to be a little bit away from this PR right now, other people's contributions to this PR are welcome!

@achimnol achimnol force-pushed the dynamic-subapp-prefix branch from 4d33cba to 50ac11d Compare March 8, 2018 05:34
@achimnol achimnol mentioned this pull request Mar 15, 2018
3 tasks
@achimnol
Copy link
Member Author

achimnol commented Mar 15, 2018

Implemented support for url_for().
There is an issue to decide on: unlike the chained match_info, the **kwargs passed to url_for() does not distinguish which key is for which subapp prefix.

  • Let it just do what it does now. Existing code should work as-is.
  • Add a mandatory property name to PrefixedSubAppResource and DynamicSubAppResource.
    Let users prefix specific keys with those names, e.g., "myprefix.version" and "version" will treated differently. In this case, the chained match_dict may be also replaced with a flat dict with prefixed keys. But this requires a little change to existing codes that use subapps.

@asvetlov What do you think?

@achimnol
Copy link
Member Author

Ah, since dynamic prefix is not a released API yet, I think it is safe to make the proposed name-and-predixed-keys scheme optional.
If the user wants to avoid key collision then they can use this optional name prefixes, but when not, they can just use as-is. No breaking for existing codes. Yay.


def get_info(self):
return {'app': self._app,
'prefix': self._prefix}

async def resolve(self, request):
if not request.url.raw_path.startswith(self._prefix):
if not request.url.path.startswith(self._prefix):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not raw_path?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the prefix includes arbitrary unicode characters?

Copy link
Member

@asvetlov asvetlov Mar 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixing PCT encoded and decoded data can make a mess easy.
I believe we should follow the strategy:

  1. Convert all user provided path, prefix etc into normalized (PCT-encoded actually) form by using yarl. Already normalized paths passed as-is by yarl.
  2. Normalize given HTTP path (we do it).
  3. Match normalized path to already pre-normalized patterns.
  4. Return request.match_info['name'] in PCT-decoded form because users most likely want getting 이름 instead of %EC%9D%B4%EB%A6%84

If aiohttp code violates the rule somewhere -- it should be fixed.
Do you agree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I agree to keep the rule consistent. Let me inspect and fix up this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think then PlainResource.__init__() also need this, like PrefixResource:

self._path = URL.build(path=path).raw_path

@achimnol
Copy link
Member Author

achimnol commented Mar 15, 2018

Now if we have a nested dynamic subapp chain:

root_app = web.Application()
outer_app = web.Application()
inner_app = web.Application()
resource = inner_app.router.add_resource('/{var}')
route = resource.add_route('GET', ...)
outer_app.add_subapp('/{var}', inner_app, name='b')
root_app.add_subapp('/{var}', outer_app, name='a')

Then match_dict after resolving "/abc/def/ghi" has the content:

{
  'a.var': 'abc',
  'b.var': 'def',
  'var': 'ghi',
}

route.url_for(**{'a.var': '123', 'b.var': '456', 'var': '789'}) gives:

'/123/456/789'

When inner_app and outer_app has beend added without names,
the match result contains only the innermost variable:

{
  'var': 'ghi'
}

and route.url_for(var='xxx') gives:

'/xxx/xxx/xxx'

No more ChainMap!

@achimnol achimnol force-pushed the dynamic-subapp-prefix branch from ed5565e to 3aa48f7 Compare March 15, 2018 16:53
@achimnol
Copy link
Member Author

Rebased against latest master.

@achimnol
Copy link
Member Author

I think it's ready to get reviewed. :)

@asvetlov
Copy link
Member

Let me meditate for a while.
"name." looks... strange. How to manage a situation if name argument for app.add_subapp was not passed?
Another thing worries me is: suppose I have a common-usable application like admin or debug toolbar.
I'm adding it with variable prefix and the subapp becomes broken: it has a knowledge about its required parents for generating urls.
The worse thing is that breakage appears on subapp usage (url generations for filing HTTP templates), not on subapp adding.

If you have ideas how to make it steady -- please share your vision.
We a free to invite a new API or even restrict subapp type to VariablePrefixAwareSubapp inherited from web.Application but the game rules should be discussed and fixed in stone before the PR merging.

I understand your striving to make subapp prefix flexible but my guts feel that we should be very careful with chosen design. Otherwise fitting a opened can of worms will be painful.

I suggest excluding the PR from aiohttp 3.1 but working hard to publish it as part of aiohttp 3.2
Your idea for chaining match_info is brilliant.
There is a very visible request for providing chained info for nested applications:

P.S.
I'll go to Ho Chi Minh tomorrow evening for a couple weeks. Not sure if I will have enough time to work on aiohttp these dates.
The city is much closer to Seoul than Kiev but unfortunately still so far for meeting with you offline :(

@achimnol
Copy link
Member Author

achimnol commented Mar 17, 2018

I see. You want more isolation between apps -- and I agree with that.
I thought having explicit names was better than nested scopes because it is clear to read.

I'm fine to go back to the first ChainMap design for match_info, keeping my "recursive" implementation.
The problem is url_for().

To allow use url_for() in the innermost ("leaf") app without knowing whatever variables are included in its outer subapp prefixes, then I think we need a strong assumption that url_for() is only used inside request handlers where match_info is available in the current task-local context.
Once url_for() has access to the nested match_info's ChainMap then it can fill the prefix variables using the matched values of the current request's path.

Still we have two issues:

  1. Python asyncio currently lacks task-local context support which automatically propagates to subtasks. To workaround this, it may be necessary to change the url_for() API to accept the request object (since it serves the role of handler-local context now) or its match_info attribute. But this would break use of url_for() in template rendering. access subapp routes via <app-name>:<route-name> #2314's request.url_for() may be an option here.
  2. url_for() may be used to refer URLs with prefixes other than the current request's path has. In this case users need to provide the prefix variables anyway, like my "name." design (of course the concrete method may be different).

@asvetlov What do you think?

@achimnol
Copy link
Member Author

achimnol commented Mar 17, 2018

More thoughts:
Maybe could we ingore the 2nd case if we stick with "isolated" apps?
If each app never calls url_for() against routes in other apps, then filling the prefix variables with the current request's path should be ok. But in the reality, like in one of my Django projects, cross-reference to URLs in other apps in one app's template rendering often happens because a single rendered view of webpage usually involves multiple apps (e.g., dashboard, user authentication, posts, comments, etc.). Nonetheless it is also possible to argue that such cases must be implemented as a single monolithic app — there is no reason to match the semantic of aiohttp's app with Django's app.

See below!

@achimnol
Copy link
Member Author

achimnol commented Mar 17, 2018

More more thoughts:
When composing multiple apps, cross-reference to URLs in other apps usually happen in a specific direction only: from outer apps to inner apps, not to parent/sibling apps.

From the perspective of inner apps, it's safe to assume that the aiohttp's web framework will take care of filling outer prefix variables with the handler-local match_info context.

From the perspective of outer apps, they need to explicitly give the value of prefix variables when calling url_for() of routes of inner apps. In this specific case, we need a way to distinguish the duplicate variable names in the prefix names.

@asvetlov
Copy link
Member

Maybe name prefixes are not bad but I need a time to experiment with.
Regarding to mandatory params from outer apps etc:

  1. We can invite a new API for it, something like request.build_url() (the name is a subject for changing)
  2. Context local variables (implicit context) are not required as we always have explicit context as request.
  3. Inner/outer applications can be useful terminology.

achimnol added 20 commits March 22, 2018 17:30
* This behavior should be consistent with the constructor.
* So let it add an empty dict when nesting.
* Let UrlMappingMatchInfo inherit collections.abc.Mapping instead of
  ChainMap directly.

  - Make the order of inheritance consistent with UrlDispatcher.

* Make the order of the nested dictionaries returned by "variable_maps"
  consistent with the order of the nested apps returned by "apps".
* Kept add_map() (formerly add_variables) and add_app() methods
  sepearte since they are often called in disjoint contexts where the
  other is out-of-scope.

* Renamed variable_maps to maps.

* Updated the docs related to add_subapp(), MatchInfo, and
  DynamicSubAppResource.
* Changed add_prefix() to accept resource instances instead of prefix
  strings.

* url_for() now recursively call the parent's url_for() and prefix
  the current url with it.
  For this, url_for() with PrefixedSubAppResoucre/DynamicSubAppResource
  are no longer errors.

* Changed PrefixedSubAppResource.resolve() to also clone the request
  with the stripped URL like DymaicSubAppResource.resolve().

* This implementation does not break any existing code. :)

* TODO: I will add some test cases and extra checks to the code.
* Fixed some unimplemented/mistaken parts of the previous commit
* Reverted back to use 'raw_path' where it had been used before
  as it still works as desired.
* All paths (even regexs against them) stored in resources are already
  encoded.  They are only decoded when passed back to the user.

* PlainResource was missing this explicit encoding when initialized.
@CLAassistant

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants