Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace aiohttp.ClientSession with aiohttp.create_session() #2473

Closed
asvetlov opened this issue Nov 7, 2017 · 36 comments
Closed

Replace aiohttp.ClientSession with aiohttp.create_session() #2473

asvetlov opened this issue Nov 7, 2017 · 36 comments

Comments

@asvetlov
Copy link
Member

asvetlov commented Nov 7, 2017

create_session() should accept the same params as ClientSession but return a oject with __aenter__/__aexit__ and __await__ methods.
Thus everything is supported except client = aiohttp.ClientSession() syntax: client = await aiohttp.create_session() should be used instead.

It prevents hard to debug things like making a session in global namespace:

class A:
     client = aiohttp.ClientSession()

The change is backward compatible: we don't get rid of ClientSession object and even expose it by aiohttp.__all__.
But it's a huge change from user perspective: we will rewrite all documentation to encourage a new way.

@asvetlov asvetlov added this to the 3.0 milestone Nov 7, 2017
@argaen
Copy link
Member

argaen commented Nov 7, 2017

In our projects we currently use it as in your example:

class A:
     session = aiohttp.ClientSession()

This way we reuse the session in all the calls done and we can do it in the __init__ because no await is required. I have couple of questions regarding the proposal:

  • Why is hard to debug making the session inside the class namespace?
  • If the correct pattern in the future is to use create_session which requires await, what's the alternative to reuse sessions? create the session outside the class and propagate it through args? initialize the session lazily?
  • Why is it better the new way?

IMO as a user, unless I'm missing something I prefer initializing the class directly without having to worry if there is loop available when I create the session or that I'm inside an async function

@kxepal
Copy link
Member

kxepal commented Nov 7, 2017

@argaen
global shared mutable objects are not a good practice, no matter what. Just pass session as parameter to your class and create it somewhere before it.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 7, 2017

In your case you should monkey-patch the session in test suites: every test is run in own test loop.
The best way to work with asyncio objects from coroutines only, in this way default loop (asyncio.get_event_loop()) is always the same as used for running async function in Python 3.5.3+.

BTW aiopg/aioredis and others works in this way already.

@argaen
Copy link
Member

argaen commented Nov 7, 2017

@kxepal its not a global shared object. This object is scoped inside the class that is going to use it which makes that class standalone and able to work without needing external setup.

Passing the session is an overhead, we may have 10-20 outbound services we need to contact from our stack. Having to create 10-20 sessions outside and passing them to their respective IMO is worse than just initializing the sessions inside its own class.

@asvetlov in this case I'm talking more about usage in real code rather in tests. About aioredis yes I'm aware, I'm creating the pools lazily inside the class for example https://github.com/argaen/aiocache/blob/master/aiocache/backends/redis.py#L19 https://github.com/argaen/aiocache/blob/master/aiocache/backends/redis.py#L190 which forces me to end up adding more logic than I would like to initialize the pool compared to if I had a way of doing it in my __init__

@asvetlov
Copy link
Member Author

asvetlov commented Nov 7, 2017

@argaen class variable is a global object, just like a module variable.
The class variable is created at module import time, that's it.

@kxepal
Copy link
Member

kxepal commented Nov 7, 2017

@argaen
It's an object which get shared across all the class instances. One instance can modify it and the rest ones will share these changes. That's a bad practice.

Passing the session as argument is not an overhead, but a good design, where you can control the session instance and decide what to pass: some special session with custom configuration for specific service or some common one which fits everyone. Testing also turns into fun.

@argaen
Copy link
Member

argaen commented Nov 7, 2017

@argaen class variable is a global object, just like a module variable.
The class variable is created at module import time, that's it.

Ah, sorry for the confusion but in our case we use it per class instance, not as class variable. We initialize the sessions in __init__

@kxepal in our case the configuration for sessions comes from config files which is pulled by the class initializing it.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 7, 2017

In this case lazy evaluation is an option.
Or add async def configure(self) method for async post init (the method should be called just after class constructor).

@argaen
Copy link
Member

argaen commented Nov 7, 2017

My point here is that its True global shared variables are bad practices but here is where developer responsibility comes. IMO the public API shouldn't be restrictive and let the user decide what to do. Ofc if we keep ClientSession public without deprecating then I'm happy with it, I would like to keep it available for use cases like ours

In this case lazy evaluation is an option.
Or add async def configure(self) method for async post init (the method should be called just after class constructor).

Still extra work compared to just doing it in the __init__ calling ClientSession :)

@asvetlov
Copy link
Member Author

asvetlov commented Nov 7, 2017

I want deprecate ClientSession.
Deprecate but not remove, at least for a while.
Let's encourage people to use it properly.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 7, 2017

I'd be very careful with such change. technically that means that configuration of application can not be made outside of async context. this will complicate configuration stage and increase learning curve.
are there any evidence that current design cause problems? and tests are just confirmation of your code, they are not goal.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 7, 2017

@fafhrd91 I have a gut feeling that we need the change.
My plan is not removing ClientSession support but hiding it under a new factory function.
Just like asyncio.open_connection or aiopg.create_engine.
Let's make it and look on feedback.
The change is backward compatible, we can revert it if the time will prove that I'm mistaking.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 7, 2017

I feel opposite, specifically to configuration stage.

you need to find better example than open_connection, open_connection is runtime function it is not configuration. you won't get any feedback, people would just curse library and continue to use or leave.

I still don't see any evidence that ClientSession is the problem.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 7, 2017

@fafhrd91 do you suggest to close the issue?

I can live with it: https://github.com/aio-libs/aiohttp/blob/master/aiohttp/client.py#L82-L91 raises a warning if implicit loop is not running.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 7, 2017

I just want this change be part of design effort to build configuration story for aiohttp based projects, rather than ad-hoc change without clear pros.

@pfreixes
Copy link
Contributor

pfreixes commented Nov 7, 2017

The good point about the @asvetlov proposal is that makes the API more explicit and avoid the prone error pattern that we have right now, even though is protected with some loop checks. Therefore IMHO whatever that means make the API more consistency makes sense for me.

Regarding the comment from @fafhrd91 can you evolve a bit more the meaning of::

I just want this change be part of design effort to build configuration story for aiohttp based projects, rather than ad-hoc change without clear pros.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 8, 2017

main problem is, right now we do not have any consistent configuration boundaries. everything is ad-hoc. and I believe this big WTF at the moment. as opposite to runtime, in runtime everything works very well. this change moves session configuration to async context, so we would need two different type of context for configuration, which is sucks. I am fine with async context for configuration, but then aiohttp has to provide framework for it and properly define boundaries.

I believe, configuration stage should happen in sync context

@kxepal
Copy link
Member

kxepal commented Nov 8, 2017

@fafhrd91
When configuration get stored in something like Consul, you would like to have async configuration support rather than have two different clients to work with it. I don't think there is a big problem to have async configuration support by default. In the end, async function with no awaits behaves like regular one - it doesn't yields the loop.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 8, 2017

If you pulling confit from consul, then you probably very far in system design. I am more about simple projects. Creating simple scripts with aiohttp is wtf even for me.

@pfreixes
Copy link
Contributor

pfreixes commented Nov 8, 2017

@fafhrd91 something like this ?

from aiohttp import config

config.client_session('microservice', limit=10, ..)

async def foo():
    session = await aiohttp.session_from_config('microservice')

The good think of having the config load not in the runtime but in the starting time is allowing the aiohttp sever to fail fast in case of the config is broken that does not allow to start the process.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 8, 2017

@pfreixes I like this idea. I am not sure about arbitrary strings as session ids, but that is probably fine.
@asvetlov ?

this would also solve problem of client session misuse. often users create new client session for each request

@argaen
Copy link
Member

argaen commented Nov 8, 2017

Aren't we deriving the conversation a bit? I agree a config object would be nice to have but I see it a bigger change and maybe we need another issue to discuss it?

For now, being able to do session = ClientSession() at config time is enough at least for us and that's why I think the new way would just make things more difficult.

One thing I like to apply when coding with async. Don't make functions or interactions asynchronous unless there is a reason to, and by reason I mean another async call inside that interaction exists.

@pfreixes
Copy link
Contributor

pfreixes commented Nov 8, 2017

@argaen remember that the proposed change would be backward compatible, meaning that legacy code could still use the next aiohttp release without changing the code 😉

@argaen
Copy link
Member

argaen commented Nov 8, 2017

@asvetlov said ClientSession() would be deprecated (#2473 (comment)) so this means at some point there would only be an asynchronous mode of creating a new session. That's what I want to avoid

@pfreixes
Copy link
Contributor

pfreixes commented Nov 8, 2017

Deprecate does not mean get rid of something IMHO. In fact, the issue has a clear mention of that

The change is backward compatible: we don't get rid of ClientSession object and even expose it by aiohttp.all.

But obviously get rid of this interface would mean break many and many systems. Therefore I support your concern if this is the case at last.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 9, 2017

What synchronous configuration are you talking about?
Any modern async program should have async def main() function to do all work.
Like any synchronous program has def main().
What the beast is sync configuration? Why it must be synchronous at all?

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 9, 2017

we just need to replace def __init__(self): with async def __init__(self): and we could finally live in brave new async world!

I think this is bad decision. but you do not need to relay on my opinion, if other commuters agree this is good idea, go ahead and implement it.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 9, 2017

@fafhrd91 async def __init__(...) breaks super(),__init__() call -- that's why I strongly don't support this kind of magic.
But in general I think the client session should be created inside async functions only.

You disagree with this design decision.
We should not rush, it's not the most important change.
Let's keep the issue open and return to discussion later.
create_session() is not showstopper for upcoming aiohttp 3.0 because the proposal doesn't require backward incompatible changes. We could apply it to 3.1 if will not give a quorum quickly.

P.S.
For example #2483 and #2475 are more dramatic: forcing users to await writes might break some code, but we have a great opportunity to do it because the next 3.0 release is very major.

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 9, 2017

__init__ was a joke :)

I do not use much of aiohttp and python anymore, so do not strongly rely on me.

@asvetlov
Copy link
Member Author

asvetlov commented Nov 9, 2017

@fafhrd91
Copy link
Member

fafhrd91 commented Nov 9, 2017

yes, rust. I am working on async web framework, it already has ws and http2 support. and it is fast.

@pfreixes
Copy link
Contributor

@asvetlov let me reuse this thread to ask you something about the following message

>>> from aiohttp import ClientSession
>>> session = ClientSession()
__main__:1: UserWarning: Creating a client session outside of coroutine is a very dangerous idea
Creating a client session outside of coroutine
client_session: <aiohttp.client.ClientSession object at 0x1032fa6d8>

I have two questions:

  1. If the developer wants to reuse the connections and does not want to close the session at the end of the context, how the user can make it programmatically without having this annoying message?

  2. Which is the rationale against the synchronous instantiation?

@rooterkyberian
Copy link

The quick fix for "annoying message" is to explicitly pass loop param to ClientSession if you want to create it synchronously.

@RunningToTheEdgeOfTheWorld

In your docs :

Don’t create a session per request. Most likely you need a session per application which performs all requests altogether.More complex cases may require a session per site, e.g. one for Github and other one for Facebook APIs. Anyway making a session for every request is a very bad idea.

if you want "a session per application", so its not convenience to use context manager, and if you pass loop param, it will get some circle import problem.

@asvetlov
Copy link
Member Author

Why context manager?

async def on_startup(app):
    app['session'] = ClientSession()

async def on_cleanup(app):
     await app['session'].close()

@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants