
Q: Is it a good way to use ClientSession like this? #3604

Closed
yjqiang opened this issue Feb 11, 2019 · 11 comments

Comments

@yjqiang

yjqiang commented Feb 11, 2019

import aiohttp


class Crawler:
    def __init__(self, roots,
                 exclude=None, strict=True,  # What to crawl.
                 max_redirect=10, max_tries=4,  # Per-url limits.
                 max_tasks=10, *, loop=None):
        ...
        self._session = None

    @property
    def session(self):
        # Create the session lazily, on first use.
        if self._session is None:
            self._session = aiohttp.ClientSession(loop=self.loop)
        return self._session

    async def one_request(self, url):
        async with self.session.get(url) as rsp:
            return await rsp.json()

I want to use one ClientSession to handle many requests (to the same host).
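(For reference, that one-session-many-requests pattern can be sketched as follows; the URLs are placeholders and this is independent of the crawler code above.)

```python
import asyncio

import aiohttp


async def fetch(session: aiohttp.ClientSession, url: str) -> int:
    # Reuse the shared session: its connection pool keeps sockets to the
    # same host alive between requests.
    async with session.get(url) as rsp:
        return rsp.status


async def main() -> None:
    # One session for the whole program, created and closed in async code.
    async with aiohttp.ClientSession() as session:
        urls = ['https://example.com/a', 'https://example.com/b']
        statuses = await asyncio.gather(*(fetch(session, u) for u in urls))
        print(statuses)


if __name__ == '__main__':
    asyncio.run(main())
```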

@aio-libs-bot

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are #2861 (Is there any way to not re-encode url passed to ClientSession._request() ?), #329 (Unittests for ClientSession), #2451 (Documentation for ClientSession), #3245 (good design!), and #752 (There is no way to 'del_route').

@asvetlov
Member

class Crawler:
    def __init__(self, roots,
                 exclude=None, strict=True,  # What to crawl.
                 max_redirect=10, max_tries=4,  # Per-url limits.
                 max_tasks=10, *, loop=None):
        ...
        self._session = aiohttp.ClientSession()

    async def close(self):
        await self._session.close()


async def main():
    crawler = Crawler()
    try:
        await crawler.crawl()
    finally:
        await crawler.close()

asyncio.run(main())

Or use `async with Crawler() as crawler:`, but then you need to implement `__aenter__` / `__aexit__`.
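A minimal sketch of that `async with` variant, with the constructor reduced to session handling and `crawl()` stubbed out:

```python
import asyncio

import aiohttp


class Crawler:
    def __init__(self):
        self._session = None

    async def __aenter__(self):
        # Create the session here, inside a running event loop.
        self._session = aiohttp.ClientSession()
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Always close the session, even if crawling raised.
        await self._session.close()

    async def crawl(self):
        ...  # use self._session here


async def main():
    async with Crawler() as crawler:
        await crawler.crawl()

asyncio.run(main())
```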

@yjqiang
Author

yjqiang commented Feb 14, 2019

But I have a problem. I must create an object in the main thread and share it with another thread (but don't worry, I use asyncio.run_coroutine_threadsafe(coro, loop) to make the method calls thread-safe).
Is it OK if I do it like this, i.e. create a thread inside a coroutine function and start it there?

import asyncio
import threading

import aiohttp


class Crawler:
    def __init__(self, roots,
                 exclude=None, strict=True,  # What to crawl.
                 max_redirect=10, max_tries=4,  # Per-url limits.
                 max_tasks=10, *, loop=None):
        ...
        self._session = aiohttp.ClientSession()

    async def close(self):
        await self._session.close()


async def main():
    crawler = Crawler()
    loop = asyncio.get_running_loop()
    console_thread = threading.Thread(target=BiliConsole(loop, crawler).execute)
    console_thread.start()
    try:
        await crawler.crawl()
    finally:
        await crawler.close()

asyncio.run(main())
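The run_coroutine_threadsafe handoff mentioned above can be sketched like this; `pause` and `console` are made-up names for illustration, not from the real project:

```python
import asyncio
import threading


async def pause(state: dict) -> str:
    # Runs on the event loop thread, so it is safe to touch loop-owned state.
    state['paused'] = True
    return 'paused'


def console(loop: asyncio.AbstractEventLoop, state: dict) -> None:
    # Runs on the console thread: schedule the coroutine on the loop and
    # block this thread (not the loop) until it finishes.
    future = asyncio.run_coroutine_threadsafe(pause(state), loop)
    print(future.result(timeout=5))  # prints "paused"


async def main() -> None:
    state = {'paused': False}
    loop = asyncio.get_running_loop()
    thread = threading.Thread(target=console, args=(loop, state))
    thread.start()
    # Give the loop a chance to run the scheduled coroutine.
    await asyncio.sleep(0.2)
    # join() blocks, so hand it to the default executor.
    await loop.run_in_executor(None, thread.join)
    assert state['paused']

asyncio.run(main())
```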

@asvetlov
Member

It can work (I don't know what BiliConsole is, though).
One obvious inaccuracy: the console_thread.join() call is missing.
You cannot call it inside async def main() because join() is a blocking call.
Your code structure doesn't look like it's in ideal shape, sorry.
It may work well enough for you, but the code has a bad smell.
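If the join really must happen from async code, one workaround (a sketch, not something proposed in this thread) is to push the blocking join() onto the default executor so the event loop keeps running:

```python
import asyncio
import threading
import time


def console_worker() -> None:
    # Stand-in for BiliConsole(...).execute; just blocks for a moment.
    time.sleep(0.2)


async def main() -> None:
    console_thread = threading.Thread(target=console_worker)
    console_thread.start()
    # join() would block the event loop if called directly; running it in
    # the default executor lets other tasks make progress while we wait.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, console_thread.join)
    assert not console_thread.is_alive()

asyncio.run(main())
```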

@yjqiang
Author

yjqiang commented Feb 14, 2019

> It can work (I don't know what is BiliConsole though).
> Obvious non-accurate point is missing console_thread.join() call.
> You cannot do it inside async def main() because join() is a blocking call.
> Looks like your code structure is not in ideal shape, sorry.
> It can work pretty well for you but the code has a bad smell.

Yeah, so can you help me make it better? It has confused me for a week.

@yjqiang
Author

yjqiang commented Feb 14, 2019

Or this?

import asyncio
import threading

import aiohttp


class Crawler:
    def __init__(self, roots,
                 exclude=None, strict=True,  # What to crawl.
                 max_redirect=10, max_tries=4,  # Per-url limits.
                 max_tasks=10, *, loop=None):
        ...
        self._session = aiohttp.ClientSession()

    async def close(self):
        await self._session.close()


async def create():
    return Crawler()


loop = asyncio.get_event_loop()
crawler = loop.run_until_complete(create())
console_thread = threading.Thread(target=BiliConsole(loop, crawler).execute)
console_thread.start()


async def main():
    try:
        await crawler.crawl()
    finally:
        await crawler.close()

# Run on the same loop the crawler was created on; asyncio.run() would
# start a fresh event loop.
loop.run_until_complete(main())

@yjqiang
Author

yjqiang commented Feb 17, 2019

> It can work (I don't know what is BiliConsole though).
> Obvious non-accurate point is missing console_thread.join() call.
> You cannot do it inside async def main() because join() is a blocking call.
> Looks like your code structure is not in ideal shape, sorry.
> It can work pretty well for you but the code has a bad smell.

I still can't make it work. I must create an object in the main thread and share it with another thread (I type input in that thread to control the tasks running in the main thread). Is there a good way to do this? I have to figure it out, so please help me.

@yjqiang
Author

yjqiang commented Feb 18, 2019

@asvetlov I think I made it work. Would you like to have a look and give me some advice about it?

import asyncio

import aiohttp

# Crawler is the class defined in the earlier comments.


async def run(future: asyncio.Future):
    crawler = Crawler()
    future.set_result(crawler)
    print('created the crawler')

    # run_tasks …
    await asyncio.sleep(5)

    await crawler.close()
    print('DONE')


async def get_object():
    future = asyncio.Future()
    asyncio.ensure_future(run(future))
    await future
    return future.result()


loop = asyncio.get_event_loop()

crawler = loop.run_until_complete(get_object())

# use the crawler to initialize another thread and start it …

# keep run(future: asyncio.Future) running
loop.run_forever()

@yjqiang
Author

yjqiang commented Feb 23, 2019

@asvetlov Hi, would you like to check my solution and help me to improve?
#3604 (comment)

@samuelcolvin
Member

@yjqiang neither @asvetlov nor anyone else here can be expected to review your code like this.

It looks broadly okay; beyond that, it's up to you to write and review your own code.

@yjqiang
Author

yjqiang commented Feb 25, 2019

Maybe this is the best way I can find. Anyway, thank you, @samuelcolvin and @asvetlov.

The lock bot added the outdated label on Feb 28, 2020.
The lock bot locked the thread as resolved and limited conversation to collaborators on Feb 28, 2020.