Convert some views to native async. #2720
Conversation
This is probably ready for an initial review; I migrated most of the celery codepaths in the
Thanks. I'll put it on my backlog.
Seems like switching to gevent is a smaller move. If we do that in #2701, can we close this?
I can spin a minimal version of the ASGI migration without migrating the views (migrating the views can be done later as needed).
I'm thinking this should have better performance/reliability, since it avoids monkey patching and instead natively implements asynchronous IO handling (and uses a thread pool as a fallback for non-native asyncio functions).
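As a rough sketch of what that fallback looks like in practice (the view and helper names here are illustrative, not code from this PR), native asyncio calls run directly on the event loop, while blocking helpers get pushed to a thread pool via asgiref:

from asgiref.sync import sync_to_async
from django.http import JsonResponse

def blocking_lookup(pk):
    # Hypothetical helper with no asyncio support; it would block the event
    # loop if called directly in an async view, so it runs in a thread pool.
    ...

async def my_view(request, pk):
    result = await sync_to_async(blocking_lookup)(pk)
    return JsonResponse({"result": result})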
Great, so #2701 can be closed? :)
Yeah, closed that out. There are a number of bugs I came across with gevent (such as compatibility issues related to monkeypatching and to multiprocessing) that don't seem to have good fixes. The minimal version of this PR, which is basically just the WSGI to ASGI server migration without optimizing the views to be native asyncio, is what's in #2783. Hopefully that's a lot easier to review than this PR, which combined the WSGI to ASGI server migration with optimizing some of the views. Once #2783 is merged, the views can then be incrementally migrated to support native asyncio as needed. The nice thing is that ASGI should be fully backwards compatible with the existing sync views, so we can do this on an as-needed basis over time.
rebased
Thanks for keeping this ready to go. I took a quick look through it, but I don't have time to think about it carefully (I'm on vacation for two weeks starting Saturday). I did some reading about async, but I think it'd be good to have @albertisfu look through it as well. A couple questions:
It's basically the PDF file upload codepaths, plus other ones that didn't seem to have interdependencies that would require further conversions.
It seems that
What microservices are you referring to exactly?
I've reviewed this and learned how to utilize asynchronous support to enhance performance on I/O-bound tasks.
Overall this looks good to me. Troller is working well after the changes made here.
I only wrote a question above related to celery tasks that were converted to async functions.
And the microservices that Mike is referring to are these ones:
https://github.com/freelawproject/courtlistener/blob/main/cl/lib/microservice_utils.py
rebased
Any idea how slow we're talking here? Unless we're sure this is going to cause issues, I'd say we can defer it and optimize down the line, since there are multiple potential strategies we can use to deal with it.
Just catching up on the state of play here. I want to re-read the PR because parts of it make me a bit nervous, but I can reply to this now:
Probably maxes out at around 10s, I'd guess? Usually it would take about half a second. Separately, for your retries, could you do it with our retry decorator? It should be tidier than the loop you've got:
Sounds like it should probably be ok without a background task.
It's not really asyncio compatible; it would block the event loop.
By the way, I did look at potentially adding an async variant of this decorator, but I think it's probably best to defer that refactoring for now. The retry mechanism in celery worked, AFAIU, by retrying the entire function, which is similar to what the decorator does. I don't think a retry-the-entire-function decorator is a good alternative to the retry loop approach I used, since we want a more granular retry of a specific code block within a function (i.e., for performance, to avoid unnecessarily repeating queries that come before the code block we want to retry). Using a simple retry loop like I did seems to be the standard approach for implementing granular retries within a code block with asyncio, from what I can tell.
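For illustration, that kind of granular retry loop looks roughly like the following sketch; the function and parameter names are made up rather than taken from this PR:

import asyncio

async def fetch_with_retry(client, url, attempts=3, delay=1.0):
    # Retry only this specific await rather than the whole calling function,
    # so queries that already ran in the caller aren't repeated on a failure.
    for attempt in range(attempts):
        try:
            return await client.get(url)
        except Exception:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(delay)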
I'm sorry, @ttys0dev, can you fix the conflicts on this one and I'll take one more look at it?
rebased
OK, I re-did my whole review to see if I caught anything else, and I found a few more things.
The big picture is that I'm still pretty worried about moving these celery tasks to async, but I'm prepared to be convinced that it'll be OK, and I appreciate you pushing me/this forward despite my and others' misgivings.
It's particularly weird to be moving celery processing tasks to the web tier, for example, instead of our celery tier, but I guess that's OK. There are things I'll have to learn though. Like right now if I want to re-process 1,000 PDFs, I just do:
for pdf in pdfs:
    process_recap_pdf.delay(pdf)
And that creates 1,000 tasks in Celery to get it all done. I guess there's a good way to do this with async too.
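For what it's worth, one plausible async equivalent (a sketch, assuming process_recap_pdf is awaitable after this PR; the wrapper function and concurrency value are made up) is to fan the calls out on the event loop and cap concurrency with a semaphore:

import asyncio

async def reprocess_pdfs(pdfs, concurrency=50):
    # Limit how many PDFs are in flight at once so a large batch doesn't
    # overwhelm the event loop or downstream services.
    semaphore = asyncio.Semaphore(concurrency)

    async def run_one(pdf):
        async with semaphore:
            await process_recap_pdf(pdf)

    await asyncio.gather(*(run_one(pdf) for pdf in pdfs))

# e.g. asyncio.run(reprocess_pdfs(pdfs))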
-     interval_step=10 * 60,
- )
- def process_recap_pdf(self, pk):
+ async def process_recap_pdf(pk):
OK, I'm going to try this one more time, just so you can tell me that, "Yes, Mike, this is fine."
I have two concerns here:
- process_recap_pdf can do OCR, and OCR can take a really long time. The longest is something like a day, but generally it's done within a minute or two. Sometimes it can take a lot longer, though. Given that that's the case, can we count on a client being around to receive a response? And if the client wanders away, could that screw up the processing? In other words, if this method takes an hour and the user throws their laptop into a lake after a few seconds, is that OK?
- We have two types of users for the RECAP upload API: the extension (via browsers) and API users. The API users are used to getting nearly instant responses, and this will make their responses much slower. Won't that screw up the speed at which clients can do uploads via servers, if they don't currently have a multi-threaded way of uploading? That's a big change in how this API works.
- Given that that's the case, can we count on a client being around to receive a response? And if the client wanders away, could that screw up the processing?
I think it should be ok.
In other words, if this method takes an hour and the user throws their laptop into a lake after a few seconds, is that OK?
Probably no worse than the current situation, where processing can fail somewhat randomly even when there is a success response to the initial upload; moving OCR out of the request path and into a background job can always be done if it becomes an issue.
The API users are used to getting nearly instant responses, and this will make their responses much slower.
Well, right now the upload responses are mostly useless AFAIU; they don't accurately indicate whether the request was successful. Having accurate response status codes is going to be something API users are likely to want.
Won't that screw up the speed that clients can do uploads via servers, if they don't currently have a multi-threaded way of uploading?
It depends on the exact implementation of the upload client; keep in mind that not having useful responses is problematic in general for API users.
Yeah, celery seems to be more for background jobs that don't need, say, a success status response returned to the requester.
You could just create a celery task that wraps the asyncio function using
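One way to do that kind of wrapping, assuming asgiref's async_to_sync is the intended bridge (the import paths below are guesses), is roughly:

from asgiref.sync import async_to_sync
from celery import shared_task

from cl.recap.tasks import process_recap_pdf  # assumed import path


@shared_task
def process_recap_pdf_task(pk):
    # Run the native async function to completion from a sync Celery worker.
    return async_to_sync(process_recap_pdf)(pk)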
OK, let's see how this goes. Merging. Thank you for sticking with it for so long. I'm worried about how this will impact our extension users and our API users, but hopefully it'll be OK and a step in the right direction.
We're a bit limited in which handlers can actually use native asyncio until #2703 is merged, since Django 3.2 does not support async ORM operations. This is an alternative approach to #2701.
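Concretely, until async ORM support is available, an async view has to push each ORM call through the thread pool, along these lines (the model and import path are illustrative, not code from this PR):

from asgiref.sync import sync_to_async
from django.http import JsonResponse

from cl.search.models import Docket  # assumed import path


async def docket_detail(request, pk):
    # Django 3.2 has no async ORM, so queries are wrapped to run in a thread.
    docket = await sync_to_async(Docket.objects.get)(pk=pk)
    return JsonResponse({"docket_id": docket.pk})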