Convert some views to native async. #2720

ttys0dev · 2023-05-03T05:14:19Z

We're a bit limited in which handlers can actually use native asyncio until #2703 is merged since django 3.2 does not support async ORM operations. This is an alternative approach to #2701.

ttys0dev · 2023-05-10T16:17:12Z

This is probably ready for an initial review, I migrated most of the celery codepaths in the process_recap_upload view to asyncio. So far I haven't re-added any retry logic as it's a bit unclear what exactly commonly fails and actually needs retries(I also suspect the celery retries are not particularly reliable).

mlissner · 2023-05-10T16:43:35Z

Thanks. I'll put it on my backlog.

mlissner · 2023-05-26T06:12:31Z

Seems like switching to gevent is a smaller move. If we do that in #2701, can we close this?

ttys0dev · 2023-05-26T06:19:52Z

Seems like switching to gevent is a smaller move.

I can spin a minimal version of the ASGI migration without migrating the views(migrating the views can be done later as needed).

If we do that in #2701, can we close this?

l'm thinking this should have better performance/reliability as it avoids monkey patching and instead natively implements asynchronous IO handling(and uses a thread pool as a fallback for non-native asyncio functions).

mlissner · 2023-05-26T06:28:02Z

l'm thinking this should have better performance/reliability

Great, so #2701 can be closed? :)

ttys0dev · 2023-05-26T21:35:44Z

Great, so #2701 can be closed? :)

Yeah, closed that out, there's a number of bugs(such as compatibility issues related to monkeypatching and with multiprocessing) I came across with gevent that don't seem to have good fixes.

The minimal version of this PR which is basically just the WSGI to ASGI server migration part without optimizing the views to be native asyncio is what's in #2783. Hopefully that's a lot easier to review than this which combined the WSGI to ASGU server migration in addition to optimizing some of the views.

Once #2783 is merged the views can then be incrementally migrated to support native asyncio as needed. The nice thing is that ASGI should be fully backwards compatible with the existing sync views so that we can do this on an as needed basis over time.

ttys0dev · 2023-06-04T08:12:13Z

rebased

mlissner · 2023-06-07T05:12:01Z

Thanks for keeping this ready to go. I took a quick look through it, but I don't have time to think about it carefully (I'm on vacation for two weeks starting Saturday). I did some reading about async, but I think it'd be good to have @albertisfu look through it as well.

A couple questions:

How did you pick the views/code you did? It seems sort of random.
What's up with all the type ignoring in the urls.py file?
Why not make all the microservices async instead of giving them the sync_to_async treatment (they seem like a priority)?

ttys0dev · 2023-06-07T06:52:15Z

How did you pick the views/code you did? It seems sort of random.

It's basically pdf file upload codepaths plus other ones that didn't seem to have interdependencies that would require further conversions.

What's up with all the type ignoring in the urls.py file?

It seems that django-stubs doesn't properly handle typing for async views.

Why not make all the microservices async instead of giving them the sync_to_async treatment (they seem like a priority)?

What microservices are you referring to exactly?

albertisfu

I've reviewed this and learned how to utilize asynchronous support to enhance performance on I/O-bound tasks.

Overall this looks good to me. Troller is working well after changes done here.

I only wrote a question above related to celery tasks that were converted to async functions.

And about the microservices that Mike is referring to are these ones:
https://github.com/freelawproject/courtlistener/blob/main/cl/lib/microservice_utils.py

cl/recap/tasks.py

ttys0dev · 2023-07-03T00:26:29Z

rebased

ttys0dev · 2023-07-03T01:00:11Z

Processing and parsing HTML. This is usually fast for most types of objects, but notably can be slow for longer docket and docket history reports.

Any idea how slow we are talking here? I'd say unless we're sure this is going to cause issues then we can defer it and optimize down the line as there are multiple potential strategies we can use to deal with it.

mlissner · 2023-07-07T21:23:38Z

Just catching up on the state of play here. I want to re-read the PR because parts of it make me a bit nervous, but I can reply to this now:

Any idea how slow we are talking here? I'd say unless we're sure this is going to cause issues then we can defer it and optimize down the line as there are multiple potential strategies we can use to deal with it.

Probably maxes out at around 10s, I'd guess? Usually would take about half a second?

Separately, for your retries could you do it with our retry decorator? It should be tidier than the loop you've got: cl.lib.decorators.retry?

ttys0dev · 2023-07-07T21:45:23Z

Probably maxes out at around 10s, I'd guess? Usually would take about half a second?

Sounds like it should probably be ok without a background task.

Separately, for your retries could you do it with our retry decorator? It should be tidier than the loop you've got: cl.lib.decorators.retry?

It's not really asyncio compatible, this would block the event loop.

ttys0dev · 2023-07-08T22:11:53Z

Separately, for your retries could you do it with our retry decorator? It should be tidier than the loop you've got: cl.lib.decorators.retry?

By the way I did look at potentially adding an async variant of this decorator, however I think it's probably best to defer refactoring this at the moment. The retry mechanism with celery worked AFAIU by retrying the entire function, which is similar to cl.lib.decorators.retry which also retries the entire function.

I don't think a retry the entire function decorator is good alternative to the retry loop approach I used as we do want to do a more granular retry operation on a specific codeblock within a function(ie for performance to avoid unnecessarily repeating queries that come before the codeblock we want to retry).

Using a simple retry loop like I did seems to be the standard approach for implementing granular retries with asyncio within a codeblock from what I can tell.

mlissner · 2023-07-18T20:41:23Z

I'm sorry, @ttys0dev can you fix the conflicts on this one and I'll take one more look at it?

ttys0dev · 2023-07-19T03:51:36Z

rebased

mlissner

OK, I re-did my whole review to see if I caught anything else, and I found a few more things.

The big picture is that I'm still pretty worried about moving these celery tasks to async, but I'm prepared to be convinced that it'll be OK, and I appreciate you pushing me/this forward despite my and others misgivings.

It's particularly weird to be moving celery processing tasks to the web tier, for example, instead of our celery tier, but I guess that's OK. There are things I'll have to learn though. Like right now if I want to re-process 1,000 PDFs, I just do:

for pdf in pdfs:
    process_recap_pdf.delay(pdf)

And that creates 1,000 tasks in Celery to get it all done. I guess there's a good way to do this with async too.

cl/lib/thumbnails.py

mlissner · 2023-07-19T19:10:13Z

cl/recap/tasks.py

-    interval_step=10 * 60,
-)
-def process_recap_pdf(self, pk):
+async def process_recap_pdf(pk):


OK, I'm going to try this one more time, just so you can tell me that, "Yes, Mike, this is fine."

I have two concerns here:

process_recap_pdf can do OCR and OCR can take a really long time. The longest is something like a day, but generally it's done within a minute or two. Sometimes it can take a lot longer though.

Given that that's the case, can we count on a client being around to receive a response? And if the client wanders away, could that screw up the processing?

In other words, if this method takes an hour and the user throws their laptop into a lake after a few seconds, is that OK?

We have two types of users for the RECAP upload API. The extension (via browsers), and API users. The API users are used to getting nearly instant responses, and this will make their responses much slower. Won't that screw up the speed that clients can do uploads via servers, if they don't currently have a multi-threaded way of uploading? That's a big change in how this API works.

Given that that's the case, can we count on a client being around to receive a response? And if the client wanders away, could that screw up the processing?

I think it should be ok.

In other words, if this method takes an hour and the user throws their laptop into a lake after a few seconds, is that OK?

Probably no worse than the current situation where processing can fail somewhat randomly even when there is a success response to the initial upload, moving OCR out of the request path and into background job can always be done if it's an issue.

The API users are used to getting nearly instant responses, and this will make their responses much slower.

Well right now the upload responses are mostly useless AFAIU, they don't accurately indicate if the request was successful. Having accurate response status codes is going to be something API users are likely to want.

Won't that screw up the speed that clients can do uploads via servers, if they don't currently have a multi-threaded way of uploading?

It depends on the exact implementation of the upload client, keep in mind not having useful responses is problematic in general for API users.

cl/recap/tasks.py

ttys0dev · 2023-07-19T20:27:06Z

It's particularly weird to be moving celery processing tasks to the web tier, for example, instead of our celery tier, but I guess that's OK.

Yeah, so celery seems to be more for background jobs that don't need say a success status response to be returned to the requester.

And that creates 1,000 tasks in Celery to get it all done. I guess there's a good way to do this with async too.

You could just create a celery task that wraps the asyncio function using async_to_sync if you aren't running that through the normal web stack.

mlissner · 2023-07-20T13:18:46Z

OK, let's see how this goes. Merging. Thank you for sticking with it for so long. I'm worried about this will impact our extension users and our API users, but hopefully it'll be OK and a step in the right direction.

ttys0dev force-pushed the asgi branch 6 times, most recently from 3fe0584 to 3bd4184 Compare May 8, 2023 22:26

ttys0dev force-pushed the asgi branch 4 times, most recently from 2ca979a to a816fc5 Compare May 10, 2023 09:07

ttys0dev mentioned this pull request May 15, 2023

Various gunicorn/timeout related upload failure bugs. #2640

Closed

ttys0dev force-pushed the asgi branch from a816fc5 to 5f9e71f Compare May 15, 2023 22:37

ttys0dev mentioned this pull request May 26, 2023

Migrate from WSGI to ASGI #2783

Merged

ttys0dev force-pushed the asgi branch 2 times, most recently from 88bf61d to 1a06a17 Compare June 4, 2023 07:54

ttys0dev changed the title ~~Migrate from WSGI to ASGI~~ Convert some views to native async. Jun 4, 2023

ttys0dev force-pushed the asgi branch 2 times, most recently from 7e6e229 to 375857a Compare June 4, 2023 07:58

ttys0dev force-pushed the asgi branch from 375857a to fef689c Compare June 7, 2023 03:45

albertisfu self-requested a review June 7, 2023 17:52

albertisfu reviewed Jun 7, 2023

View reviewed changes

cl/recap/tasks.py Show resolved Hide resolved

ttys0dev force-pushed the asgi branch 7 times, most recently from 0f99c3a to 12c0e49 Compare June 9, 2023 00:56

ttys0dev force-pushed the asgi branch from 12c0e49 to 585abe8 Compare June 16, 2023 01:08

ttys0dev force-pushed the asgi branch from 585abe8 to 35d777b Compare July 3, 2023 00:05

ttys0dev force-pushed the asgi branch from 35d777b to 27f26c0 Compare July 6, 2023 01:02

ttys0dev force-pushed the asgi branch from 27f26c0 to 91d5cfa Compare July 7, 2023 21:38

ttys0dev force-pushed the asgi branch from 91d5cfa to 825fe96 Compare July 19, 2023 03:33

mlissner requested changes Jul 19, 2023

View reviewed changes

mlissner mentioned this pull request Jul 19, 2023

build(deps-dev): bump word-wrap from 1.2.3 to 1.2.4 in /cl #2915

Closed

ttys0dev force-pushed the asgi branch from 825fe96 to 97ed9d8 Compare July 19, 2023 20:31

Convert some views to native async.

8eaba3f

ttys0dev force-pushed the asgi branch from 97ed9d8 to 8eaba3f Compare July 19, 2023 21:52

mlissner approved these changes Jul 20, 2023

View reviewed changes

mlissner merged commit 8bfa694 into freelawproject:main Jul 20, 2023

ttys0dev deleted the asgi branch July 20, 2023 14:10

mlissner mentioned this pull request Jul 28, 2023

OCR must not be in the RECAP upload path #2963

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert some views to native async. #2720

Convert some views to native async. #2720

ttys0dev commented May 3, 2023

ttys0dev commented May 10, 2023

mlissner commented May 10, 2023

mlissner commented May 26, 2023

ttys0dev commented May 26, 2023

mlissner commented May 26, 2023

ttys0dev commented May 26, 2023

ttys0dev commented Jun 4, 2023

mlissner commented Jun 7, 2023 •

edited

Loading

ttys0dev commented Jun 7, 2023

albertisfu left a comment

ttys0dev commented Jul 3, 2023

ttys0dev commented Jul 3, 2023

mlissner commented Jul 7, 2023

ttys0dev commented Jul 7, 2023

ttys0dev commented Jul 8, 2023

mlissner commented Jul 18, 2023

ttys0dev commented Jul 19, 2023

mlissner left a comment

mlissner Jul 19, 2023

ttys0dev Jul 19, 2023

ttys0dev commented Jul 19, 2023

mlissner commented Jul 20, 2023

Convert some views to native async. #2720

Convert some views to native async. #2720

Conversation

ttys0dev commented May 3, 2023

ttys0dev commented May 10, 2023

mlissner commented May 10, 2023

mlissner commented May 26, 2023

ttys0dev commented May 26, 2023

mlissner commented May 26, 2023

ttys0dev commented May 26, 2023

ttys0dev commented Jun 4, 2023

mlissner commented Jun 7, 2023 • edited Loading

ttys0dev commented Jun 7, 2023

albertisfu left a comment

Choose a reason for hiding this comment

ttys0dev commented Jul 3, 2023

ttys0dev commented Jul 3, 2023

mlissner commented Jul 7, 2023

ttys0dev commented Jul 7, 2023

ttys0dev commented Jul 8, 2023

mlissner commented Jul 18, 2023

ttys0dev commented Jul 19, 2023

mlissner left a comment

Choose a reason for hiding this comment

mlissner Jul 19, 2023

Choose a reason for hiding this comment

ttys0dev Jul 19, 2023

Choose a reason for hiding this comment

ttys0dev commented Jul 19, 2023

mlissner commented Jul 20, 2023

mlissner commented Jun 7, 2023 •

edited

Loading