optimize reddit/PRAW #1021
Comments
Found the culprit for the comment handler. We fetch the post so we can do original post discovery (OPD), and we also fetch comments (with
Relevant commits:
for #1021, avoids massive inefficiency with big Reddit threads. uses new granary OPTIMIZED_COMMENTS constant from snarfed/granary@6cf8dd9.
this last change, 241a4da, had a big impact. we shouldn't have been fetching replies to posts that just link to the user's site; now we don't. no more
i should probably also find someplace to put logic to drop those replies even if they do get fetched, since they shouldn't result in wms. i still want to get rid of the duplicate user fetches though!
...when fetching multiple comments or posts from the same author. snarfed/bridgy#1021
added user profile API request caching in snarfed/granary@85ccb5a. pretty happy with reddit now, i'm going to close this.
Our use of the Reddit API, via PRAW, is pretty badly inefficient right now. Not usually a big deal, but got a lot more noticeable recently due to https://brid.gy/reddit/lgats, which backfeeds a few big sites that get linked and mentioned on Reddit a lot: https://sec.report/, https://uspto.report/, https://lei.report/, https://fccid.io/.
Two obvious problems right now. First, PRAW repeats user lookups a lot. Example poll log:
E.g., in that poll, https://oauth.reddit.com/user/gst/about/ was fetched five times.
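One way to avoid the repeated lookups is to memoize profile fetches by username for the duration of a poll. This is only a minimal sketch under assumptions: `CachedProfileFetcher` and `fake_fetch` are hypothetical names, and the fetch callable stands in for whatever actually hits the Reddit API. It is not necessarily how granary's caching works.

```python
from typing import Callable, Dict


class CachedProfileFetcher:
    """Memoizes profile lookups by username (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], dict]):
        # `fetch` is assumed to perform the real API request for one user.
        self._fetch = fetch
        self._cache: Dict[str, dict] = {}

    def get(self, username: str) -> dict:
        # Only call through to the API the first time we see this username.
        if username not in self._cache:
            self._cache[username] = self._fetch(username)
        return self._cache[username]


# Stand-in fetcher that records each "request" instead of using the network.
calls = []


def fake_fetch(username: str) -> dict:
    calls.append(username)
    return {"name": username}


fetcher = CachedProfileFetcher(fake_fetch)
for _ in range(5):
    fetcher.get("gst")

print(len(calls))  # 1: five lookups, one underlying fetch
```

Scoping the cache to a single poll keeps it simple: there is no expiry to manage, and stale profile data can't outlive the request that needed it.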
Second, we're fetching every comment in big threads, even in source mf2 handlers for individual comments, which seems excessive. Example log from /comment/reddit/lgats/m74e3g/gr9rmye: That poll includes over a dozen requests to https://oauth.reddit.com/api/morechildren/, many of which ask for >800 children each.

PRAW does some caching, but evidently not enough, or we haven't configured it correctly. Related issues: praw-dev/praw#1140, praw-dev/praw#131. And from praw-dev/praw#627: "Also the client side cache has been removed in PRAW4." (We're on PRAW 7.)
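For reference, the morechildren requests come from expanding PRAW's MoreComments placeholders. A rough sketch of skipping that expansion, assuming `submission` behaves like a PRAW Submission; the helper name `comments_without_expansion` is made up for illustration, and this is not necessarily the approach bridgy or granary ended up using:

```python
def comments_without_expansion(submission):
    """Return only the comments that came back with the submission itself.

    Assumes `submission.comments` is a PRAW-style comment forest.
    replace_more(limit=0) removes the MoreComments placeholders without
    making any additional requests, so nothing hits /api/morechildren/.
    """
    submission.comments.replace_more(limit=0)
    # list() flattens the forest into a single list of loaded comments.
    return submission.comments.list()
```

The tradeoff is that deeply nested or collapsed replies in big threads are silently dropped, which is fine when only a single comment or the post itself is needed.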