Documentation of JSON config loads extremely slow (> 10s) #178
Comments
This is a known issue. There was a major rewrite done recently to the backend which collects the module documentation, but there were some performance oversights. @mholt is busy now with other work, but will get back to this at some point.
On the bright side, removing the caching fixed a lot of problems.
Yeah, I agree. I haven't had time to get around to a loading indicator, but the website is open source (obviously), so anyone is welcome to contribute one. I'm just a bit picky: it has to look good, work correctly, and not be too complicated.
I also had bad UX when looking at the JSON docs a few times in the past week or so. Initially, as someone who hadn't seen the docs before, I assumed I was viewing an undocumented/empty page, and the image above also made me wonder if I was looking in the wrong place. It was only by chance, when I left one of the doc pages open while I went to get a coffee, that I came back and realized it was just extremely slow at loading the docs content.

You don't need any fancy loading indicator, just something like "Loading docs..." or a similar text placeholder that gets replaced. At least then the user knows that the actual content isn't quite there yet and that something may be broken with loading it.

Is the retrieved content not static? I'm curious what change on the server resulted in this behavior (I assume this problematic behaviour isn't in the open-source part of the website?). For me it is a 15-second TTFB (via browser or curl). There are no client caching headers in the response either? (A short duration surely wouldn't hurt?)
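For reference, the TTFB can be measured outside the browser too; the small Go program below is a throwaway sketch (not part of the site's tooling; curl works just as well) that reports how long the first byte of the docs API response takes to arrive:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	url := "https://caddyserver.com/api/docs/config/apps/http/"

	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		panic(err)
	}

	// Record the moment the first response byte arrives.
	var firstByte time.Time
	trace := &httptrace.ClientTrace{
		GotFirstResponseByte: func() { firstByte = time.Now() },
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	start := time.Now()
	resp, err := http.DefaultTransport.RoundTrip(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	fmt.Printf("status: %s, TTFB: %v\n", resp.Status, firstByte.Sub(start))
}
```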
No, it's the result of some complex SQL queries. The database is built up by some currently closed-source tooling that @mholt wrote to collect information on Caddy modules. When the rewrite was done, there were some performance oversights. Again, it's a time issue at this point, in more ways than one. This will get fixed when Matt has time to dedicate to that tooling again.
Those are being run for every request to the same resource? No server-side cache? A Caddy cache module could cover that, and setting at least a short-lived Cache-Control header on the response wouldn't hurt either.
I understand this, hence the two approaches suggested above. These could sit in the current Caddyfile of this repo perhaps? At the very least the Cache-Control header, if the cache module is too much to take on right now.
Currently no. Because time. Please have patience. We're very well aware of the issue and the possible mitigations, but Matt's a very busy guy and he's not had time to spend on this.
These can be optimized but it's not a huge priority right now. Could bump it up if some significant sponsors or enterprise customers need it, but haven't heard anything from them. I'm mostly busy preparing my presentation for the API Platform Conference in a couple weeks.
The project Caddyfile could be modified to include one of the mentioned cache modules. I could create a PR for that, but I'm not familiar with your deployment process or whether adding a module to your build is feasible.

I don't see this as a costly time fix, especially if someone can provide a PR for it. The actual problem with the SQL queries further down can always be addressed at another time if appropriate. If you're interested in this approach, I'm happy to contribute some time towards it.
(Caddyfile, line 17 at commit a7861e9)
Can't we just place a Cache-Control header directive there? I know it's not server caching, so it won't help with first page visits, but on subsequent visits (for as long as the cache duration is set) it will provide a better UX to those revisiting pages.
Would you like a PR for that?
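Purely as an illustration of the header itself (the backend's language and code aren't known here, so everything below is hypothetical, including the max-age value), the application-layer version of the same idea would look roughly like this in Go:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// withCacheControl wraps a handler so browsers may reuse a docs response
// for a short window instead of re-fetching it on every page view.
func withCacheControl(maxAge time.Duration, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control",
			fmt.Sprintf("public, max-age=%d", int(maxAge.Seconds())))
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Placeholder handler standing in for the real docs API.
	docs := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"example": "docs payload"}`))
	})

	http.Handle("/api/docs/", withCacheControl(time.Hour, docs))
	http.ListenAndServe(":8080", nil)
}
```

The Caddyfile proposal above achieves the same thing at the server layer, without touching the backend at all.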
No problem, no rush :)
A website that takes more than 5s to load should be considered offline. This should be a priority right now. Outdated documentation is much better than no documentation.
We're doing our best within our capacity. We're a limited team with only so much time and energy to spare. The focus has been on the main repo, in addition to the backoffice portal for module devs. Contributions are welcome if you're willing to take a shot at it!
TL;DR: Not sure how we're meant to take a shot at it, when current proposals have not received actionable responses, and the issue itself AFAIK is closed source?
I thought the issue was due to an API request to a backend that queries a DB, which we don't have access to? How are we meant to take a shot at it?

I already provided a request/discussion regarding client cache headers so that we can at least only suffer the initial delay. The other suggestion was to add a Caddy HTTP cache module via the Caddyfile.

I mean, I guess we could implement a crawler for all the pages to collect the JSON responses and run that on a scheduled job to keep refreshing whenever changes have happened, e.g. run each day. Then static JSON can be served when the backend takes too long to respond. I don't see that approach being accepted by the maintainers here though. It'd also presumably be easier on their end to just query for each page at deployment time, and again do some recurring refresh job. I honestly don't see the docs changing frequently enough for this not to be a valid approach.

Better yet, have the backend store the response (in the DB if preferable) and send that stored response for the relevant query it receives, then make the DB query that's currently done (presumably redundantly most of the time) and update the response that will be served the next time it's queried. Again, fast responses, with the only drawback being that the first person to query that data after it's updated gets stale data.
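To make the "store the response and refresh it after serving" idea concrete, here is a minimal sketch of a stale-while-revalidate style cache. It assumes a Go backend, which is not confirmed anywhere in this thread, and every name in it is hypothetical:

```go
// Package docscache sketches a memoized store for rendered JSON docs.
package docscache

import (
	"sync"
	"time"
)

type entry struct {
	body    []byte
	fetched time.Time
}

// Cache returns whatever copy it has (possibly stale) and refreshes it in
// the background, so only the very first request for a path pays the full
// cost of the slow DB-backed render.
type Cache struct {
	mu      sync.RWMutex
	entries map[string]entry
	render  func(path string) ([]byte, error) // the expensive renderer
	maxAge  time.Duration
}

func New(render func(string) ([]byte, error), maxAge time.Duration) *Cache {
	return &Cache{
		entries: make(map[string]entry),
		render:  render,
		maxAge:  maxAge,
	}
}

func (c *Cache) Get(path string) ([]byte, error) {
	c.mu.RLock()
	e, ok := c.entries[path]
	c.mu.RUnlock()

	if ok {
		if time.Since(e.fetched) > c.maxAge {
			go c.refresh(path) // serve the old copy now, refresh async
		}
		return e.body, nil
	}

	// First request for this path: pay the cost once, then cache it.
	body, err := c.render(path)
	if err != nil {
		return nil, err
	}
	c.store(path, body)
	return body, nil
}

func (c *Cache) refresh(path string) {
	// A real version would also dedupe concurrent refreshes of one path.
	if body, err := c.render(path); err == nil {
		c.store(path, body)
	}
}

func (c *Cache) store(path string, body []byte) {
	c.mu.Lock()
	c.entries[path] = entry{body: body, fetched: time.Now()}
	c.mu.Unlock()
}
```

A real version would also need invalidation when the doc-collection tooling writes new data, but the shape is the same as the proposal above: fast responses, at the cost of briefly serving stale data.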
I have been working on this, but I'm still stuck trying to export the production DB to my local machine. Recent changes in Cockroach removed the export functionality I was relying on. So please wait while I migrate to a new DB with proper export capabilities.
I finally got a copy of the database set up with Postgres locally and took some baseline timings for loading JSON docs. Today I got the loading time down from 4.9s to 2.9s by adding some indices to a table. Easy win; but most of the day was spent trying to figure out how to set up Postgres and migrate the data over. There are more queries being made than necessary, so I'll try to tackle those next. I won't be touching the production DB until the optimizations are complete, though.
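The real schema isn't public, so the snippet below uses invented table and column names; it only illustrates the kind of index addition being described, issued through database/sql against Postgres:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // Postgres driver
)

func main() {
	// Connection string and schema names are hypothetical.
	db, err := sql.Open("postgres", "postgres://localhost/caddy_docs?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// If the recursive doc lookups filter on a column with no index,
	// every lookup is a sequential scan; an index turns it into a cheap
	// lookup. CONCURRENTLY avoids locking the table while it builds.
	_, err = db.Exec(`CREATE INDEX CONCURRENTLY IF NOT EXISTS
		idx_module_docs_parent ON module_docs (parent_id)`)
	if err != nil {
		log.Fatal(err)
	}
}
```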
Got the average worst case down to < 900ms by eliminating unnecessary queries during recursion (i.e. amortizing). I also rewrote some queries to avoid subqueries.

(As for "worst case": the case gets worse the deeper into the JSON structure you traverse. There is a limit, but it's kind of arbitrary, so it's hard to say what the actual "worst case" is.)

Most page loads happen for me in < 200ms. (The starting load time for the same page was 4.9s.) Of course this will be slower on the production server, which doesn't have the specs my workstation does, but the speed improvements should be rather drastic anyway. I'm guessing about 75% faster in the "worst case" and about 96% faster on average.

There is room for more improvement, but I think these are good enough to push once I get the production DB migrated over to Postgres. I hope the speed changes carry over to the production system.
Probably introduced some new bugs while I was at it, but at a glance I didn't notice anything glaring.
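A rough sketch of what that amortization can look like (again with invented table and column names): fetch the rows once, then assemble the tree in memory so the number of queries no longer grows with the depth of the JSON structure.

```go
// Package docsload sketches loading the module-docs tree in one query.
package docsload

import "database/sql"

// node is a hypothetical row in the module docs tree.
type node struct {
	ID       int
	ParentID sql.NullInt64
	Name     string
	Children []*node
}

// loadTree fetches every row once and stitches the tree together in
// memory, instead of issuing one query per level of recursion.
func loadTree(db *sql.DB) (map[int]*node, error) {
	rows, err := db.Query(`SELECT id, parent_id, name FROM module_docs`)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	byID := make(map[int]*node)
	var all []*node
	for rows.Next() {
		n := &node{}
		if err := rows.Scan(&n.ID, &n.ParentID, &n.Name); err != nil {
			return nil, err
		}
		byID[n.ID] = n
		all = append(all, n)
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}

	// Single in-memory pass to attach children to parents.
	for _, n := range all {
		if n.ParentID.Valid {
			if p, ok := byID[int(n.ParentID.Int64)]; ok {
				p.Children = append(p.Children, n)
			}
		}
	}
	return byID, nil
}
```

Postgres's recursive CTEs (WITH RECURSIVE) are another way to collapse the per-level queries into one, if the work is better kept in SQL.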
Wow! Good work! These are some quick wins that could be achieved, augmenting your success, without slogging too much more through SQL:
Using just 3 || 4 means that response times could be a few ms across the board for docs, without much cost aside from a small memory footprint. Augment that with cache headers.

The gist is that this appears to be fairly static data that's being generated on the fly at significant cost, so why not cache it?
Yeah, good ideas @douglasg14b.

3-5 I'm not really interested in doing; at least not now.
To be clear, number 6 is a materialized view in Postgres, not a change in site CSS, which wouldn't have any bearing on this problem. Number 3 || 4 would be best tackled by someone familiar with the API environment, but would probably provide the biggest improvement.

I'm on my phone so this is kind of difficult to ascertain, but do you know what lang/framework is used for the backend? Congrats on making such progress on day 3 of Postgres 😮
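For anyone else unfamiliar with them: a materialized view in Postgres precomputes an expensive query and stores the result like a table, which can be refreshed whenever the underlying data changes. The statements below are illustrative only, since the real schema isn't public:

```go
// Package docsview sketches how a materialized view could precompute the
// expensive doc-tree query. All identifiers are hypothetical.
package docsview

import "database/sql"

const (
	// Build the view once; Postgres stores the result like a table.
	createDocsView = `
		CREATE MATERIALIZED VIEW IF NOT EXISTS module_docs_flat AS
		SELECT d.id, d.parent_id, d.name, d.doc
		FROM module_docs d`

	// Re-run the stored query and swap in the fresh result.
	refreshDocsView = `REFRESH MATERIALIZED VIEW module_docs_flat`
)

// RefreshDocs would be called whenever the doc-collection tooling updates
// the underlying tables (e.g. from a cron job or at the end of an import).
func RefreshDocs(db *sql.DB) error {
	if _, err := db.Exec(createDocsView); err != nil {
		return err
	}
	_, err := db.Exec(refreshDocsView)
	return err
}
```

REFRESH MATERIALIZED VIEW CONCURRENTLY is also available if reads shouldn't block during a refresh, though it requires a unique index on the view.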
Oh, I see; I hadn't heard of materialized views before.

I've finished migrating the production DB to Postgres and have deployed the optimizations in the code. I'm observing about a 98% speedup over the worst 10s (or higher) load-time offenders. For example, this loads in about 400ms for me: https://caddyserver.com/docs/json/apps/http/servers/routes/handle/reverse_proxy/

I'll see about adding a Cache-Control header next, as that's a really easy win as well. (Update: done)

Still room for improvement, but for now that will have to do, so I can work on other things. The community is welcome to contribute a loading indicator.
When you go to the documentation on the website and open 'JSON config structure', every page loads very slowly. I've had this problem for over a month; I thought it would be a temporary issue, so I never reported it.
Example:
Visit https://caddyserver.com/docs/json/apps/http/ in any browser
I have tried this on macOS 11.4, with Safari 14.1.1 and the latest Chrome / Firefox.
If you check the network tab in the developer tools, you can see that the following request is blocked for almost 10 seconds:
https://caddyserver.com/api/docs/config/apps/http/
So this is probably a server / config related issue.