
Documentation of JSON config loads extremely slow (> 10s) #178

Closed
SimonBackx opened this issue Jun 30, 2021 · 20 comments
Labels
bug Something isn't working

Comments

@SimonBackx

When you go to the documentation on the website and open 'JSON config structure', every page loads very slowly. I've had this problem for over a month; I thought it would be a temporary issue, so I never reported it.

Example:
Visit https://caddyserver.com/docs/json/apps/http/ in any browser

I have tried on macOS 11.4, with Safari 14.1.1 and the latest Chrome / Firefox.

If you check the network tab in the developer tools, you can see that the following request is blocked for almost 10 seconds:
https://caddyserver.com/api/docs/config/apps/http/

So this is probably a server- or config-related issue.

@francislavoie added the bug label Jun 30, 2021
@francislavoie
Member

This is a known issue.

The backend that collects the module documentation recently went through a major rewrite, and there were some performance oversights.

@mholt is busy now with other work, but will get back to this at some point.

@mholt
Member

mholt commented Jun 30, 2021

On the bright side, removing that caching fixed a lot of problems.

@douglasg14b

I would suggest a loading indicator. Long loads happen and may take some time to resolve, but on the UX side you can reduce user frustration with some indication that the page is loading, as opposed to broken or missing.

This message top-center makes it seem like I went to the wrong place :)

[screenshot of the message shown at the top of the docs page]

@mholt
Member

mholt commented Aug 19, 2021

Yeah, I agree. I haven't had time to get around to a loading indicator, but the website is open source (obviously), so anyone is welcome to contribute one. I'm just a bit picky: it has to look good, work correctly, and not be too complicated.

@polarathene

I also had a bad experience with the JSON docs a few times in the past week or so. As someone who hadn't seen the docs before, I initially assumed I was viewing an undocumented/empty page, and the screenshot above also made me wonder if I was looking in the wrong place.

It was only by chance, when I left one of the doc pages open while I went to get a coffee, that I came back and realized it was just extremely slow to load the docs content.

You don't need a fancy loading indicator, just something like a "Loading docs..." placeholder text that gets replaced. At least then the user knows there is actual content on the way, or that something may be broken with loading it.


Is the retrieved content not static?

I'm curious what change on the server resulted in this behavior (I assume this problematic behaviour isn't in the open source part of the website?). For me it is a 15 second TTFB (via browser or curl).

There are no client caching headers in the response either? (A short cache duration surely wouldn't hurt?)

@francislavoie
Member

Is the retrieved content not static?

No, it's the result of some complex SQL queries. The database is built up from some currently closed-source tooling that @mholt wrote to collect information on Caddy modules. When the rewrite was done, there were some performance oversights.

Again, it's a time issue at this point, in more ways than one. This will get fixed when Matt has time to dedicate to that tooling again.

@polarathene

No, it's the result of some complex SQL queries.

Are those run for every request to the same resource? No server-side cache? The Caddy cache-handler module or Souin should both be able to cache the JSON responses as a quick/easy workaround, no?

Setting at least a Cache-Control header or similar for the JSON responses, so that clients can cache the results on page reloads, would also be a quick and easy addition. Right now it's 15 seconds to visit the same doc page regardless of when I last visited it, even if that was immediately after the content loaded.

Again, it's a time issue at this point, in more ways than one. This will get fixed when Matt has time to dedicate to that tooling again.

I understand this, hence the two approaches suggested above. These could perhaps sit in the current Caddyfile of this repo? At the very least the Cache-Control header, if building with xcaddy for a server-side caching module is not desirable.

@francislavoie
Member

francislavoie commented Aug 22, 2021

No server-side cache?

Currently no. Because time.

Please have patience.

We're very well aware of the issue and the possible mitigations, but Matt's a very busy guy and he's not had time to spend on this.

@mholt
Member

mholt commented Aug 22, 2021

@polarathene

I'm curious what change on the server resulted in this behavior (I assume this problematic behaviour isn't in the open source part of the website?).

  • Implemented strongly versioned docs, making the queries more computationally expensive (this also fixed some bugs)
  • Removed caching (also fixed some bugs)

These can be optimized, but it's not a huge priority right now. I could bump it up if some significant sponsors or enterprise customers needed it, but I haven't heard anything from them. I'm mostly busy preparing my presentation for the API Platform Conference in a couple of weeks.

@polarathene

polarathene commented Aug 22, 2021

Currently no. Because time.

The project Caddyfile could be modified to include one of the mentioned cache modules. I could create a PR for that, but I'm not familiar with your deployment process, or with whether adding xcaddy to include the module would be a blocker on your end.

I don't see this as a costly fix time-wise, especially if someone can provide a PR for it. The actual problem with the SQL queries further down can always be addressed at another time if appropriate.

If you're interested in this approach, I'm happy to contribute some time towards it.


  • Removed caching (also fixed some bugs)

reverse_proxy /api/* localhost:4444

Can't we just place header Cache-Control "max-age=3600" or similar here? I've tried to apply it only to JSON responses via handle_response, but I think I ran into a bug.

I know it's not server-side caching, so it won't help with first page visits, but for subsequent visits (for as long as the cache duration is set) it will provide a better UX to those revisiting pages.

max-age=300, stale-while-revalidate=604800 would allow a lower max-age if you prefer, and would let the browser reuse the previous cache entry for 1 week while refreshing the cached response in the background (if more than 5 minutes have passed since the prior request) for the next visit. That'd be preferable in the meantime.
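
Something along these lines in this repo's Caddyfile might do it (an untested sketch on my part; the exact matcher and directive placement may need adjusting):

  header /api/* Cache-Control "max-age=300, stale-while-revalidate=604800"
  reverse_proxy /api/* localhost:4444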

Would you like a PR for that?

I'm mostly busy preparing my presentation for the API Platform Conference in a couple weeks.

No problem, no rush :)

@xiruizhao

A website that takes more than 5s to load should be considered offline. This should be a priority right now. Outdated documentation is much better than no documentation.

@mohammed90
Member

A website that takes more than 5s to load should be considered offline. This should be a priority right now. Outdated documentation is much better than no documentation.

We're doing our best within our capacity. We're a limited team with only so much time and energy to spare. The focus has been on the main repo, in addition to the back-office portal for module devs.

Contributions are welcome if you're willing to take a shot at it!

@polarathene

polarathene commented Oct 11, 2021

TL;DR: Not sure how we're meant to take a shot at it when the current proposals have not received actionable responses, and the code at issue is, AFAIK, closed source.

Contributions are welcome if you're willing to take a shot at it!

I thought the issue was due to an API request to a backend that queries a DB, which we don't have access to? How are we meant to take a shot at it?

I already raised a request/discussion regarding client cache headers so that we at least only suffer the initial delay. The other suggestion was to add a Caddy HTTP cache module via xcaddy. Again, no response on that.


I mean, I guess we could implement a crawler that collects the JSON responses for all the pages and run it as a scheduled job to refresh whenever changes happen, e.g. once a day. Then static JSON could be served when the backend takes too long to respond.

I don't see that approach being accepted by the maintainers here, though. It'd presumably also be easier on their end to just query each page at deployment time, plus some recurring refresh job. I honestly don't see the docs changing so frequently that this wouldn't be a valid approach.

Better yet, have the backend store the response (in the DB if preferable) and send it for the relevant query received, then run the DB query that's currently done (presumably redundantly most of the time) and update the response that will be served the next time it's requested. Again, fast responses, with the only drawback being that the first person to request that data after it's updated gets stale data.

@mholt
Member

mholt commented Oct 11, 2021

I have been working on this, but I'm still stuck trying to export the production DB to my local machine. Recent changes in Cockroach removed the dump command, and the BACKUP SQL statement dumps a whole directory onto the server's file system (instead of where the client is running). Sigh. I've been spending hours on it today already.

So please wait while I migrate to a new DB with proper export capabilities.

@mholt
Member

mholt commented Oct 13, 2021

I finally got a copy of the database set up with Postgres locally and took some baseline timings for loading the JSON docs. Today I got the loading time down from 4.9s to 2.9s by adding some indices to a table. An easy win, but most of the day was spent trying to figure out how to set up Postgres and migrate the data over.
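
Roughly speaking (the table and column names here are just placeholders to illustrate), the kind of index that helps looks like:

  -- placeholder names, illustrative only
  CREATE INDEX IF NOT EXISTS module_docs_module_version_idx
      ON module_docs (module_id, version DESC);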

There are more queries being made than necessary, so I'll try to tackle those next.

Won't be touching the production DB until the optimizations are complete though.

@mholt
Member

mholt commented Oct 13, 2021

Got the average worst case down to < 900ms by eliminating unnecessary queries during recursion (i.e. amortizing). I also rewrote some queries that used subqueries to instead use SELECT DISTINCT ON with an ORDER BY clause.
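
The DISTINCT ON pattern looks roughly like this (placeholder table/column names, just to show the shape of the rewrite):

  -- placeholder names, illustrative only: keep the newest row per module
  -- without a correlated subquery
  SELECT DISTINCT ON (module_id) module_id, version, doc_json
  FROM module_docs
  ORDER BY module_id, version DESC;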

(As for "worst case": the case gets worse the deeper into the JSON structure you traverse. There is a limit but it's kinda arbitrary so it's hard to say what the actual "worst case" is.) Most page loads happen for me in < 200ms. (Starting load times for same page was 4.9s.)

Of course this will be slower on the production server, which doesn't have the specs my workstation does, but the speed improvements should be rather drastic anyway. I'm guessing about 75% faster in the "worst case" and about 96% faster on average.

There is room for more improvement, but I think these are good enough to push once I get the production DB migrated over to Postgres. I hope the speed changes carry over to the production system.

In summary:

  • Switched to Postgres
  • Added more indices
  • Rewrote some queries, simplifying them
  • Amortized some recursive code

Probably introduced some new bugs while I was at it, but at a glance I didn't notice anything glaring.

@douglasg14b

douglasg14b commented Oct 13, 2021

Wow! Good work!

Here are some quick wins that could be achieved, augmenting your success, without slogging through much more SQL:

  1. Loading indicator
    • Effort: Easy AF
  2. Cache header
    • Effort: Easy AF
  3. Server-side Request Caching
    • Effort: Easy AF -> Easy (Depending on framework)
  4. Server Side Data Caching
    • Effort: Easy AF -> Medium (Depending on framework)
  5. Generate static HTML/JSON for the docs
    • Effort: Medium -> Hard (Depending on framework)
  6. Materialized View (Probably a dumb suggestion)?
  7. Postgres Query Cache?

Using just 3 or 4 means that response times could be a few ms across the board for the docs, without much cost aside from a small memory footprint. Augment that with cache headers and some 304 Not Modified results, and even the cache will get bored :p. Unfortunately I don't know much about the hosting setup; in ASP.NET Core, caching this sort of stuff is just a couple of lines, but it might be more difficult with whatever language & framework is in use here.

The gist is that this appears to be fairly static data being generated on the fly at significant cost, so why not cache it?

@mholt
Member

mholt commented Oct 13, 2021

Yeah, good ideas @douglasg14b .

  1. Yep, this should be easy; I will see if I can whip one up. This is also something the community can contribute if they have better front-end skillz.
  2. We should probably do this; I'll experiment with setting it up.

3-5 I'm not really interested in doing, at least not now.

  6. This would just slow down the UI more. I've never interacted with a Material UI that was faster than a "normal" UI.
  7. Maybe. I don't know about this; I'm only on day 3 of using Postgres, so I'd have to learn about it.

@douglasg14b

douglasg14b commented Oct 13, 2021

To be clear, number 6 is a materialized view in Postgres, not a change in site CSS, which wouldn't have any bearing on this problem.
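
Roughly, a materialized view precomputes and stores the result of an expensive query so reads are cheap, and you refresh it whenever the underlying data changes (placeholder names again, just to illustrate):

  -- placeholder names, illustrative only
  CREATE MATERIALIZED VIEW module_docs_latest AS
  SELECT DISTINCT ON (module_id) module_id, version, doc_json
  FROM module_docs
  ORDER BY module_id, version DESC;

  -- re-run after the underlying data changes
  REFRESH MATERIALIZED VIEW module_docs_latest;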

Numbers 3 or 4 would be best tackled by someone familiar with the API environment, but would probably provide the biggest improvement. I'm on my phone so this is kind of difficult to ascertain, but do you know what language/framework is used for the backend?

Congrats on making such progress on day 3 of postgres 😮

@mholt
Member

mholt commented Oct 13, 2021

Oh, I see, I hadn't heard of materialized views before.

I've finished migrating the production DB to Postgres and have deployed the code optimizations. I'm observing about a 98% speedup over the worst offenders, which had load times of 10s or more. For example, this loads in about 400ms for me: https://caddyserver.com/docs/json/apps/http/servers/routes/handle/reverse_proxy/

I'll see about adding a Cache-Control header next, as that's a really easy win as well. (Update: done)

Still room for improvement, but for now that will have to do, so I can work on other things. The community is welcome to contribute a loading indicator.

@mholt mholt closed this as completed Oct 13, 2021