Add DB indexes to queried tables #410

thiagosanches · 2020-08-10T16:29:44Z

Hi everyone, good afternoon.

We are using lighthouse-ci and sqlite to save the data. We have a bunch of websites that we've collect the data, but today I started experiencing some slowness, it's impossible to get the data from the /statistics path. I took a look at the database and we have ~23.000 rows on the statistics table. It seems that there is no indexes on it as well. However, not sure if you guys had a huge amount of data before.

Thanks,
Regards.

The text was updated successfully, but these errors were encountered:

patrickhulce · 2020-08-10T19:52:29Z

Thanks for filing @thiagosanches good catch! You're right we never went back to add indexes to the stats table after we added it to the dashboard flow and it came out of experimentation. There are some easy wins there 👍

Even at 23k rows I wouldn't expect there to be much of a problem for sqlite though. Our canary server is running sqlite on average hardware, has ~90k rows in the statistics table, and can handle ~10 concurrent requests with <100ms p90 response time.

Perhaps there's something else afoot in your situation? What specifically is being slow?

thiagosanches · 2020-08-11T14:04:09Z

Hey @patrickhulce thanks for sharing your thoughts. I agree with you and I believe that sqlite should handle well that amount of data.

I got the lhci.db database to test the lhci server on my local instance, to remove any external item that might be affecting the response time, we are using the application deployed on a AWS fargate.

But even locally, it was taking to long to get the data:

Would you mind to share with me the command that you've used, so I can test on my side as well?
Thank you.

patrickhulce · 2020-08-11T14:30:05Z

Is this the first time those statistics are being requested after a version update or is the database being reset somehow?

When a build is uploaded, the statistics are computed and inserted into the statistics table for future fast retrieval. When there's an LHCI version change that requires updating the statistics though (or if something were deleting the statistics rows), all statistics rows are invalidated and they will be recomputed lazily when requested (which can take a very long time), but future requests should be very fast.

Those response times are indicative to me that the statistics are being recomputed.

thiagosanches · 2020-08-11T22:34:09Z

Hi @patrickhulce sorry for the delay...

I have forced the update of the docker image on AWS Fargate and now I'm using the 0.4.3 version, the data (lhci.db) is on a volume and is being preserved so there is no reset on its data. But when I try to access the dashboard it's taking more time to load it, if when I try for the second time:

patrickhulce · 2020-08-12T16:05:32Z

Hm, so I guess everything is just dramatically slower for your instance if the recomputation time is >1.5 minutes and the cached time is ~20 seconds.

I'm not sure what could be causing that. We haven't really seen anything close to that slow for cached fetching of statistics.

I suppose we can add the indexes (which we should do anyway) and see if it goes away, but I'm not sure what the true root cause is there.

thiagosanches · 2020-08-13T12:08:00Z

Hi @patrickhulce I was able to manage in a different way. I was using AWS fargate and it seems that when the application started the metrics calculation the container stopped suddenly, so the calculation has never finished properly. I moved to an AWS EC2 instance and now the metrics are able to be calculated.

It's taking ~11 seconds (the /statistics path on dashboard window), but we have more than 60 websites being analyzed, I believe that it's a huge number.

However, adding an index in the future maybe help to decrease that number!

Thank you for your helping!
Regards.

patrickhulce · 2020-08-18T21:26:14Z

Adding indexes to our testing server improved siege stats ~80% so I hope it improves your situation too @thiagosanches

thiagosanches · 2020-08-18T22:17:51Z

Wow, that's pretty cool @patrickhulce thank you for your effort on that!
As soon as you release it I'll update on my side!

patrickhulce added the needs-more-info label Aug 10, 2020

patrickhulce changed the title ~~Has anyone experienced some slowness on ligthhouse to fetch data?~~ Add DB indexes to queried tables Aug 12, 2020

patrickhulce added P1 and removed needs-more-info labels Aug 12, 2020

patrickhulce self-assigned this Aug 12, 2020

patrickhulce mentioned this issue Aug 18, 2020

feat(server): add indexes on common queries #417

Merged

patrickhulce closed this as completed in #417 Aug 19, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add DB indexes to queried tables #410

Add DB indexes to queried tables #410

thiagosanches commented Aug 10, 2020

patrickhulce commented Aug 10, 2020

thiagosanches commented Aug 11, 2020

patrickhulce commented Aug 11, 2020

thiagosanches commented Aug 11, 2020

patrickhulce commented Aug 12, 2020

thiagosanches commented Aug 13, 2020 •

edited

Loading

patrickhulce commented Aug 18, 2020

thiagosanches commented Aug 18, 2020

Add DB indexes to queried tables #410

Add DB indexes to queried tables #410

Comments

thiagosanches commented Aug 10, 2020

patrickhulce commented Aug 10, 2020

thiagosanches commented Aug 11, 2020

patrickhulce commented Aug 11, 2020

thiagosanches commented Aug 11, 2020

patrickhulce commented Aug 12, 2020

thiagosanches commented Aug 13, 2020 • edited Loading

patrickhulce commented Aug 18, 2020

thiagosanches commented Aug 18, 2020

thiagosanches commented Aug 13, 2020 •

edited

Loading