Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop counting at 10,000 on table view as well #2408

Closed
simonw opened this issue Aug 21, 2024 · 7 comments
Closed

Stop counting at 10,000 on table view as well #2408

simonw opened this issue Aug 21, 2024 · 7 comments

Comments

@simonw
Copy link
Owner

simonw commented Aug 21, 2024

Datasette now stops counting at 10,000 rows (actually 10,001 so it can tell if there are more than 10,000 or 10,000 exactly) every time it displays a list of tables:

This feels so much less sluggish! As such, I'm thinking about applying the same trick to the row count at the top of the table page itself:

CleanShot 2024-08-21 at 15 27 27@2x

@simonw
Copy link
Owner Author

simonw commented Aug 21, 2024

The big problem with this is that asking "how many rows, exactly are in this table" is a completely legitimate question!

But I'm beginning to think Datasette should have a policy that we NEVER scan more than 10,000 rows. Those kind of scans are the root of all of our performance issues against big databases.

There may be a third option: cache the table count so we can show that number, but implement the 10,000 limit for any time we run a filtered query against that page. That way we can still show the total number of rows but we don't get stuck running expensive counts the moment someone applies a large filter.

@simonw
Copy link
Owner Author

simonw commented Aug 21, 2024

Related: sqlite-utils has a mechanism for implementing a cached _counts database table that's updated using triggers: https://sqlite-utils.datasette.io/en/stable/python-api.html#python-api-cached-table-counts

I built this with Datasette in mind but never completely integrated it. Integrating that now would give us accurate table counts for databases that enabled that feature.

@simonw
Copy link
Owner Author

simonw commented Aug 21, 2024

It could look like this:

CleanShot 2024-08-21 at 15 31 30@2x

@simonw
Copy link
Owner Author

simonw commented Aug 21, 2024

One more option: show that 10,000 but offer a button the user can click to calculate that full count:

CleanShot 2024-08-21 at 15 33 53@2x

It could link to /db/-/query?sql=select+count(*)...

Should this work even if they don't have execute-sql permission? Maybe not - I'd be OK with users without that permission being unable to trigger an expensive count via some other mechanism.

Could also have progressive enhancement JavaScript that runs that query via fetch() and updates the number on the page.

@simonw
Copy link
Owner Author

simonw commented Aug 21, 2024

Decision: for the moment I'm going to implement that 10,000 cap along with an enhanced "calculate full total" button which is only visible to execute-sql permission users and uses the API to run the full count.

Any caching tricks will happen in another issue.

@simonw
Copy link
Owner Author

simonw commented Aug 22, 2024

This is pretty neat:

count

@simonw
Copy link
Owner Author

simonw commented Aug 22, 2024

Needs to look a bit better when part of a longer "... where filters" display unit, also needs to handle errors better.

I got it to work like this:

count3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant