-
-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase SQLite3's cache size #1489
Conversation
So at the moment, the cache is not even big enough to hold the 158531 domains from the default adlist (https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts)? |
Yes... |
f62f0f9
to
d10887d
Compare
Maybe we should even increase it a bit more? 24MiB for a maximum of 750000? |
I don't know, it's open for discussion. This is not a global cache but a per-connection (i.e., per-file) cache. So a connection to FTL's database will consume (up to) the same memory as does the gravity database connection. I also thought about making this configurable rather than hard-code anything but then this seemed too much effort given how small the actual performance advantage might be (noone really knows) |
My only concern is about low memory devices, like Raspberry Pi Zero or even Raspberry Pi 1 and 2. Could this be a problem for any of those devices? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if this is per file (gravity.db
and pihole-FTL.db
) we should keep it at 16MiB. The lowest RAM amount on a RPi is 256. So we take already 32 MiB from it.
__
I know it's only logged when debug is enabled, but I think this is logged to often. Esp. because the value does not change during runtime.
[2022-11-17 19:26:33.503 21146/T21162] FTL database cache size is 16.0 MiB
[2022-11-17 19:26:33.504 21146/T21162] Found database host name (same address) 10.0.1.26 -> Google-Pixel-4a.fritz.box
[2022-11-17 19:26:33.504 21146/T21162] Closing FTL database in getNameFromIP() (/__w/FTL/FTL/src/database/network-table.c:2074)
[2022-11-17 19:26:33.505 21146/T21162] Opening FTL database in getIfaceFromIP() (/__w/FTL/FTL/src/database/network-table.c:2153)
[2022-11-17 19:26:33.508 21146/T21162] FTL database cache size is 16.0 MiB
[2022-11-17 19:26:33.508 21146/T21162] getDatabaseHostname(): "SELECT interface FROM network JOIN network_addresses ON network_addresses.network_id = network.id WHERE network_addresses.ip = ? AND interface != 'N/A' AND interface IS NOT NULL;" with ? = "10.0.1.26"
[2022-11-17 19:26:33.509 21146/T21162] Found database interface 10.0.1.26 -> eth0
[2022-11-17 19:26:33.509 21146/T21162] Closing FTL database in getIfaceFromIP() (/__w/FTL/FTL/src/database/network-table.c:2223)
[2022-11-17 19:26:33.511 21146/T21162] Opening FTL database in getMACfromIP() (/__w/FTL/FTL/src/database/network-table.c:1834)
[2022-11-17 19:26:33.514 21146/T21162] FTL database cache size is 16.0 MiB
[2022-11-17 19:26:33.515 21146/T21162] Found database hardware address 10.0.1.57 -> 24:29:34:91:56:42
[2022-11-17 19:26:33.515 21146/T21162] Closing FTL database in getMACfromIP() (/__w/FTL/FTL/src/database/network-table.c:1904)
[2022-11-17 19:26:33.516 21146/T21162] Opening FTL database in getNameFromIP() (/__w/FTL/FTL/src/database/network-table.c:2011)
[2022-11-17 19:26:33.519 21146/T21162] FTL database cache size is 16.0 MiB
[2022-11-17 19:26:33.519 21146/T21162] Found database host name (same address) 10.0.1.57 -> Pixel-6a.fritz.box
[2022-11-17 19:26:33.519 21146/T21162] Closing FTL database in getNameFromIP() (/__w/FTL/FTL/src/database/network-table.c:2074)
[2022-11-17 19:26:33.520 21146/T21162] Opening FTL database in getIfaceFromIP() (/__w/FTL/FTL/src/database/network-table.c:2153)
[2022-11-17 19:26:33.523 21146/T21162] FTL database cache size is 16.0 MiB
[2022-11-17 19:26:33.523 21146/T21162] getDatabaseHostname(): "SELECT interface FROM network JOIN network_addresses ON network_addresses.network_id = network.id WHERE network_addresses.ip = ? AND interface != 'N/A' AND interface IS NOT NULL;" with ? = "10.0.1.57"
[2022-11-17 19:26:33.524 21146/T21162] Found database interface 10.0.1.57 -> eth0
[2022-11-17 19:26:33.524 21146/T21162] Closing FTL database in getIfaceFromIP() (/__w/FTL/FTL/src/database/network-table.c:2223)
[2022-11-17 19:26:33.527 21146/T21162] Opening FTL database in getMACfromIP() (/__w/FTL/FTL/src/database/network-table.c:1834)
[2022-11-17 19:26:33.529 21146/T21162] FTL database cache size is 16.0 MiB
As it is per-connection, we log it once for every opened connection. Any suggestions? |
Only log it for the first database connection (per-file). E.g. during start-up somewhere along
and
|
The only reason I see to log many times is if this value can vary for each connection during runtime. |
Signed-off-by: DL6ER <[email protected]>
d10887d
to
fe4077e
Compare
This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there: https://discourse.pi-hole.net/t/pi-hole-ftl-v5-20-and-web-v5-18-released/59959/1 |
What does this implement/fix?
Increase default SQLite3 cache size from ~2 MiB to 16 MiB
FTL heavily relies on the B-tree of the databases to be able to make extremely fast decisions if a domain is to be blocked. The speed-up between a B-tree and a conventional lookup is on the order of
O(N/log(N))
so something like 7,000x for 120,000 gravity-blocked domains or roughly 100,000x for two million domains.For this to be efficient, FTL needs to have sufficiently fast access to the B-tree. By default, SQLite3 uses an upper limit for the cache of roughly 2 MiB. This PR increases this limit to 16 MiB. Given how much memory FTL uses overall, this seems an acceptable amount.
One further technical detail here: One domain takes roughly 32 bytes of memory in the index (it's a bit more because of bookkeeping things). So the default cache is only sufficient for about 60,000 domains. Increasing this to 16 MiB will make enough room for at least 480,000 domains.
You may ask "but can FTL be as fast with a cache is too small at the moment?" and this is indeed a very good question. Even when the cache is too small to keep the entire tree in memory, the kernel sees that you're reading from the file
gravity.db
really really often and decides to cache the file for you. It's the kernel's way to make intelligent use of "free" memory. So - in a way - the table already completely resides in memory just now.So what is this PR going to change? Well, when the entire file is kept in the kernel's cache, we still have to interact with the database file and read the file and do all the interactions using I/O whereas, when the cache is sufficiently large, SQLite3 will be able to cache the tree itself and does not need to read the file ever and ever again.
What's the benefit? It's really hard to measure. It can surely be seen as a kind of micro-optimization but it something that basically comes for free and allows us to notably reduce I/O that has to go through the kernel and, hence, reduces syscalls and overall latency.
Related issue or feature (if applicable): N/A
Pull request in docs with documentation (if applicable): N/A
By submitting this pull request, I confirm the following:
git rebase
)Checklist: