db on-disk space usage increased v1.58+ #12975
Comments
I think you've basically found the right conclusions.
I'm not sure whether this table could be periodically cleaned up. I don't know if that would mean that all users in the room need to sync past that point to receive the update, or whether it's just serving as a faster cache for data that could be found out otherwise. I've asked.
As for whether you are 'holding it wrong', so to speak: I don't think you're doing anything too offensive, but it's not working out well compared to the optimisations put in place for more 'typical' usage patterns. From
Based on what that says, I think you can configure your script with a device ID and use the same one each time. If you log in with the same device ID, it sounds like it should re-use that device (and therefore it shouldn't produce a change).
(BTW if you don't specify a device ID, the server will generate one for you, so you don't need to worry about that part yourself.)
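A minimal sketch of what logging in with a fixed device ID could look like, assuming password login via the client-server API endpoint POST /_matrix/client/v3/login. The user ID, password, and device ID below are placeholders, and `build_login_body` is a hypothetical helper, not part of any library; the snippet only builds the request body and does not contact a server.

```python
import json


def build_login_body(user, password, device_id):
    """Build the JSON body for POST /_matrix/client/v3/login.

    Passing the same device_id on every login asks the server to re-use
    that device instead of creating a new one each time.
    """
    return {
        "type": "m.login.password",
        "identifier": {"type": "m.id.user", "user": user},
        "password": password,
        "device_id": device_id,
    }


# Placeholder values (assumptions for illustration only).
body = build_login_body("@provisioner:example.org", "secret", "PROVISIONER")
print(json.dumps(body, indent=2))
```

Whether this avoids new rows in the table for a given deployment would still need to be checked against the server.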
Login/outs count as changes, yes. I think there may be some other reasons they can change, usually to do with encryption.
Maybe. The admin API can let you get an access token for a user without creating an associated device, I think: https://matrix-org.github.io/synapse/latest/admin_api/user_admin_api.html#login-as-a-user
The admin API has some other functionality which may or may not do some of what you want, so if you weren't aware of it, it may be worth a look, but I don't think that makes this issue less valid.
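As a rough illustration of the 'login as a user' call linked above, the sketch below builds (but does not send) the request, assuming the documented path /_synapse/admin/v1/users/<user_id>/login and an admin access token in the Authorization header. The base URL, token, and user ID are placeholders, and `build_login_as_user_request` is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request


def build_login_as_user_request(base_url, admin_token, user_id, valid_until_ms=None):
    """Build a POST request for the Synapse admin 'login as a user' endpoint."""
    # The user ID contains '@' and ':' and must be percent-encoded in the path.
    path = "/_synapse/admin/v1/users/" + urllib.parse.quote(user_id, safe="") + "/login"
    # valid_until_ms is optional; an empty JSON object is sent when it is omitted.
    body = {} if valid_until_ms is None else {"valid_until_ms": valid_until_ms}
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + admin_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values (assumptions for illustration only).
req = build_login_as_user_request(
    "https://synapse.example.org", "ADMIN_TOKEN", "@alice:example.org"
)
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` should return a JSON object containing an `access_token`, per the linked documentation.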
Thanks for the quick and thorough reply!
I had tried this a while back by making a pool of device IDs for my async workers to share. I used a pool because each new login to a device ID would invalidate any concurrent session. This had more edges than I anticipated and I scrapped it for a simpler implementation. But I didn't think to re-use the 'session' (device ID + access token) everywhere. The only concern with reusing the session is dealing with a
Thanks for pointing out 'login as user'. I had forgotten that it existed. Will give that a try if sharing a session doesn't work. I've had other problems with the admin API, but they might be setup-related (random 404s).
I've created #13043 to track clearing out that table periodically. I don't have any great ideas of how else to make this better for you really, so I'm going to close this for now. Shout if there's anything else.
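One way the 'shared session' idea could be sketched: store the device ID and access token in a file and only log in when nothing is stored yet. Everything here is an assumption for illustration: `load_or_create_session` is a hypothetical helper, `fake_login` stands in for a real login call, and the sketch handles neither token invalidation nor locking between workers on different hosts.

```python
import json
import os
import tempfile


def load_or_create_session(path, login):
    """Return the stored session, calling login() only if none is stored."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    session = login()
    # Write to a temp file and rename, so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(session, f)
    os.replace(tmp, path)
    return session


calls = []


def fake_login():
    # Stand-in for a real login request; records how often it is called.
    calls.append(1)
    return {"device_id": "PROVISIONER", "access_token": "token-%d" % len(calls)}


with tempfile.TemporaryDirectory() as d:
    session_path = os.path.join(d, "session.json")
    first = load_or_create_session(session_path, fake_login)
    second = load_or_create_session(session_path, fake_login)

print(first == second, len(calls))
```

Two workers starting at the same moment could still both log in before either writes the file, so a real deployment would need some form of coordination.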
Description
Since upgrading past v1.58, the size of the database on disk has ballooned for my usage, specifically the device_lists_changes_in_room table.
Steps to reproduce
I suspect the problem is my specific usage of synapse: I use an automated script that creates rooms and manages adding/removing users from those rooms as needed. The script runs asynchronously and in parallel across hosts and processes.
The script logs in as a single administrative user (marked as 'admin' and in every room it creates) with a random device ID (to avoid collisions) and logs out after every use. This user is a member of each and every room on the system. In each environment I deploy there are 100k+ rooms.
As I understand #12321, a record of a user's rooms is kept after each 'change'. If a change were to affect my administrative user, its membership in every room is recorded. I assume a 'change' is a login/logout?
Tactical question: since I'm not federating, can I safely purge rows from device_lists_changes_in_room, or is there some short-term, destructive approach I can use to avoid running out of disk space?
Higher-level question: is the bug my usage? Is there a better way to accomplish what I want with synapse?
Version information
Version: 1.60.1
Install method: https://github.com/spantaleev/matrix-docker-ansible-deploy