Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce foreign keys / cascading delete #13143

Closed
PVince81 opened this issue Jan 7, 2015 · 28 comments
Closed

Introduce foreign keys / cascading delete #13143

PVince81 opened this issue Jan 7, 2015 · 28 comments
Assignees

Comments

@PVince81
Copy link
Contributor

PVince81 commented Jan 7, 2015

Currently it is quite difficult to keep the database clean on deletion.
If we had foreign keys we could use the cascading delete feature of database to automatically remove orphaned entries.

Some examples:

Also it would help to maintain a stronger data integrity and avoid issues like having storage entries (the known legacy storage case) that do not match users file cache entries any more, etc.
It would also prevent potential bugs to do partial manipulations in the DB or integrity breaking operations.

Open questions:

  1. Should external apps also define foreign keys that might point to core's tables ? (a core table should not have foreign keys to external tables though)
  2. What was the initial though process for deciding to not use foreign keys in the first place ? (at least so we can have it documented here)

CC @DeepDiver1975 @karlitschek

@PVince81
Copy link
Contributor Author

PVince81 commented Jan 7, 2015

also @icewind1991 @bantu @th3fallen

@DeepDiver1975
Copy link
Member

Data integrity is important and the job of the DBMS - not the application code.

We need to take a closer look on our supported databases with respect if they support FKs with cascade delete. MyISAM comes to my mind which does not support FKs - question is if we can work around this.

@DeepDiver1975
Copy link
Member

  1. Should external apps also define foreign keys that might point to core's tables ? (a core table should not have foreign keys to external tables though)

This is indeed an interesting question - generally speaking we should not allow apps to directly access tables out side the scope of their own app - which results in using FKs should not be allowed as well.

This would kind of declare this topic irrelevant ....

  1. What was the initial though process for deciding to not use foreign keys in the first place ? (at least so we can have it documented here)

Basically because some DBMS don't support FKs

@PVince81
Copy link
Contributor Author

PVince81 commented Jan 8, 2015

I also realized that repair steps might become more complex over time, because they cannot always guess what other tables are pointing to file ids.

For example if an app decides to point any kind of metadata to specific file paths or file ids.
Then we write a repair step that clean file entries. The app's tables will not be auto-cleaned.
But if we allow apps to define foreign keys that point to core tables with cascading delete (to be discussed), then deleting a file entry would delete all associated metadata as well.

Recently we added tags/favorites for files. In theory we should verify ALL repair steps and make sure that all those that delete entries in oc_filecache also delete the matching oc_share and oc_vcategory_to_object entries.

Or the alternative: have three sets of repair steps:

  1. First run the repair steps that delete file cache entries
  2. Then run the repair steps that delete orphaned share entries (the ones orphaned by 1))
  3. Then run other repair steps that delete orphaned tags

If we had foreign keys that could happen automatically.

These are just thoughts that came to mind today.

@DeepDiver1975
Copy link
Member

refs #5868

@xificurk
Copy link

xificurk commented Jan 9, 2015

I think introducing FK into database schema is a good idea, it would solve a lot of headaches with data consistency.
MyISAM FK support can be (with some limitations) simulated by triggers. Imho the usage of MyISAM should be generally discouraged in favor of better db engines (mysql innodb, postgresql).

@PVince81
Copy link
Contributor Author

Another case where cascading delete would have helped: #13535

@PVince81
Copy link
Contributor Author

PVince81 commented Feb 9, 2015

For MySQL we have InnoDB I believe.

SQLite now supports them too: https://www.sqlite.org/foreignkeys.html

Postgres: http://www.postgresql.org/docs/8.3/static/tutorial-fk.html

Oracle: http://www.dba-oracle.com/t_foreign_key_indexing.htm

MSSQL: https://msdn.microsoft.com/en-us/library/ms189049.aspx

I suspect that the reason why OC wasn't designed with foreign keys was because SQlite didn't support them.

There are so many data integrity issues that could have been avoided, some of which we are still fighting with today.

But also enabling foreign keys now will surely reveal additional hidden bugs in the code, which will need to be fixed too. Often the code will assume that there isn't such key. So this isn't just a matter of adding the keys, but also fixing the code.

I suggest to proceed gradually and add foreign keys step by step, then fix the corresponding code (and do a lot of regression testing)

@DeepDiver1975
Copy link
Member

#13820 - might help us to get more control over the migrations in the future ...

@PVince81
Copy link
Contributor Author

Here are a few possible foreign key links we could establish:

Files

  • oc_filecache.storage -> oc_storages.numeric_id
  • oc_storages.owner -> oc_users.uid

Currently the user isn't referenced clearly in oc_storages, requires some restructuring: #11261

Shares

  • oc_share.file_source -> oc_filecache.fileid
  • oc_share.owner -> oc_users.uid
  • oc_share.share_with ?? that field can be either a user or a group

One challenge here is that oc_share is generic and not only for files. We might want to move file-related shares to a separate class. "oc_share" would be for basic share info and completed by joining with a new table "oc_file_share"

Also file_source is "/" + fileid, so this might require trimming.

  • oc_share_external.owner (or user?) -> oc_users.uid
  • Also at some point I suppose we'll have a local directory of remote users, that could be used too.

Tags

  • oc_vcategory_to_object.category+type -> oc_vcategory.category+type
  • oc_vcategory_to_object.objid -> oc_filecache.fileid (only when type="files")

But only if type is "files", might need separate tables too for file-related tags. (unless we want/can have foreign keys that join on both "type=files" and "objectid=fileid")

  • oc_filecache.mimetype -> oc_mimetypes.mimetype
  • Why do we have oc_filecache.mimepart ? Still needed ?

Users

  • oc_preferences.userid -> oc_users.uid
  • oc_privatedata.userid -> oc_users.uid
  • oc_vcategory.uid -> oc_users.uid
  • oc_group_admin.gid -> oc_groups.id
  • oc_group_member.gid -> oc_groups.id
  • oc_group_admin.uid -> oc_users.uid
  • oc_group_member.uid -> oc_users.uid

LDAP

  • oc_ldap_user_mapping.owncloudname -> oc_users.uid ? (@blizzz)
  • TODO: group mapping, etc

Apps

  • oc_preferences.appid -> oc_appconfig.appid
  • oc_privatedata.appid -> oc_appconfig.appid

Misc

  • oc_locks.userid <-> oc_users.uid
  • oc_locks.owner <-> oc_users.uid ?
  • oc_locks.uri ?? That could be a file id instead, pointing at oc_filecache.fileid
  • oc_properties.userid <-> oc_users.uid
  • oc_properties.propertypath ?? Should be file id instead, pointing at oc_filecache.fileid

@blizzz
Copy link
Contributor

blizzz commented Feb 11, 2015

oc_ldap_user_mapping.owncloudname -> oc_users.uid ? (@blizzz)

No, oc_users contains only and solely local ownCloud users.

@blizzz
Copy link
Contributor

blizzz commented Feb 11, 2015

But all the other cases where a uid is involved also may apply to oc_ldap_user_mapping.owncloudname , e.g. oc_preferences.userid

@wshadow
Copy link

wshadow commented Feb 27, 2015

A lot of this has been discussed in #5756:

I think adding FK support and finally using a database instead of a set of tables is long overdue. Just drop support for MyISAM and force those users to switch to InnoDB (maybe even integrate this into the update procedure if possible). I have been proposing this since OC3 and the only reason I remember why we did not do this already is because of @karlitschek s comment: "We have to stay compatible.". Which of course means MyISAM support.

I also think apps should be able to define their own FK and thereby force their tables to be consistent with core tables. I.E. YES to Open Question 1. Maybe force the use of delete+update cascade in this case so that the FK don't prevent queries run by the core.

Ps.: Owncloud 3 was released 31.01.2012 https://owncloud.org/changelog/
sqlite 3.6.19 (which introduced FK) was released 14.10.2009 http://www.sqlite.org/releaselog/3_6_19.html

@PVince81
Copy link
Contributor Author

If I remember well we already migrated users to InnoDB since OC 6 or 7. @icewind1991 is that right ?

@wshadow
Copy link

wshadow commented Feb 27, 2015

Another example of where FKs would be useful: #14476

@PVince81
Copy link
Contributor Author

When introducing foreign keys, it will also require repair code for people migrating from older versions.
The repair code needs to cleanup possible duplicates, orphaned entries, etc, before the foreign keys can be applied.
Which means that such tasks like #12323 #7874 need to be done first.

@wshadow
Copy link

wshadow commented Feb 27, 2015

Looks like #9492 indeed introduced the code to switch users to InnoDB. So we should be fine on that note.

@wshadow
Copy link

wshadow commented Feb 28, 2015

So for cleanup purposes we basically do this [1] on all relevant tables?

[1] http://stackoverflow.com/questions/3384127/delete-sql-rows-where-ids-do-not-have-a-match-in-another-table

@PVince81
Copy link
Contributor Author

PVince81 commented Dec 9, 2016

Is it at least ok to use foreign keys for newly added tables ?

(adding some for older tables is likely to break without proper cleanup of the data)

@PVince81
Copy link
Contributor Author

PVince81 commented Dec 9, 2016

So for cleanup purposes we basically do this [1] on all relevant tables?

Yes... This happens in a background job for things like orphaned shares.

@PVince81
Copy link
Contributor Author

Experimental: #28529

@PVince81
Copy link
Contributor Author

PVince81 commented Jun 7, 2018

@butonic @DeepDiver1975 should we try introducing foreign keys and cascading delete on a case by case basis, considering that some will need migrations and also potentially check existing data in case said data causes contraint violations ?

@patrickjahns
Copy link
Contributor

patrickjahns commented Jun 7, 2018

@PVince81
I'd like to raise a point here - that not having foreign key constraints // cascading deletes - or the need of requiring integrity to be handled on database level, would still potentially allow for sharding ownCloud's database.

Once we introduce the requirements of foreign keys / cascading deletes we will loose the ability to shard the database. Also foreign key constraints will likely be a performance hit on databases that focus on scalability (ref: https://www.cockroachlabs.com/docs/stable/foreign-key.html#performance )

if scalability is a concern - data integrity should be handled on application (ownCloud) side

@PVince81
Copy link
Contributor Author

PVince81 commented Jun 7, 2018

@patrickjahns thanks for your input. This could have been the reason why foreign keys were never added in the first place.

@phil-davis
Copy link
Contributor

In an ideal world there would be a database abstraction layer somewhere that allows the sharding (is that really a word?) "invisibly" while providing "all" the logical database features such as foreign key checks, primary key checks, cascading delete etc. The application layer(s) should be able to sit on top and not think about this stuff.

And such a layer should have minimal performance impact, and I see squadrons of pigs flying overhead.

@PVince81
Copy link
Contributor Author

PVince81 commented Dec 5, 2018

Can't introduce and rely on foreign keys for SQLite, because DBAL doesn't support them: see doctrine/dbal#1204 and doctrine/dbal#2833.

We'd need to get rid of SQLite support first: #29625

@PVince81
Copy link
Contributor Author

we have our first foreign keys in the oc_persistent_locks table and it seems shared hosters don't like it as they remove permissions to reference other tables: #34598 (comment)

This also needs get rid of support for shared hosters...

@micbar
Copy link
Contributor

micbar commented Sep 16, 2021

That would need a major release. The development of the ownCloud platform is happening in ocis.

@micbar micbar closed this as completed Sep 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants