Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fed shares scanner batched cronjob #37391

Merged
merged 1 commit into from
May 19, 2020

Conversation

mrow4a
Copy link
Contributor

@mrow4a mrow4a commented May 14, 2020

Problem

This PR is solving the issues caused by PollIncommingShares command where simultaneous browsing in UI, syncing with sync client and issuing poll-command could cause stale filecache entries - https://github.com/owncloud/enterprise/issues/3902

Fed shares scanner

ScanExternalShares is a background job used to run the external shares
scanner over external shares that are eligible for scanning,
to ensure integrity of the file cache. Scanner will search for external shares
that satisfy the below requirements:

  • ensure that within single cron run, at max [cronjob_scan_external_batch]
    scans will be performed out of all accepted external shares
  • scan of that external share has not been performed within
    last [cronjob_scan_external_min_scan] seconds
  • user still exists, and has been active recently, meaning logged in at
    least [cronjob_scan_external_min_login] seconds ago
  • external share root etag/mtime changed, signaling that remote changed
    and requires scan

How tested

  • unit tests
  • manual with occ background:queue:execute --force --accept-warning 12

How to enable

  • command line
# required
occ config:app:set files_sharing cronjob_scan_external_enabled --value yes

# optionaly 
occ config:app:set files_sharing cronjob_scan_external_min_login --value 14400
occ config:app:set files_sharing cronjob_scan_external_min_scan --value 3600
occ config:app:set files_sharing cronjob_scan_external_batch --value 100
  • web ui

image

Notes

@micbar Target 10.5
@micbar might need documentation update in https://doc.owncloud.com/server/admin_manual/configuration/files/federated_cloud_sharing_configuration.html

@mrow4a mrow4a requested a review from VicDeo May 14, 2020 21:43
@mrow4a mrow4a self-assigned this May 14, 2020
@update-docs
Copy link

update-docs bot commented May 14, 2020

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@mmattel
Copy link
Contributor

mmattel commented May 15, 2020

Pls run: make test-php-style-fix to fix coding style error(s)

@mrow4a mrow4a force-pushed the feature/fed-shares-cronjob branch from d2a5355 to 8c1c3df Compare May 17, 2020 10:59
@mrow4a mrow4a force-pushed the feature/fed-shares-cronjob branch 2 times, most recently from 4663fb8 to 35affe6 Compare May 17, 2020 13:20
@mrow4a mrow4a marked this pull request as ready for review May 17, 2020 13:24
@mrow4a mrow4a requested a review from VicDeo May 17, 2020 13:25
@mrow4a
Copy link
Contributor Author

mrow4a commented May 17, 2020

@micbar do you want to fit this in 10.5?

@mrow4a mrow4a force-pushed the feature/fed-shares-cronjob branch from 35affe6 to a5d85b0 Compare May 17, 2020 15:26
@mrow4a mrow4a requested a review from phil-davis May 17, 2020 15:27
@micbar micbar requested a review from DeepDiver1975 May 18, 2020 08:09
@micbar micbar added the p2-high Escalation, on top of current planning, release blocker label May 18, 2020
@mrow4a mrow4a force-pushed the feature/fed-shares-cronjob branch from a5d85b0 to ae924f1 Compare May 18, 2020 11:49
@mrow4a mrow4a force-pushed the feature/fed-shares-cronjob branch from ae924f1 to b7858c1 Compare May 18, 2020 11:54
@mrow4a mrow4a requested a review from VicDeo May 18, 2020 11:55
@DeepDiver1975 DeepDiver1975 merged commit bb488fd into master May 19, 2020
@delete-merged-branch delete-merged-branch bot deleted the feature/fed-shares-cronjob branch May 19, 2020 11:55
@mrow4a
Copy link
Contributor Author

mrow4a commented Jun 6, 2020

@pmaier1 @cdamken

Migrations when upgrading from 10.4 + potential impact on upgrade duration (small/mid/high)

MID - we are adding a new column to mid-sized table, but we do not update/alter the values

Replaces former background job? If yes, is it still available or not?

It replaces former occ command that was scheduled by admins via crontab. It will be still avaiable (depreciated), but is discouraged. If admin enables checkbox in admin settings, he SHOULD remove depreciated job from crontab

Is the checkbox in admin settings the same as occ command ?

Yes, this is the same config.

Does it need to be added to crontab?

No. This is owncloud cronjob that does not require setting crontab entry, as soon as admin enables checkbox, cronjob will go in effect in intervals of 10min.

Docs?

Seems some work already started (owncloud/docs#2648), but generally:

  1. Fed share scan cronjob is disabled by default, to enable use occ or checkbox:
occ config:app:set files_sharing cronjob_scan_external_enabled --value yes

  1. To customize minimum amount of hours when last login of user happened so scan is triggered (ensures only active users get fed shares synced) - 24h
occ config:app:set files_sharing cronjob_scan_external_min_login --value <integer-seconds>
  1. To customize minimum amount of hours when last scanned so next scan is triggered (avoid frequent scan when active collaboration) - 3h
occ config:app:set files_sharing cronjob_scan_external_min_scan --value <integer-seconds>
  1. To customize maximum amount of fed share checks per 10 minutes - unlimited
currently not limited, amounts to numbers of fed shares in the system
  1. To customize maximum amount of fed share scans per 10 minutes (scan performed only if fed share files got updated) - 100
occ config:app:set files_sharing cronjob_scan_external_batch --value <integer-number>

@pmaier1
Copy link
Contributor

pmaier1 commented Jun 8, 2020

Thanks @mrow4a.

  • Regarding the list (1-5): Is the last value in the line (e.g., 24h) the default value?
  • Regarding 3.: Does this mean that a fed share that was scanned within the last e.g., 3h will not be scanned again until the configured period has passed?
  • Regarding 4./5.: What's the difference between fed share check and fed share scan? Check only discovers changes and scan actually updates the meta data? If yes, this is also an improvement compared to the previous command as that one only did the discovery, right?

@mrow4a
Copy link
Contributor Author

mrow4a commented Jun 9, 2020

@pmaier1

  • Regarding the list (1-5): Is the last value in the line (e.g., 24h) the default value?

Yes, this is default value currently configured

  • Regarding 3.: Does this mean that a fed share that was scanned within the last e.g., 3h will not be scanned again until the configured period has passed?

yes, fed share can be scanned only once per 3h (by default).

  • Regarding 4./5.: What's the difference between fed share check and fed share scan? Check only discovers changes and scan actually updates the meta data? If yes, this is also an improvement compared to the previous command as that one only did the discovery, right?

check -> operation of checking if fed share should be scanned (last user login and last fed shares scan time)
scan -> only fed shares for which user logged-in in last 24h and last scan of that share happened at least 3h ago are scanned (scan is operation of checking storage root for change, and if changed syncing filecache of remote with local). Currently, we scan only in batches of 100 fed shares per 10 minutes, remembering the offset at which it finished last app.files_sharing.cronjob_scan_external_offset

The difference with previous job is that previously it "polled" all the fed shares with each command execution. Poll was adjusting root etag of the folder (helping discovery on client). The main problem with "polling" was that in some corner cases it could case issues with sync/web client change discovery (ref owncloud/enterprise#3902). Now instead of "adjusting root etag from remote to local", we "sync whole tree from remote to local", which is much more expensive operation, but we have no other choice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature:federated-cloud-sharing p2-high Escalation, on top of current planning, release blocker
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants