Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CalDAV/CardDAV: put every method from backends that does multiple DB calls in transactions #36528

Merged
merged 2 commits into from
Apr 17, 2023

Conversation

tcitworld
Copy link
Member

@tcitworld tcitworld commented Feb 3, 2023

In a lot of methods we're doing read-after-writes (for instance calling updateProperties after touching calendar objects). There's also a lot of deleting methods that do stuff sequentially which could cause trouble.
This should avoid this kind of issues.

TODO

  • Fix tests

Checklist

@tcitworld tcitworld added bug 3. to review Waiting for reviews feature: caldav Related to CalDAV internals labels Feb 3, 2023
@tcitworld tcitworld added this to the Nextcloud 26 milestone Feb 3, 2023
@tcitworld tcitworld modified the milestones: Nextcloud 26, Nextcloud 27 Feb 3, 2023
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Psalm found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@tcitworld tcitworld marked this pull request as ready for review February 10, 2023 11:18
@tcitworld tcitworld requested review from a team, ArtificialOwl, icewind1991 and blizzz and removed request for a team February 10, 2023 13:17
@tcitworld
Copy link
Member Author

Psalm issues are existing ones.

Copy link
Member

@ChristophWurst ChristophWurst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best enjoyed as https://github.com/nextcloud/server/pull/36528/files?w=1

👍 Good stuff. This might not even fix nasty database issues but improve performance a bit if changes are committed in one transaction.

@tcitworld tcitworld changed the title refactor(dav): put every method from OCA\DAV\CalDAV\CalDAVBackend that does multiple DB calls in transactions CalDAV/CardDAV: put every method from backends that does multiple DB calls in transactions Feb 10, 2023
@tcitworld
Copy link
Member Author

tcitworld commented Feb 10, 2023

I added an extra commit 27e717c to use the QueryBuilder properly in the addChange methods.

@tcitworld tcitworld force-pushed the dav-backend-transations branch 2 times, most recently from 27e717c to 3f8b383 Compare February 13, 2023 08:34
Copy link
Contributor

@miaulalala miaulalala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omg-reaction-gif-8

@szaimen
Copy link
Contributor

szaimen commented Mar 16, 2023

@tcitworld would you mind fixing the conflicts? Then we should be able to merge this. I can also take over the rebase if you want. Thanks a lot! :)

@szaimen szaimen added 2. developing Work in progress and removed 3. to review Waiting for reviews labels Mar 16, 2023
@szaimen
Copy link
Contributor

szaimen commented Mar 20, 2023

They are shown as new since the affected code is now into callbacks.

so update the baseline then in order to fix psalm?

@tcitworld
Copy link
Member Author

It was a specific change in PHPDoc that didn't match the inherited class, my bad. Upstream PR at sabre-io/dav#1457

@ChristophWurst ChristophWurst added 4. to release Ready to be released and/or waiting for tests to finish and removed 3. to review Waiting for reviews labels Mar 28, 2023
@szaimen
Copy link
Contributor

szaimen commented Apr 17, 2023

/rebase

@szaimen szaimen enabled auto-merge April 17, 2023 14:52
…does multiple DB calls in transactions

In a lot of methods we're doing read-after-writes (for instance calling
updateProperties after touching calendar objects).
There's also a lot of deleting methods that do stuff sequentially which
could cause trouble.
This should avoid this kind of issues.

Signed-off-by: Thomas Citharel <[email protected]>
…rementing synctoken

Now that we're in a transaction, we can reuse the sync token's previous value without trouble, and rewrite the increment UPDATE query without dirty direct SQL.

Signed-off-by: Thomas Citharel <[email protected]>
@szaimen szaimen merged commit 66ab45b into master Apr 17, 2023
@szaimen szaimen deleted the dav-backend-transations branch April 17, 2023 22:55
@tcitworld
Copy link
Member Author

Goodbye rebase issues :)

@nickvergessen
Copy link
Member

Goodbye rebase issues :)

Hello DB issues :(

Transaction commit failed because the transaction has been marked for rollback only.

We have quite some log entries like this in the meantime.
Stacktrace section:

      {
        "file": "…/3rdparty/sabre/dav/lib/CalDAV/Schedule/Plugin.php",
        "line": 538,
        "function": "put",
        "class": "Sabre\\CalDAV\\CalendarObject",
        "type": "->",
        "args": [
          "*** sensitive parameters replaced ***"
        ]
      },
      {
        "file": "…/apps/dav/lib/CalDAV/Schedule/Plugin.php",
        "line": 179,
        "function": "scheduleLocalDelivery",
        "class": "Sabre\\CalDAV\\Schedule\\Plugin",
        "type": "->",
        "args": [
          [
            "Sabre\\VObject\\ITip\\Message",
            "92c36c40-48fe-483b-9cff-…",
            "VEVENT",
            "REQUEST",
            1,
            "mailto:b…[email protected]",
            "*** sensitive parameters replaced ***",
            "mailto:s…[email protected]",
            "*** sensitive parameters replaced ***",
            "*** sensitive parameters replaced ***",
            [
              "Sabre\\VObject\\Component\\VCalendar",
              "*** sensitive parameters replaced ***",
              "VCALENDAR"
            ],
            false
          ]
        ]
      },
      {
        "file": "…/3rdparty/sabre/event/lib/WildcardEmitterTrait.php",
        "line": 89,
        "function": "scheduleLocalDelivery",
        "class": "OCA\\DAV\\CalDAV\\Schedule\\Plugin",
        "type": "->",
        "args": [
          [
            "Sabre\\VObject\\ITip\\Message",
            "92c36c40-48fe-483b-9cff-…",
            "VEVENT",
            "REQUEST",
            1,
            "mailto:b…[email protected]",
            "*** sensitive parameters replaced ***",
            "mailto:s…[email protected]",
            "*** sensitive parameters replaced ***",
            "*** sensitive parameters replaced ***",
            [
              "Sabre\\VObject\\Component\\VCalendar",
              "*** sensitive parameters replaced ***",
              "VCALENDAR"
            ],
            false
          ]
        ]
      },

@tcitworld
Copy link
Member Author

Interesting. Sounds like we need savepoints to make sure inner transactions don't explode outer ones.
https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/transactions.html#emulated-transaction-nesting-with-savepoints

Another suspicious issue might be #38902 (comment)

tcitworld added a commit that referenced this pull request Jul 27, 2023
Using nested transactions without savepoints is actually deprecated by Doctrine:
https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/transactions.html#transaction-nesting

Without savepoints, a nested transaction can be rollbacked but not
handled properly in the "real" transaction, leading to the following
error:
Transaction commit failed because the transaction has been marked for rollback only.

Ref
#36528 (comment)
(and possibly) #38902 (comment)

Signed-off-by: Thomas Citharel <[email protected]>
tcitworld added a commit that referenced this pull request Oct 6, 2023
Using nested transactions without savepoints is actually deprecated by Doctrine:
https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/transactions.html#transaction-nesting

Without savepoints, a nested transaction can be rollbacked but not
handled properly in the "real" transaction, leading to the following
error:
Transaction commit failed because the transaction has been marked for rollback only.

Ref
#36528 (comment)
(and possibly) #38902 (comment)

Signed-off-by: Thomas Citharel <[email protected]>
tcitworld added a commit that referenced this pull request Feb 1, 2024
Using nested transactions without savepoints is actually deprecated by Doctrine:
https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/transactions.html#transaction-nesting

Without savepoints, a nested transaction can be rollbacked but not
handled properly in the "real" transaction, leading to the following
error:
Transaction commit failed because the transaction has been marked for rollback only.

Ref
#36528 (comment)
(and possibly) #38902 (comment)

Signed-off-by: Thomas Citharel <[email protected]>
AndyScherzinger pushed a commit that referenced this pull request Feb 27, 2024
Using nested transactions without savepoints is actually deprecated by Doctrine:
https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/transactions.html#transaction-nesting

Without savepoints, a nested transaction can be rollbacked but not
handled properly in the "real" transaction, leading to the following
error:
Transaction commit failed because the transaction has been marked for rollback only.

Ref
#36528 (comment)
(and possibly) #38902 (comment)

Signed-off-by: Thomas Citharel <[email protected]>
@DimanNe
Copy link

DimanNe commented Apr 14, 2024

@tcitworld Do you remember why you used non retry-able atomic()?

In AppFramework/Db/TTransactional.php there is atomicRetry. Would not it make more sense to use that?

  • What if PG/MySQL on a different host and there is a (temporary) network issue/disconnect?
  • What if a user uses SQLite, which limits concurrency? <--- this is actually my case.

The reason I am asking is that tasks & calendars in nextcloud are essentially broken for me: almost 10-30% of events/tasks creation and/or moving fail for me in any bulk task/calendar operation (such as calendar import or move tasks from one task list to another) with the following error:

Import partially failed General error: 5 database is locked

message":"An exception occurred while executing a query: SQLSTATE[HY000]: 
General error: 5 database is locked","exception":{},"CustomMessage":"An exception occurred while executing a query:
SQLSTATE[HY000]: General error: 5 database is locked"

In the logs I have this backtrace


I think retry-able atomic has a flaw/issue in implementation: any retry is supposed to have some backoff policy (like truncated exponential backoff), but atomicRetry does not have one.


P.S.

I managed to fix my error by replacing atomic() with the following code:

protected function atomic(callable $fn, IDBConnection $db) {
    $maxRetries = 10;
    $attempts = 0;

    while (true) {
        $db->beginTransaction();
        try {
            $result = $fn();
            $db->commit();
            return $result;
        } catch (Throwable $e) {
            $db->rollBack();
            $attempts++;
            error_log("Attempt $attempts of $maxRetries failed. Exception: " . $e->getMessage() . " in " . $e->getFile() . " on line " . $e->getLine());

            if ($attempts >= $maxRetries) {
                throw $e;
            }

            $sleepMs = rand(1, 1000);
            usleep($sleepMs * 1000);
        }
    }
}

@tcitworld
Copy link
Member Author

Do you remember why you used non retry-able atomic()?

In AppFramework/Db/TTransactional.php there is atomicRetry. Would not it make more sense to use that?

I have not looked into it, but this method simply didn't exist at the time (#38030 was merged two months after this one was).

@DimanNe
Copy link

DimanNe commented Apr 14, 2024

@DimanNe
Copy link

DimanNe commented Apr 14, 2024

@tcitworld I see... Thank you for quick response.

Can you suggest what would the best way to fix the root cause of the issue (which is lack of retries with backoff policy) be?

Shall I create a new / separate issue?

@nickvergessen
Copy link
Member

Well a retry at this place won't help if the problem is network issues, as all the other queries including the initial appconfig load will not be retried as well.
If you have concurrency issues with sqlite the fix is to use a proper DB instead.
It's even mentioned on https://docs.nextcloud.com/server/latest/admin_manual/installation/system_requirements.html and other sections even skip sqlite from the supported lists....

@DimanNe
Copy link

DimanNe commented Apr 14, 2024

@nickvergessen Thank you. Good points. I did not know about SQLite being discouraged.
(In this case, I guess, there is no point in creating any issue(s))

susnux added a commit that referenced this pull request May 22, 2024
Partial backport of #36528

Signed-off-by: Ferdinand Thiessen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4. to release Ready to be released and/or waiting for tests to finish bug feature: caldav Related to CalDAV internals
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants