Firestore client does not retry on contention errors #827
Comments
@mad-it Thanks for filing this issue. We currently retry transactions only when the commit fails. Our assumption is that any error that occurs within a transaction block is thrown by the user with the intent to abort the transaction. As shown in your logs, that logic is a little bit oversimplified. We will update the client to ignore errors during document reads. Please bear with us though as we are approaching the holidays and will not be able to address this beforehand.
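The distinction described in this comment, contention raised while reading versus an error the user throws to abort, can be illustrated with a minimal sketch. This is not the client's actual implementation: `contentionError`, `isRetryable`, `runWithRetry` and the `retryable` flag are hypothetical names, and the loop is kept synchronous only so the control flow is easy to follow (real transactions are asynchronous).

```typescript
// Hypothetical helper: builds an error flagged as contention-style
// (for example an ABORTED status) rather than a deliberate user abort.
function contentionError(message: string): Error {
  const err = new Error(message) as Error & { retryable?: boolean };
  err.retryable = true;
  return err;
}

// Hypothetical classifier: only errors carrying the flag are retried.
function isRetryable(err: unknown): boolean {
  return err instanceof Error &&
    (err as Error & { retryable?: boolean }).retryable === true;
}

// Runs `updateFunction` up to `maxAttempts` times. A retryable error restarts
// the attempt wherever it was raised (read or commit); any other error is
// treated as a deliberate abort and rethrown immediately.
function runWithRetry<T>(updateFunction: () => T, maxAttempts: number): T {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return updateFunction();
    } catch (err) {
      if (!isRetryable(err)) {
        throw err; // user-initiated abort: do not retry
      }
      lastError = err; // contention: remember it and try again
    }
  }
  throw lastError; // attempts exhausted: surface the last error
}
```

With this shape, an update function that hits contention on its first two reads and succeeds on the third returns normally, while a plain `Error` thrown by user code stops the loop after a single call.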
This will be part of the 3.3 release.
@schmidt-sebastian Thanks for the fix, it looks like it's actually retrying now! 🎉 It seems that there is still an issue though. I ran a trace once again to verify what is happening, and it seems that the transaction is retried with the exponential backoff introduced in #847, but we are getting contention on re-initializing the transaction. Added some sanitized logs.
@mad-it What is the write rate for the document in question? It is a bit surprising to me that the contention issue exists for so long. We may need to investigate with the backend team why your transactions are failing even with retry. The actual failure that you see is expected though. We only retry five times, after which we will forward the last error to the API surface.
@schmidt-sebastian The document in question is only written to by 1 or 2 instances at most. But the issue does not seem to be the document being written to; it seems to be that Firestore cannot acquire a lock (an assumption on my part) to start a transaction. In the log I purposely left a line stating "Transaction started", which is the first thing that happens in the so-called "updateFunction". That line is only printed once, meaning the provided "updateFunction" is never retried because re-initializing the transaction scope fails.
Do you mind filing a customer support request that contains your project ID, the document path and approximate timestamps? https://cloud.google.com/support/ I am not able to look at the backend logs, but our support team can on request.
We have received all necessary information.
Has any new information come to light regarding this issue?
Our backend team took a look at the request logs. It sounds like we didn't really solve the problem yet. What they are seeing is the following sequence of events:
This re-reading is done because we changed the configuration. The retries are now happening, but they are hidden from our SDK and take place in the networking layer. Instead, we need to modify the SDK to do these retries ourselves so we can roll back the previous transaction. This would then look as such:
Please bear with us as we make the necessary changes.
Ran a small test to see if this was fixed. I am not sure it's even retrying anymore. Here are some sanitized logs.
After some more investigation, it seems to be working. One thing I did notice is that retries do not get a backoff (not sure if this is intentional or not). Since I cannot reopen the issue, ccing @thebrianchen to increase visibility. Added some sanitized logs to showcase that in some instances it does work 🎉 👍
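For reference, a backoff between retries is typically computed along the following lines. This is only a sketch: `backoffDelayMs` is a hypothetical helper, and the initial delay, growth factor, cap and jitter range are illustrative assumptions, not the settings the Firestore client actually uses.

```typescript
// Computes how long to wait before retry number `attempt` (1 = first retry).
// All default values are illustrative assumptions, not the client's settings.
function backoffDelayMs(
  attempt: number,
  initialMs: number = 1000,
  factor: number = 1.5,
  maxMs: number = 60000,
  jitter: () => number = Math.random
): number {
  // The initial try (attempt 0 or less) is not delayed.
  if (attempt <= 0) {
    return 0;
  }
  // Exponential growth, capped at maxMs.
  const base = Math.min(maxMs, initialMs * Math.pow(factor, attempt - 1));
  // Spread the delay over [0.5 * base, 1.5 * base) so that colliding
  // instances do not all retry at the same moment.
  return base + (jitter() - 0.5) * base;
}
```

Passing a fixed jitter source such as `() => 0.5` removes the randomness, which makes the growth easy to inspect: the delay then equals the capped exponential base for each attempt.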
Sorry for the lack of responses. We have had a holiday weekend here and are still digging out from our emails. I will re-open this and we will try to find a fix as soon as we can.
Current behavior
We are using Google Cloud Functions in combination with Firestore. When multiple Cloud Function instances (2 in this example) attempt to process different data on the same documents within a transaction, one of the instances throws a contention error. This is in line with what is documented here.
Expected behavior
Instances A and B both read document X inside of a transaction. Instance A updates document X before instance B can commit its changes. The expected behavior would be for the transaction on instance B to be retried, because it cannot commit its changes now that its read of X has become a "dirty read".
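The expected sequence can be sketched with a small in-memory model. Everything here is hypothetical (`InMemoryStore`, `simulateCollision`, the version counter); it models the conflict with a simple version check and is not a description of how Firestore detects contention or takes locks internally.

```typescript
interface VersionedDoc {
  version: number;
  value: number;
}

// Hypothetical single-document store with a version check on commit.
class InMemoryStore {
  private doc: VersionedDoc = { version: 0, value: 0 };

  // Returns a copy so callers hold a snapshot, not the live document.
  read(): VersionedDoc {
    return { version: this.doc.version, value: this.doc.value };
  }

  // Succeeds only if the document is unchanged since `readVersion`.
  commit(readVersion: number, newValue: number): boolean {
    if (readVersion !== this.doc.version) {
      return false; // stale read: the caller has to retry
    }
    this.doc = { version: this.doc.version + 1, value: newValue };
    return true;
  }
}

// Replays the scenario: A and B read X, A commits first, B's first commit is
// rejected, and B succeeds after re-reading and re-applying its change.
function simulateCollision(): { firstCommitOfB: boolean; finalValue: number } {
  const store = new InMemoryStore();
  const readA = store.read();
  const readB = store.read();

  store.commit(readA.version, readA.value + 1); // instance A wins the race
  const firstCommitOfB = store.commit(readB.version, readB.value + 10);

  // Retry for instance B: read again, then apply the same change.
  const rereadB = store.read();
  store.commit(rereadB.version, rereadB.value + 10);

  return { firstCommitOfB, finalValue: store.read().value };
}
```

In this model, B's first commit is rejected and the retry lets both changes land, which is the outcome the issue asks the client to produce automatically.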
Environment details
@google-cloud/firestore
version: 2.6.1
Steps to reproduce
We don't have clear reproduction steps, but after some investigation of the code, the culprit seems clear.
Looking at this line in the code, it looks like contention caused inside of the `updateFunction`, for example by reading a document, will never trigger a retry.
Additional information
Below is a log output of 2 functions "colliding" and no retry being triggered.
The logs have been sanitized by removing all project references, document structures and data being sent over the wire.
Logs