
Deadline Exceeded 504 On Mass Edit #465

Closed
clive-h-townsend opened this issue May 31, 2020 · 5 comments

Comments

@clive-h-townsend

clive-h-townsend commented May 31, 2020

  • Operating System version: Linux Mint 19
  • Firebase SDK version: 4.3.0
  • Library version: Unsure (Latest?)
  • Firebase Product: firestore

Step 3: Describe the problem
I have a large set of users with a given set of data. I download the user data, repackage it, and repost it. Pretty straightforward. However, I can only run through about 600 users at a time before I get the following:

Traceback (most recent call last):
  File "/home/clive/repositories/refquest-pro/plus/python_files/PermissionsTranslator/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 96, in next
    return six.next(self._wrapped)
  File "/home/clive/repositories/refquest-pro/plus/python_files/PermissionsTranslator/venv/lib/python3.6/site-packages/grpc/_channel.py", line 416, in __next__
    return self._next()
  File "/home/clive/repositories/refquest-pro/plus/python_files/PermissionsTranslator/venv/lib/python3.6/site-packages/grpc/_channel.py", line 689, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1590900128.216480777","description":"Error received from peer ipv4:216.58.192.202:443","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "translate.py", line 49, in <module>
    for user in usersList:
  File "/home/clive/repositories/refquest-pro/plus/python_files/PermissionsTranslator/venv/lib/python3.6/site-packages/google/cloud/firestore_v1/query.py", line 775, in stream
    for response in response_iterator:
  File "/home/clive/repositories/refquest-pro/plus/python_files/PermissionsTranslator/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 99, in next
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded

Steps to reproduce:


Relevant Code:


import json

import firebase_admin
from firebase_admin import credentials, firestore


cred = credentials.Certificate('../serviceKey.json')
firebase_admin.initialize_app(cred)

db = firestore.client()

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Build a dictionary object of all master data

# Process through every account
accountsRef = db.collection(u'accounts').stream()

# Create the final account object
accountObject = {}

# For every account in the portfolio
for account in accountsRef:

    # Get a reference for all of the conferences in the account
    conferenceRef = db.collection(u'accounts').document(account.id).collection(u'conferences').stream()

    # The array of conferences
    conferenceList = []

    # For every conference in the list
    for conference in conferenceRef:

        # add it to the array
        conferenceList.append(conference.id)

    # next add the compiled array to the account Object
    accountObject[account.id] = conferenceList

# # # # # # # # Error In here somewhere # # # # # 
# Using an order_by and start_after here to restart after the timeout error kills the job
usersList = db.collection(u'users').order_by(u'email_address').start_after({u'email_address': u'[email protected]'}).stream()

i = 1
for user in usersList:

    # For a given user
    userRef = db.collection(u'users').document(user.id)
    userData = userRef.get().to_dict()
    # userData = user.to_dict()

    # Check they have a membership
    if ('membership' in userData):

        accountsList = userData['membership']

        permV2 = {}

        for refAccount in accountsList:

            flatConsortium = {}
            
            for confKey in userData['permissions']:

                # Block this consortium because it's exempt
                if (refAccount != 'kIM78AbSsscMHVgzGosk'):

                    if confKey in accountObject[refAccount]:

                        flatConsortium[confKey] = True

            permV2[refAccount] = flatConsortium
        
        userRef.update({u'permissionsV2': permV2})

    # print('{}'.format(permV2))
    print(str(i) + ': ' + userData['email_address'])
    i = i + 1




@google-oss-bot

I found a few problems with this issue:

  • I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.
  • This issue does not seem to follow the issue template. Make sure you provide all the required information.

@hiranya911
Contributor

Note that the Firestore client actually lives in the repo https://github.com/googleapis/python-firestore

You're likely to get a faster response if you report this there.

@hiranya911
Contributor

@crwilcox @BenWhitehead fya

@clive-h-townsend
Author

@hiranya911 Thanks for your comment. I guess I thought this was the best place to put the comment but if there is somewhere I should move it to, please let me know!

@wilhuff

wilhuff commented Jun 7, 2020

Since you're updating users individually, it seems unlikely that the individual updates are hitting the deadline. More likely, streaming the usersList is what's timing out.

In streaming Google APIs, including Firestore, the RPC deadline applies to the entire stream, not just the next result. This means that with the default deadline of 60 seconds, you must process all of the stream results within that window.

There are a few things you can do to address this:

  • Process results faster, e.g. by collecting all the results of usersList into a list separately from processing them. I wouldn't recommend this simple strategy if your list is large, though.
  • You could also extend the deadline with a configuration similar to the one described in another issue, though in this case you'd configure RunQuery as the RPC, not Listen. You can't make the RPC deadline infinite, though (I believe the max is less than 10 minutes), so this strategy will only buy you ~5-10x the results you're handling now.
  • Read results in a loop with a limited number of documents per query.

This last strategy will extend indefinitely (though you can combine it with the others). The idea is that you run the query with a limit and use the last document in the query to construct the start_after for the next query. This strategy will let you read all documents in a collection, no matter the collection size.
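A minimal sketch of that last strategy, reusing the `users` collection and `email_address` ordering from the script above (the `paged_users` name and `PAGE_SIZE` value are illustrative, not part of any API):

```python
PAGE_SIZE = 200  # documents per query; tune so each page finishes well inside the deadline

def paged_users(db, page_size=PAGE_SIZE):
    """Yield every document in the `users` collection one bounded query at a
    time, so no single RPC has to outlive the ~60s stream deadline."""
    query = (db.collection(u'users')
               .order_by(u'email_address')
               .limit(page_size))
    while True:
        docs = list(query.stream())  # each page is its own short-lived RPC
        if not docs:
            return
        for doc in docs:
            yield doc
        # The last snapshot of this page becomes the cursor for the next page.
        query = (db.collection(u'users')
                   .order_by(u'email_address')
                   .start_after(docs[-1])
                   .limit(page_size))
```

Replacing the single usersList query in the script above with `for user in paged_users(db):` keeps every RPC short; `start_after` accepts the last DocumentSnapshot directly, so no cursor field needs to be extracted by hand.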

@wilhuff wilhuff closed this as completed Jun 7, 2020