reduce number of parallel data retrieving tasks #33
base: master
Conversation
```diff
  raise Reject(sys.exc_info()[1], requeue=False)

  # Create a lock so we don't try to run the same task multiple times
  sdat = date.strftime('%Y-%m-%d') if date else 'ALL'
- lock_id = '{0}-lock-{1}-{2}-{3}'.format(__name__, fitbit_user, _type, sdat)
+ cats = '-'.join('%s' % i for i in categories)
+ lock_id = '{0}-lock-{1}-{2}-{3}'.format(__name__, fitbit_user, cats, sdat)
  if not cache.add(lock_id, 'true', LOCK_EXPIRE):
```
@brad I don't think we can use the Django cache for the lock and guarantee that it will work with the various setups that people are likely to have. For example, Django's default cache backend is local-memory caching, which is a per-process cache. Depending on the celery setup, this code can be executed by more than one process, each of which would have its own cache and be unable to see the locks created by the other workers.
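To illustrate the failure mode, here is a plain-Python sketch that models a per-process cache as one dict per worker (all names are illustrative, not part of django-fitbit). Because each "worker" has its own store, both successfully `add()` the same lock key:

```python
def make_locmem_cache():
    """Model Django's default LocMemCache: one dict per process."""
    store = {}

    def add(key, value):
        # cache.add() semantics: set the key only if it is absent,
        # returning whether the caller "won" the lock.
        if key in store:
            return False
        store[key] = value
        return True

    return add

# Two celery worker processes, each with its own local-memory cache.
worker_a = make_locmem_cache()
worker_b = make_locmem_cache()

lock_id = 'tasks-lock-1-steps-2015-01-01'  # illustrative lock key
print(worker_a(lock_id, 'true'))  # True: worker A takes the lock
print(worker_b(lock_id, 'true'))  # True: worker B "takes" it too
```

With a shared backend (e.g. memcached or Redis) both workers would hit the same store and the second `add()` would fail as intended; with local-memory caching the guard is a no-op across processes.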
@grokcode What can I do then? Would it be safe to get rid of this lock and decorate `get_fitbit_data` with `@transaction.atomic()`?
@brad I think the easiest solution is to punt on it for now and make a note in the README here that the fitbit tasks shouldn't be run concurrently, and then give an example of a way to set up celery to do that. I think we can use celery's manual routing feature to create a new queue and then when starting celery, make sure there is only one thread working on that queue.
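For reference, the routing setup described above might look like the following with pre-4.0 Celery settings (the `fitbit` queue name and the task path are assumptions for illustration, not the project's actual configuration):

```python
# settings.py (hypothetical): route the fitbit task to its own queue.
CELERY_ROUTES = {
    'fitapp.tasks.get_fitbit_data': {'queue': 'fitbit'},
}

# Then run a dedicated worker with a single thread for that queue,
# so the routed tasks can never execute concurrently:
#
#   celery -A myproject worker -Q fitbit --concurrency=1
```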
It would be much nicer to support concurrent tasks (but trickier too). I think we can do the locking with the DB. One idea is to store the lock in the DB, use the `@transaction.atomic()` decorator like you said, and Django's `select_for_update()` to acquire the lock. I think it would be enough to have one lock per user, so that only one user's tasks can execute at a time. That way we shouldn't have multiple processes trying to renew the token at the same time and stepping on each other.
@orcasgit/orcas-developers Please review. This greatly reduces the number of tasks we have running simultaneously, making it far less likely that conflicts will result in bad refresh tokens.