Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin kombu version #4075

Merged
merged 1 commit into from
Aug 16, 2019
Merged

Pin kombu version #4075

merged 1 commit into from
Aug 16, 2019

Conversation

arikfr
Copy link
Member

@arikfr arikfr commented Aug 16, 2019

What type of PR is this? (check all applicable)

  • Bug Fix

Description

kombu is a dependency of Celery and usually we let them declare the version, but their version is pinned and the most recent release (4.6.4) has a severe regression where workers stop responding to inspect commands.

kombu is a dependency of Celery and usually we let them declare the version, but their version is pinned and the most recent release (4.6.4) has a severe regression where workers stop responding to inspect commands.
@arikfr arikfr merged commit e5e926b into master Aug 16, 2019
@arikfr arikfr deleted the pin-kombu branch August 16, 2019 16:27
@jezdez
Copy link
Member

jezdez commented Aug 19, 2019

This is concerning to say the least. I'm assuming the issue upstream was actually caused by an oversight during the code review (possibly due to lack of active maintainers) but it shouldn't have been merged without review, too. Thanks for the rollback, this would have caused tons of problems on our side (we planned to ship v8beta today).

@arikfr
Copy link
Member Author

arikfr commented Aug 19, 2019

Thanks for the rollback, this would have caused tons of problems on our side (we planned to ship v8beta today).

It caused us lots of issues already, so everybody gets a more stable version 😅

I'm assuming the issue upstream was actually caused by an oversight during the code review (possibly due to lack of active maintainers) but it shouldn't have been merged without review, too.

It doesn't look like there was any code review. They found the issue -> reverted offending code base -> one of the maintainers wasn't happy that a root cause wasn't found and the revert was done without further investigation -> another maintainers goes and reverts the revert without applying another fix.

Being a maintainer myself (and doing my share of mistakes), I'm really trying not to judge, but as you said this is very concerning. Internally we been discussing again moving to Rq and I think this is probably the final straw.

@jezdez
Copy link
Member

jezdez commented Aug 19, 2019

Thanks for the rollback, this would have caused tons of problems on our side (we planned to ship v8beta today).

It caused us lots of issues already, so everybody gets a more stable version 😅

I'm sorry to hear that, thank you for biting the bullet with running the beta in prod already. I don't think we'd have such a degree in confidence in Redash otherwise.

I'm assuming the issue upstream was actually caused by an oversight during the code review (possibly due to lack of active maintainers) but it shouldn't have been merged without review, too.

It doesn't look like there was any code review. They found the issue -> reverted offending code base -> one of the maintainers wasn't happy that a root cause wasn't found and the revert was done without further investigation -> another maintainers goes and reverts the revert without applying another fix.

Being a maintainer myself (and doing my share of mistakes), I'm really trying not to judge, but as you said this is very concerning. Internally we been discussing again moving to Rq and I think this is probably the final straw.

Yeah, it's definitely an awkward mistake and I'll try to talk to the committer to get a bit more details how this happened. My guess is it's an honest mistake by an overwhelmed maintainer and a hard-to-test issue.

Besides that and the technical aspects of switching to RQ (e.g. less code complexity) I'm careful about assuming that switching to a different queue system removes the maintenance issue outright. But it would certainly be a much smaller surface where mistake such as that could happen. Plus, I'm the maintainer of the Flask-RQ2 package 😉

@arikfr
Copy link
Member Author

arikfr commented Aug 19, 2019

I'm careful about assuming that switching to a different queue system removes the maintenance issue outright.

I'm sure they won't, but what I hope to gain by the switch:

  1. A library where Redis isn't a second class citizen.
  2. Simpler code base, so when needed we can dig in and contribute.

Plus, I'm the maintainer of the Flask-RQ2 package

💪

@jezdez
Copy link
Member

jezdez commented Aug 19, 2019

I'm careful about assuming that switching to a different queue system removes the maintenance issue outright.

I'm sure they won't, but what I hope to gain by the switch:

1. A library where Redis isn't a second class citizen.

2. Simpler code base, so when needed we can dig in and contribute.

Yes, both are absolutely the case in my experience. One thing to note is that selwin, the current RQ maintainer and starter of the RQ GitHub org, has been very open to get help from other contributors. Which is a lot easier than learning the ins and outs of the Celery stack. Maybe something that would allow Redash to "own more of its stack"?

Of course there are also things that aren't possible with RQ, e.g. the advanced workflow techniques that Celery provides via its API: https://docs.celeryproject.org/en/latest/userguide/canvas.html

The good news is that most of RQs architecture can be overwritten with custom implementations (e.,g. queuing, jobs, workers, schedulers) based on a class-based system. There are limits to it, but I think it's a lot more flexible for custom behavior than Celery.

There are also a bunch of minor community projects on top of RQ that may be interesting for us.

harveyrendell pushed a commit to pushpay/redash that referenced this pull request Nov 14, 2019
kombu is a dependency of Celery and usually we let them declare the version, but their version is pinned and the most recent release (4.6.4) has a severe regression where workers stop responding to inspect commands.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants