redis-py doesn't play well with multiprocessing #496

Closed
namoopsoo opened this issue Jun 17, 2014 · 7 comments

Comments

@namoopsoo

I have been struggling with several python redis commands raising various ResponseError exceptions or returning incorrect values. In every case I have seen so far the failures are intermittent, and I have been using wrapper code to help capture some of the behavior, at https://gist.github.com/namoopsoo/b9f082e9eac025f7ec3b .

I have documented some initial symptoms at http://stackoverflow.com/questions/24168124/python-redis-responseerror-protocol-error-unbalanced-quotes-in-request .

As mentioned there, I am running Redis server 2.8.10 from Homebrew on Mac OS X, with:

(venvstage)$ pip freeze | grep redis
hiredis==0.1.3
redis==2.10.1

I do not have code available that someone could run to easily reproduce these errors, since I am just experiencing this in my personal code. But, if needed, I can try to provide more detail.

I have not ruled out that maybe my code is causing the crashes, but I wanted to post this up in case it rang a bell for someone else.

Another bit of captured output is at https://gist.github.com/namoopsoo/fea6157ad4197cf25f24 .

@andymccurdy
Contributor

It sounds like you're receiving responses out of order or that two separate contexts are somehow using the same underlying socket and/or buffer. I'd like to get some more information.

  • Did you start seeing these problems only after upgrading to redis-py 2.10.x? Have you tried running your app with an older version of redis-py, such as 2.9.x?
  • Is your app using gevent, eventlet, or threads for concurrency? If so, do the greenlets/threads use a single, shared StrictRedis instance?
  • What encoding/charset are you using? In your output above, it seems like you're not specifying one, which means you're using the default, utf-8. Are you specifying one in your application?

My gut instinct is that this has something to do with the "unbalanced quotes in request" error you mentioned in the stackoverflow post. That error comes from the Redis server and means that the client declared it was sending X arguments for the command, but the server didn't see that many when it tried to parse them. In that case, it seems possible that some arguments might stay in the server's socket buffer, and the next time that socket sends a request those leftover arguments are consumed as well, which could cause some of the behavior you're seeing.

That said, neither Redis nor redis-py should have any restrictions on the contents of key names. In fact, there are unit tests in redis-py that make sure it can use arbitrary binary data as a key name just fine. Is there a command you can execute with a given key name/arguments that consistently raises the "unbalanced quotes" error? I think that will be the key to tracking this down.
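
To make the "client says it is sending X arguments" part concrete: commands go over the wire in the Redis protocol, where the client first declares the argument count and then the byte length of each argument. The helper below is a simplified sketch of that framing, not the actual packing code inside redis-py:

def pack_command(*args):
    # Simplified RESP framing: declare the argument count (*N), then
    # send each argument prefixed with its byte length ($len).
    parts = [b'*%d\r\n' % len(args)]
    for arg in args:
        if isinstance(arg, str):
            arg = arg.encode('utf-8')
        parts.append(b'$%d\r\n%s\r\n' % (len(arg), arg))
    return b''.join(parts)

# pack_command('HSET', 'myhash', 'field', 'value') ->
# b'*4\r\n$4\r\nHSET\r\n$6\r\nmyhash\r\n$5\r\nfield\r\n$5\r\nvalue\r\n'

If two contexts interleave writes like these on the same socket, the server can end up parsing the tail of one request as the start of another, which would explain symptoms like the "unknown command '2014-06-17'" error.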

@namoopsoo
Author

Thanks for the reply, Andy. I just did a comparison between redis-py 2.9.0 and 2.10.1. I am seeing basically the same results with both versions.

As for concurrency, I may well be doing something that is not supported, and I can't wait to change things around to see what happens.

I have a complex system with a few daemons running, but here is a slightly simplified version of what happens in one server.py:

# server.py
import redis

from myprocess import MyProcess

rdb_cnxn = redis.ConnectionPool(host='localhost', port=6379, db=1)
rdb = redis.StrictRedis(connection_pool=rdb_cnxn)

rdb2_cnxn = redis.ConnectionPool(host='localhost', port=6379, db=2)
rdb2 = redis.StrictRedis(connection_pool=rdb2_cnxn)

workers = [MyProcess(), MyProcess()]

for worker in workers:
    worker.start()

while True:
    # do stuff, using hgets, hsets on rdb, rdb2, ...
    pass

And

# myprocess.py
from multiprocessing import Process

class MyProcess(Process):
    ...
    # An instance of MyProcess inherits its own copy of rdb and rdb2
    # when start() is issued for the process.
    def run(self):
        # do stuff, using hgets, hsets on rdb, rdb2, ...
        pass

In fact, all of the ResponseError exceptions and other oddities I was talking about show up in sys.stderr only for the myprocess.py code, and never for the server.py code.

I will make some changes so myprocess.py uses its own brand new StrictRedis object to see if that helps.
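
Concretely, the change I have in mind looks something like this rough sketch (same hypothetical localhost connection settings as above), with the clients built inside run(), i.e. after the fork, so each worker gets its own sockets:

# myprocess.py (planned change, sketch)
from multiprocessing import Process

import redis

class MyProcess(Process):
    def run(self):
        # Build the clients in the child process, after the fork,
        # so this worker gets its own connection pools and sockets.
        rdb = redis.StrictRedis(host='localhost', port=6379, db=1)
        rdb2 = redis.StrictRedis(host='localhost', port=6379, db=2)
        # do stuff, using hgets, hsets on rdb, rdb2, ...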

@andymccurdy
Contributor

Ah, I haven't tested with the multiprocessing module. Please let me know how your testing goes. I bet this is something we can fix as well; I'll just have to look into how multiprocessing works behind the scenes.

@namoopsoo
Author

Well, I am happy to report that as long as each process uses fresh StrictRedis instances, rather than ones inherited from the parent via multiprocessing.Process, everything works fine. Such a relief.

I actually did not intend to use it that way; it's just that I was trying to hide the StrictRedis() initialization a bit in a separate module, with something like:

# storage.py
import redis

rdb = None
rdb_is_setup = False

def setup_rdb():
    global rdb, rdb_is_setup
    if not rdb_is_setup:
        rdb_cnxn = redis.ConnectionPool(host='localhost', port=6379, db=1)
        rdb = redis.StrictRedis(connection_pool=rdb_cnxn)
        rdb_is_setup = True

And I unintentionally ended up re-using rdb objects that had been created by another process.

But now I think the least error-prone way to use this is with a context manager, just like with any other socket.
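
For what it's worth, another way to keep the lazy setup in storage.py without sharing a client across a fork would be to cache it per PID. A sketch, using the same hypothetical connection settings as above:

# storage.py (per-process sketch)
import os

import redis

_rdb = None
_rdb_pid = None

def get_rdb():
    global _rdb, _rdb_pid
    # Rebuild the client on the first call in the current process,
    # e.g. after a fork copied a stale client from the parent.
    if _rdb is None or _rdb_pid != os.getpid():
        pool = redis.ConnectionPool(host='localhost', port=6379, db=1)
        _rdb = redis.StrictRedis(connection_pool=pool)
        _rdb_pid = os.getpid()
    return _rdb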

@andymccurdy
Contributor

Good to hear. I'll take a look at the multiprocessing module and see if I can get things to play nicer with it.

@andymccurdy andymccurdy changed the title Getting 1L from lrange() intermittently; also "unknown command '2014-06-17'" from hgetall() redis-py doesn't play well with multiprocessing Jul 2, 2014
@bmerry
Contributor

bmerry commented Jul 14, 2017

Is there any update on this? I'm also using redis with multiprocessing, and while I haven't yet hit any issues, I'd sleep better knowing that it was safe.

From what I can see, ConnectionPool will reset itself if it detects that the PID has changed, which ought to address the multiprocessing case. On the other hand, that was already the case with 2.10.1, so presumably there is something else that led to this bug.
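
The pattern I mean is roughly the following; this is a simplified illustration of PID-checking in a pool, not redis-py's actual ConnectionPool code:

import os

class PidAwarePool:
    def __init__(self):
        self.pid = os.getpid()
        self._available = []

    def get_connection(self):
        # After a fork the child sees a different PID; drop connections
        # inherited from the parent instead of reusing their sockets.
        if self.pid != os.getpid():
            self._available = []
            self.pid = os.getpid()
        if self._available:
            return self._available.pop()
        return self._make_connection()

    def _make_connection(self):
        # Placeholder: open a real socket to the Redis server here.
        raise NotImplementedError

    def release(self, connection):
        # Return a connection for reuse within this process.
        self._available.append(connection)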

@andymccurdy
Contributor

3.2.0 fixes the issues with forked processes. Sorry for the long wait.
