
Less reliance on global_id #188

Closed · wants to merge 6 commits

Conversation

@benlangfeld (Collaborator) commented Dec 3, 2018

These are backward-incompatible changes. This change will break Discourse, which uses the global_id for some sort of dashboard, but I'm pretty sure it could use the message_id instead, since it's only interested in one particular channel.

See individual commit messages for details of the motivation here. More specifically, this is to support removal of the ID counter keys for Redis Streams support, so that we can use the native automatic IDs that Streams provide. In that case we cannot have a circular dependency between the global and channel-specific storage; we must publish to the channel-specific storage first to obtain the ID, which is then included as a pointer in the global storage.

Also moves some message ID math down into the backend implementations so that the rest of message_bus is less coupled to sequential integer IDs.
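
To make that ordering concrete, here's a minimal sketch using redis-rb (the key names are illustrative, not the actual storage layout):

```ruby
require "redis"

redis = Redis.new

# Channel-specific insert happens first; with Streams, XADD assigns
# the native ID atomically and returns it.
message_id = redis.xadd("channel_backlog:/chat", { data: "hello" })

# Only now can the global storage reference that ID as a pointer,
# since the ID did not exist before the channel-level insert.
redis.xadd("global_backlog", { channel: "/chat", message_id: message_id })
```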

There's nothing useful they can do with it anyway.
…acklog

Will now report -1 any time a message comes directly from a specific channel. It is not possible for an arbitrary backend implementation to efficiently provide the `global_id` of messages fetched from channel-specific storage mechanisms.
@benlangfeld (Collaborator, Author)

Besides this initial impediment to using Streams' native IDs, we have the problem that they are not strictly numeric and are certainly not sequential integers, as message_bus currently assumes IDs to be.

Even with a complex representation such as Streams' native IDs, IDs can still be considered ordered. Math on Streams' IDs can be implemented quite easily for the simple cases where tests merely require "an ID greater than X". Measured look-back is also implemented in a backend-specific way: while on Redis Sorted Sets this must be done by calculating some ID range (based on the assumption of sequential integers), on Redis Streams (and others) it may be done by reading X messages from the stream sorted in reverse order.

My proposal is for message_bus to cease guaranteeing that message IDs are sequential integers (though in practice, on some backends, they might be), and to provide opaque IDs over the network (which may be defined as Strings) and complex IDs (a MessageBus::MessageID) to Ruby client code, which implement comparison operators but are not guaranteed to be strictly numerical.
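
As a rough sketch of what such a MessageBus::MessageID could look like (the Streams-style "timestamp-counter" internals here are an assumption for illustration, not a final design):

```ruby
module MessageBus
  class MessageID
    include Comparable

    attr_reader :timestamp, :counter

    def initialize(timestamp, counter)
      @timestamp = timestamp
      @counter = counter
    end

    # Parse the opaque wire format, e.g. "1543843200000-3".
    def self.from_string(str)
      ts, counter = str.split("-", 2)
      new(Integer(ts), Integer(counter))
    end

    # Ordered first by timestamp, then by counter, matching how
    # Redis Streams orders its native IDs.
    def <=>(other)
      [timestamp, counter] <=> [other.timestamp, other.counter]
    end

    def to_s
      "#{timestamp}-#{counter}"
    end
  end
end

a = MessageBus::MessageID.from_string("1543843200000-2")
b = MessageBus::MessageID.from_string("1543843200000-3")
b > a # => true
```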

@SamSaffron What do you think?

@benlangfeld (Collaborator, Author)

Actually, there are other places where IDs are assumed to be sequential integers, independent of the backlog in use:

I believe these can be moved to the backend, and I'm working on that now.

Also probably improves performance by fetching full backlogs less often.
Subscribing from some future point is unnecessary and heavy to implement. See ae0d7cd#commitcomment-31541983
@SamSaffron (Member)

Hmm, how do we do firehose global server-side consumers without a global ID? How do we reliably catch up globally?

Can Redis do stream publishing in Lua and then come up with some guarantees of both IDs lining up and messages having a consistent global and local ID?

Regarding breaking the API to support other types of identifiers, I am not in principle against it, but it is an enormous and painful change if we are to amend the APIs, and I would like to avoid that. So stuff like subscribe('/channel', -5) should still work even post a change like this, even though the API is not the cleanest in the world. The tricky thing is... what if your ID happens to be -5 in some backend? I definitely don't want to be wrapping every ID Redis and PG send back in some sort of MessageID class; it feels very wasteful to be allocating all this stuff, but maybe the Streams implementation can simply return objects that "behave" like numbers so we can keep the old behaviour working just as it did and not introduce new objects.

Regarding pushing stuff that ought to be in the backend into the backend, I support this, BUT we should be careful here with regard to copy-pasting this block of code 3 times; if the base backend needs to implement it and then Redis Streams is stuck overriding, I guess that is OK.

There is one huge fundamental problem here, though: I think you need to do some rough numbers on how much faster Streams is going to be. If after we end up doing all this work we end up with a backend that is no faster than what we have today, this ends up being a pure "intellectual" exercise vs something generally useful. We've got to get numbers here.

@benlangfeld (Collaborator, Author)

> Hmm, how do we do firehose global server-side consumers without a global ID? How do we reliably catch up globally?

I'm not proposing removing the concept of global IDs, but simply not exposing them to clients subscribed to specific channels. With the changes here, global IDs are not exposed to JavaScript clients; they are only exposed to server-side application code when acting on the global backlog (MessageBus.backlog(nil, some_id), MessageBus.blocking_subscribe(nil)) and are still used internally for server subscriptions.
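
For example (a sketch; `some_global_id` is just a placeholder for a previously-obtained global ID):

```ruby
require "message_bus"

some_global_id = 0 # placeholder; would come from an earlier global read

# Acting on the global backlog: one of the few places where global IDs
# would remain visible to server-side application code.
MessageBus.backlog(nil, some_global_id).each do |msg|
  puts "#{msg.channel}: global_id=#{msg.global_id}"
end

# Firehose consumption also still sees global IDs.
# (Note: this blocks the current thread.)
MessageBus.blocking_subscribe(nil) do |msg|
  puts "#{msg.channel}: global_id=#{msg.global_id}"
end
```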

> Can Redis do stream publishing in Lua and then come up with some guarantees of both IDs lining up and messages having a consistent global and local ID?

No. Insertion into the stream is an atomic operation which internally assigns an ID (in the format [timestamp]-[counter]) and returns it. We thus don't know the ID of the message until it's in the stream.
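
For example, with redis-rb (stream name illustrative):

```ruby
require "redis"

redis = Redis.new

# XADD with the default "*" ID lets Redis assign the ID atomically;
# we only learn it from the return value, once the entry is already
# in the stream.
id = redis.xadd("some_stream", { payload: "hello" })
puts id # => e.g. "1543843200000-0", i.e. [timestamp]-[counter]
```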

> Regarding breaking the API to support other types of identifiers, I am not in principle against it, but it is an enormous and painful change if we are to amend the APIs, and I would like to avoid that. So stuff like subscribe('/channel', -5) should still work even post a change like this, even though the API is not the cleanest in the world. The tricky thing is... what if your ID happens to be -5 in some backend? I definitely don't want to be wrapping every ID Redis and PG send back in some sort of MessageID class; it feels very wasteful to be allocating all this stuff, but maybe the Streams implementation can simply return objects that "behave" like numbers so we can keep the old behaviour working just as it did and not introduce new objects.

If we define the IDs to be backend-specific types (which can all be converted to strings for the wire format), then Redis and PG can continue using Integers, while Streams can use its native ID strings. I'm pretty sure we can make subscribe('/channel', -5) specifically work regardless of the ID type, but IDs would not be portable between backends, and one is not guaranteed to be able to do math on IDs in JavaScript.
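
A hypothetical sketch of how a Streams backend could honour subscribe('/channel', -5) without integer math (key name illustrative):

```ruby
require "redis"

redis = Redis.new

# Rather than computing "last ID minus 5", read the last 5 entries
# of the channel's stream in reverse order.
last_five = redis.xrevrange("channel_backlog:/channel", "+", "-", count: 5)

# Entries come back newest-first as [id, fields] pairs; the oldest of
# them is the point to replay from.
resume_from = last_five.last&.first
```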

> Regarding pushing stuff that ought to be in the backend into the backend, I support this, BUT we should be careful here with regard to copy-pasting this block of code 3 times; if the base backend needs to implement it and then Redis Streams is stuck overriding, I guess that is OK.

Yeah, I managed to mostly avoid that; check out the later commits on this branch.

> There is one huge fundamental problem here, though: I think you need to do some rough numbers on how much faster Streams is going to be. If after we end up doing all this work we end up with a backend that is no faster than what we have today, this ends up being a pure "intellectual" exercise vs something generally useful. We've got to get numbers here.

Yes, I'm concerned about that also. If you want to hold off on merging this until Streams has some numbers, that's fine, but unfortunately this is all pretty necessary for performant usage of Streams (see #173 (comment); Streams' native IDs solve this problem).

@SamSaffron (Member)

> Yes, I'm concerned about that also. If you want to hold off on merging this until Streams has some numbers, that's fine, but unfortunately this is all pretty necessary for performant usage of Streams

Understood, but we can do some basics to test the underlying structures, in particular comparing native implementations of both patterns in a standalone Ruby program. The Lua script captures the whole publish part.
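
Something along these lines, perhaps (a rough standalone sketch; the Lua here is a simplification of the current publish script, not the real one):

```ruby
require "benchmark"
require "redis"

redis = Redis.new

# Simplified stand-in for the current publish path:
# increment a counter, then insert into a sorted-set backlog.
lua_publish = redis.script(:load, <<~LUA)
  local id = redis.call('INCR', KEYS[1])
  redis.call('ZADD', KEYS[2], id, ARGV[1])
  return id
LUA

n = 10_000
Benchmark.bm(12) do |x|
  x.report("lua publish") do
    n.times { |i| redis.evalsha(lua_publish, keys: ["bench_id", "bench_backlog"], argv: ["payload #{i}"]) }
  end
  x.report("xadd") do
    n.times { |i| redis.xadd("bench_stream", { data: "payload #{i}" }) }
  end
end
```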

Once that is done, maybe you do the refactoring in the Redis Streams branch until you get to the state where you can do a full benchmark?

@benlangfeld (Collaborator, Author)

> If we make this change we need to rework the consumers in Discourse ... in particular...

Discourse only cares about the global ID here: https://github.com/discourse/discourse/blob/96168cb3c6b06e2bc8683300b532790ae5a9666f/lib/site_setting_extension.rb#L314, though I am not sure why it's using the global_id and not the message_id; I don't know anything about Discourse, so perhaps you could shed some light on that, @SamSaffron? It was introduced in discourse/discourse@a2cca25#diff-ed2fbbcb9305eeb6e1536cf4413a641cR118.

> Say we are subscribed to site settings and then someone does redis-cli flush; we want to make sure our consumer can deal with this.

Indeed, there are tests for that in message_bus' test suite.

> I am somewhat concerned about removing this because I think there was some real-world issue this circumvented

Honestly I cannot imagine what it might be.

> Understood, but we can do some basics to test the underlying structures, in particular comparing native implementations of both patterns in a standalone Ruby program. The Lua script captures the whole publish part.

Yes, I will do this.

When client code tries to subscribe to a channel using a `last_id` that doesn't exist on that channel, we raise an exception.

See ae0d7cd#commitcomment-31544627 for details.
@ZogStriP (Member)

Thanks @benlangfeld for all the hard work 👍

@SamSaffron do we still want this?

@SamSaffron (Member)

Since we are kind of giving up on Redis Streams for now, I am shelving this. The change is just too big and risky.

@SamSaffron closed this Sep 30, 2020