[feature] NATS engine idea: Presence and History supported via the new NATS Jetstream KV store #568

Open
gedw99 opened this issue Oct 18, 2022 · 6 comments

Comments

@gedw99

gedw99 commented Oct 18, 2022

Is your feature request related to a problem? Please describe.

I have an existing NATS JetStream project in production and want to use Centrifugo as a gateway for web and mobile clients.

Describe the solution you'd like.

What would the feature look like? How would it work? How would it change the API?

https://centrifugal.dev/docs/server/engines#nats-broker

We could use NATS KV for presence and history, I think. I am already using NATS KV and it's quite stable.

https://docs.nats.io/nats-concepts/jetstream/key-value-store

  • also has history !
  • also has watch

https://docs.nats.io/nats-concepts/jetstream/obj_store/obj_walkthrough

  • maybe possible to patch JSON inside it for a list of things mapped back to a key.

...

@gedw99
Author

gedw99 commented Oct 18, 2022

Notes from Telegram chat:

Yep, we have in-memory, in-Redis and in-Tarantool implementations for those at this point. Presence and Broker/History are described by these interfaces:

They both have some semantics behind them. Personally I don't see a clear path to implement those with NATS K/V, but maybe it's possible.

@gedw99 gedw99 changed the title [feature] NATS engine with Presense and History supported via the new NATS Jetstream KV store [feature] NATS engine idea: Presence and History supported via the new NATS Jetstream KV store Oct 18, 2022
@FZambia
Member

FZambia commented Oct 18, 2022

@gedw99, to summarize what we started to discuss in Telegram: I don't see a clear path to match Centrifugo semantics with Nats K/V for both history and presence. There are some good things in Nats K/V: like in-memory storage option and possibility to load key history – but it's not enough for a reasonable Centrifugo engine. There are some considerations which do not fit well in my opinion:

  • For history: we can't publish to NATS and save the message to NATS K/V atomically while also properly incrementing publication offsets. Or maybe we can, but this would require a distributed lock.
  • For presence: I am not sure how to load presence, given that NATS K/V does not allow querying many keys by prefix.
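
To make the presence concern concrete, here is a minimal sketch of one possible workaround: keep all members of a channel under a single key and update it with a revision-checked write, so that loading presence is one key read instead of a prefix query. The `kvStore` type is a hypothetical in-memory stand-in written for this illustration, not the NATS client API, and the key layout `presence.<channel>` is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// errRevisionMismatch is returned when a conditional update loses a race.
var errRevisionMismatch = errors.New("revision mismatch")

type kvEntry struct {
	value    []byte
	revision uint64
}

// kvStore is a hypothetical in-memory stand-in for a K/V bucket that offers
// only single-key reads and revision-checked writes (no prefix queries).
type kvStore struct {
	mu      sync.Mutex
	entries map[string]kvEntry
}

func newKVStore() *kvStore { return &kvStore{entries: map[string]kvEntry{}} }

// get returns the value and revision for key; revision 0 means "absent".
func (s *kvStore) get(key string) ([]byte, uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := s.entries[key]
	return e.value, e.revision
}

// update writes value only if the stored revision still equals expected.
func (s *kvStore) update(key string, value []byte, expected uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.entries[key].revision != expected {
		return errRevisionMismatch
	}
	s.entries[key] = kvEntry{value: value, revision: expected + 1}
	return nil
}

// addPresence stores clientID -> userID under one key per channel and
// retries when a concurrent writer bumps the revision first.
func addPresence(s *kvStore, channel, clientID, userID string) error {
	key := "presence." + channel
	for {
		raw, rev := s.get(key)
		members := map[string]string{}
		if rev != 0 {
			if err := json.Unmarshal(raw, &members); err != nil {
				return err
			}
		}
		members[clientID] = userID
		out, err := json.Marshal(members)
		if err != nil {
			return err
		}
		err = s.update(key, out, rev)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errRevisionMismatch) {
			return err
		}
	}
}

// presence loads every member of a channel with a single key read.
func presence(s *kvStore, channel string) (map[string]string, error) {
	members := map[string]string{}
	raw, rev := s.get("presence." + channel)
	if rev == 0 {
		return members, nil
	}
	err := json.Unmarshal(raw, &members)
	return members, err
}

func main() {
	s := newKVStore()
	var wg sync.WaitGroup
	for i := 0; i < 20; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = addPresence(s, "chat", fmt.Sprintf("client-%02d", i), "user")
		}(i)
	}
	wg.Wait()
	members, _ := presence(s, "chat")
	fmt.Println(len(members)) // prints 20
}
```

The sketch leaves out presence expiration and removal, and every join rewrites the whole member list, so a busy channel would see many retries. That cost is part of why the efficiency question above remains open.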

I think it's possible to build something that works, since K/V is a concept generic enough to create any kind of storage. But whether this is an efficient and desirable approach given the requirements of the current Centrifugo Broker and Presence interfaces, I'm not sure. I tend to be skeptical at this moment, as both points above seem very critical to me.

In Centrifuge library (https://github.com/centrifugal/centrifuge) Broker and PresenceManager are both pluggable entities – so theoretically you can try building POCs and keep implementations outside of the library core. I would be very interested to look and find out I was wrong in some assumptions – I always wanted to integrate with Nats more closely and spent enough time thinking about the possibility to inherit Jetstream storage for history somehow.

Ideally I'd like Centrifugo to be a NATS consumer only and allow users to publish messages to JetStream instead of Centrifugo. So Centrifugo would be just a transport/permission proxy in front of NATS. Streams in JetStream already have offsets, but the missing part here is how to achieve reliable message recovery. JetStream streams do not have an admin API to iterate over a stream, which is something Centrifugo currently needs to recover messages missed by an offline client. I asked about this in JetStream before, BTW (nats-io/jetstream#266); unfortunately it was not implemented.
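
For readers unfamiliar with the recovery requirement, the following is a rough sketch of what "iterate over a stream from an offset" means. It is an in-memory illustration written for this thread, not Centrifugo's actual implementation: the `channelStream` type and its method names are hypothetical, and details such as stream epochs and TTLs are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// publication is a message with its position in the channel stream.
type publication struct {
	offset uint64
	data   string
}

// channelStream is a hypothetical bounded history stream for one channel.
type channelStream struct {
	mu   sync.Mutex
	top  uint64 // offset of the latest publication
	size int    // maximum number of retained publications
	pubs []publication
}

func newChannelStream(size int) *channelStream {
	return &channelStream{size: size}
}

// add appends a publication, assigns the next offset and evicts old entries.
// Offset assignment and storage happen under one lock, which is the
// atomicity that is hard to get across two separate systems.
func (s *channelStream) add(data string) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.top++
	s.pubs = append(s.pubs, publication{offset: s.top, data: data})
	if len(s.pubs) > s.size {
		s.pubs = s.pubs[len(s.pubs)-s.size:]
	}
	return s.top
}

// recoverSince returns publications with offset > since. The boolean is
// false when the client position cannot be recovered, for example because
// some missed publications were already evicted.
func (s *channelStream) recoverSince(since uint64) ([]publication, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if since > s.top {
		return nil, false // client claims a position ahead of the stream
	}
	if since == s.top {
		return nil, true // nothing missed
	}
	if len(s.pubs) == 0 || s.pubs[0].offset > since+1 {
		return nil, false // gap between client position and retained history
	}
	var missed []publication
	for _, p := range s.pubs {
		if p.offset > since {
			missed = append(missed, p)
		}
	}
	return missed, true
}

func main() {
	s := newChannelStream(3)
	for i := 1; i <= 5; i++ {
		s.add(fmt.Sprintf("message %d", i))
	}
	missed, ok := s.recoverSince(3)
	fmt.Println(len(missed), ok) // prints: 2 true
	_, ok = s.recoverSince(1)
	fmt.Println(ok) // prints: false
}
```

A reconnecting client sends the last offset it saw, and the server needs random access into retained history to answer. That is the operation the comment above says is missing from the JetStream admin API.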

I think that theoretically we could introduce a mode in which we create an individual subscription for each client connection in namespaces with history ON, so that we could manage offsets individually. This would replace what we currently do: creating a Hub with channel subscribers, trying to efficiently broadcast messages to them, and using recovery based on the possibility to iterate over the channel stream and synchronize stream position to compensate for the at-most-once nature of Redis PUB/SUB. But it's not efficient, since we would create a NATS consumer on the Centrifugo side for each client and for each channel. It would require MUCH more RAM and CPU, I'm afraid, so it will be hard to scale towards many connections and many active channels. In this perspective having Centrifugo and just publishing messages from separate/external JetStream consumers towards the Centrifugo server API seems a more natural way to integrate.
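
The scaling argument can be illustrated with simple counting. The sketch below compares the number of broker-side subscriptions in the two modes for one node; the function names and the sample workload (10,000 clients on three shared channels) are made up for illustration only.

```go
package main

import "fmt"

// hubSubscriptions counts broker subscriptions in the hub model: one per
// distinct channel that has at least one subscriber on this node.
func hubSubscriptions(subs map[string][]string) int {
	channels := map[string]struct{}{}
	for _, chs := range subs {
		for _, ch := range chs {
			channels[ch] = struct{}{}
		}
	}
	return len(channels)
}

// perClientConsumers counts consumers in the per-client model: one for
// every (client, channel) pair.
func perClientConsumers(subs map[string][]string) int {
	total := 0
	for _, chs := range subs {
		total += len(chs)
	}
	return total
}

func main() {
	// Hypothetical workload: client ID -> subscribed channels.
	subs := map[string][]string{}
	for i := 0; i < 10000; i++ {
		subs[fmt.Sprintf("client-%d", i)] = []string{"news", "sports", "weather"}
	}
	fmt.Println(hubSubscriptions(subs))   // prints 3
	fmt.Println(perClientConsumers(subs)) // prints 30000
}
```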

@gedw99
Author

gedw99 commented Nov 21, 2022

Some of the things you bring up are already done in Dendrite to some degree.

https://nats.io/blog/matrix-dendrite-kafka-to-nats/

https://matrix-org.github.io/dendrite/

https://github.com/matrix-org/dendrite

Dendrite is the server for Matrix and it uses NATS JetStream to provide real-time chat and other things. There is a lot there.

Its external clients don't use NATS, so they also have a Centrifugo-like system doing that layer.
Pinecone is one of their proxy layers. It's a p2p-style proxy.

It's worth looking at. The code is very clear and well done.

I am on mobile and will have a deeper look at all the aspects you raised in the following days. But I just wanted to mention Dendrite in case you don't know it!

@gedw99
Author

gedw99 commented Nov 24, 2022

@FZambia, you came to the conclusion that:

“In this perspective having Centrifugo and just publishing messages from separate/external Jetstream consumers towards Centrifugo server API seems a more natural way to integrate”

Sounds like a good approach!

But then what about the security problems? NATS has its own security and Centrifugo also has its own security.

Is there a way to turn off all security in Centrifugo? This could make it work, since only NATS would be sending messages to Centrifugo.

@FZambia
Member

FZambia commented Nov 27, 2022

Is there a way to turn off all security in Centrifugo? This could make it work, since only NATS would be sending messages to Centrifugo.

It depends on which part of security you mean. I suppose the server API will be used for publishing messages from NATS to Centrifugo. The Centrifugo server API is protected by an API key, and it's possible to disable the API key check by setting the api_insecure option to true. In this case the API endpoint should not be exposed to the public internet, of course. The server API endpoint runs on an internal port by default in our Helm chart, BTW. And we currently suggest protecting it with firewall rules if possible.

But still, Centrifugo checks permissions when a client connection subscribes to channels. It's possible to allow every authenticated client to subscribe to any channel using namespace options. But actually Centrifugo has many ways to subscribe: https://centrifugal.dev/blog/2022/07/29/101-way-to-subscribe

Centrifugo also checks authentication of the client connection. A connection should come with a valid connection JWT, or Centrifugo may proxy the connect request to some backend to validate it. In most cases it's pretty important to know the user ID of client connections. But in general it's possible to generate a non-expiring JWT for an anonymous user and connect using it. Or use an option like https://centrifugal.dev/docs/server/configuration#allow_anonymous_connect_without_token so that Centrifugo will treat client connections as anonymous user connections.

@gedw99
Author

gedw99 commented Dec 1, 2022

That's really helpful, thanks. You have broken it down into easy layers where I could integrate.

  1. API key check. I could write a provider to do that against NATS.

  2. Auth. I could write a provider to check auth against NATS. NATS is JWT-based anyway too.

  3. Authorisation. I could write a provider that does this against NATS. NATS holds all this info anyway. It's just a matter of making the NATS channels, I think.
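
As a starting point for item 2, here is a rough sketch of a backend endpoint that Centrifugo's connect proxy could call. It assumes the proxy is configured to forward the client's `Authorization` header, that a successful reply has the shape `{"result": {"user": "<id>"}}`, and that a rejection can be expressed as `{"error": {"code": ..., "message": ...}}`. The `tokenVerifier` callback stands in for a real check against NATS credentials and is entirely hypothetical, as are the endpoint path and the error code used.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type connectResult struct {
	User string `json:"user"`
}

type proxyError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

type connectResponse struct {
	Result *connectResult `json:"result,omitempty"`
	Error  *proxyError    `json:"error,omitempty"`
}

// tokenVerifier is a hypothetical hook: validate a token (for example a
// NATS user JWT) and return the user ID it belongs to.
type tokenVerifier func(token string) (userID string, ok bool)

// connectProxyHandler answers connect proxy requests using verify.
func connectProxyHandler(verify tokenVerifier) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
		resp := connectResponse{}
		if userID, ok := verify(token); ok {
			resp.Result = &connectResult{User: userID}
		} else {
			// Illustrative custom error code, not a value taken from this thread.
			resp.Error = &proxyError{Code: 1000, Message: "unauthorized"}
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(resp)
	})
}

// callProxy simulates a connect proxy request and returns the reply body.
func callProxy(h http.Handler, authorization string) string {
	body := strings.NewReader(`{"client":"c1","transport":"websocket"}`)
	req := httptest.NewRequest(http.MethodPost, "/centrifugo/connect", body)
	if authorization != "" {
		req.Header.Set("Authorization", authorization)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return strings.TrimSpace(rec.Body.String())
}

func main() {
	verify := func(token string) (string, bool) {
		if token == "valid-nats-user-jwt" { // placeholder check
			return "42", true
		}
		return "", false
	}
	h := connectProxyHandler(verify)
	fmt.Println(callProxy(h, "Bearer valid-nats-user-jwt")) // {"result":{"user":"42"}}
	fmt.Println(callProxy(h, "Bearer nope"))                // {"error":{"code":1000,"message":"unauthorized"}}
}
```

Item 3 (authorisation) could follow the same pattern with a subscribe proxy endpoint that checks the requested channel against NATS permissions.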

I need to allocate some time to check this out.

2 participants