Adding subscriptions #92
Why do we need additional configs? Can't we discover the channels from the GraphQL subscriptions? Let's say the client sends something like the document sketched below; this would tell PostGraphQL that a channel with that name exists.
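(A hypothetical illustration of that idea; the original comment's example was lost, so the field name `postsChanged` and the client setup below are invented for illustration, not an existing PostGraphQL API.)

import gql from 'graphql-tag';

// If the client registers a subscription like this, the server could in
// principle discover that a "postsChanged" channel is needed, with no
// extra configuration required.
const POSTS_CHANGED = gql`
  subscription {
    postsChanged {
      post {
        id
        title
      }
    }
  }
`;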
There is a bit of discussion and some examples for inspiration here.
Also, it could perhaps be beneficial to utilize a pre-made websocket implementation such as https://github.com/apollostack/subscriptions-transport-ws
Slightly unrelated, but only slightly: there's currently no way to send emails without manually setting up a microservice that subscribes to Postgres events. An alternative could be having PostGraphQL call out to hooked files...
I plan to be working on this feature as I'll be using this module in an upcoming project, which will need subscriptions. Have you had any thoughts on the implementation? I'm new to this codebase and would love to contribute a solution.
@RpprRoger Very open to you submitting a spec which we can iterate on before attempting to implement. I keep meaning to do this myself but am barely staying afloat!
I think a good approach to both #92 (comment) and #92 (comment) would be a straightforward way (maybe some sort of sanctioned npm module) to hook into pg events.
I like where this is going. Rather than a separate module, there could be postgraph_eventname events and an `events` folder with eventname.js files which get passed the notify payload, and that's it (see the sketch below).
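(A minimal sketch of that idea, assuming the proposed `events` folder convention; the file name, the payload shape, and the `sendEmail` helper are hypothetical, since this hook mechanism doesn't exist yet.)

// Hypothetical hook file: events/user_created.js
// PostGraphQL would require() this file and call it with the NOTIFY
// payload whenever a "postgraph_user_created" event fires.
module.exports = async function userCreated(rawPayload) {
  // The payload shape is whatever your trigger sends, e.g. JSON:
  const { email, name } = JSON.parse(rawPayload);
  await sendEmail({ // sendEmail is a placeholder for your mail transport
    to: email,
    subject: `Welcome, ${name}!`,
  });
};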
I would recommend against using LISTEN/NOTIFY for this. My first stab at a multi-vendor tool to stream events from a database is Lapidus (see my later comment). Logical decoding reads changes straight from the WAL, so it can stream every row change without triggers, and with a little care it also lets us emit events that are not "rows".
We should try to shed load away from the database at all costs, so nchan seems like a good fit here. To protect against unauthenticated DDoS attacks you really don't want to be doing auth against the database, so putting NGINX/OpenResty in front and using Redis for auth/sessions and pub/sub makes good sense in production. (nchan does not require Redis, but it can use it to scale out and persist the configurable message queues.) This is a lot to cover in a summary, but I assure you it checks the boxes if you can deal with the dependencies. I feel like building all of this into PostGraphQL would be feature bloat and unwise.
Has anybody tried to build some sort of proof of concept? Do people prefer the solution based on NOTIFY or the one based on logical decoding? I'm a big fan of this general idea; it would make this project really versatile for all sorts of use-cases.
Would love to see this happen. I'm working on a realtime POC from Postgres -> wherever, and subscriptions would really be the killer feature to make it happen.
Given the stated limitations of NOTIFY/LISTEN, especially the need to scale out to multiple databases, it might be wise to implement this as @jmealo described. That said, it would likely be best as a "Subscriptions" graphile-build plugin, so it could be disabled if desired.
My current plan is actually to use both. The standard CRUD events would use logical decoding, but you'd also be able to use listen/notify to trigger custom events. These custom events cannot be auto-detected, so my plan is to have a generic API for them. I've also considered letting the client hook into this generic API so it can trigger events too; I've not 100% figured out the security requirements around that yet (e.g. to prevent DoS), or what the payloads would be (at the moment I'm considering null payloads, so you'd just go in via "query"). I have notes on paper but nothing solid at this point.
(Also, I plan to require Redis as the pub/sub mechanism and to effectively relay notifications through it from the DB listen/notify, as sketched below; would this cause anyone issues? I was thinking that if a Redis server was not specified we'd simply disable subscriptions. This also allows us to add caching and cache-eviction in the future.)
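(A minimal sketch of such a relay, assuming the `pg` and `ioredis` npm packages; the channel names are placeholders, not the eventual PostGraphile design.)

const { Client } = require('pg');
const Redis = require('ioredis');

async function relay() {
  const pg = new Client();   // connection details from PG* environment variables
  const redis = new Redis(); // localhost:6379 by default
  await pg.connect();

  // Re-publish every Postgres NOTIFY into Redis pub/sub, so any number of
  // PostGraphile instances can subscribe without each one holding its own
  // LISTEN connection to the database.
  pg.on('notification', ({ channel, payload }) => {
    redis.publish(`postgraphile:${channel}`, payload || '');
  });

  await pg.query('LISTEN postgraphile_event');
}

relay().catch(console.error);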
I would prefer a solution without a Redis server. Systems with fewer dependencies are always better.
I think it is a good idea to use Redis, as it is the standard for GraphQL subscriptions right now and works well for custom servers. Trying to run without Redis or something similar will potentially make such a solution very hard to scale.
@MrLoh @ansarizafar @benjie @chadfurman @simplesteph @arnold-graf @JesperWe @RpprRoger: Would you be interested in forming a working group regarding real-time event streaming from PostgreSQL? (Admittedly, I'd like this to be "frontend" agnostic, meaning it will work with postgraphql, postgrest, or your own APIs.)

@ansarizafar: I went the Lapidus/Socket.io (tiny sidecar service) route, and a lot of the things I was doing to avoid Redis created a tremendous amount of WAL writes and "churn" (this is bad, especially on IO-constrained VMs). I applaud keeping dependencies and complexity down, but it's only superficially simpler. I had a Koa-based API that had all "lookup tables" cached in memory. Sure, everything was super fast, but... you'll find your caching code starts to approach the size of your application code, and you'll say to yourself "wow, all of this to avoid Redis?!" ... YMMV.

I have done a ton of R&D and experiments weighing the pros and cons of adding dependencies. I'd really like some feedback on what kind of messaging guarantees you want and how you handle auth/sessions/caching. I'm not as familiar with GraphQL subscriptions as I am with "all you can eat firehose" streaming from your database, but I'd assume the same mechanics apply.

TL;DR: for production usage, don't put your sessions/auth/pub-sub in PostgreSQL; it will not allow you to scale features independently or protect your production data. You'll be stuck in "throw hardware at it" or refactor mode. PG10 lets you shard/replicate, so you could manage a bunch of separate PostgreSQL servers... or you could just add Redis now and have a polyglot stack where you're not backed into a corner✝.

✝ If this is a small project and security is not an issue and you want to keep things simple, disregard this advice :). Do not over-engineer. If you plan on using some Go, Node, Python, or Rust and making a stack out of open source software, I definitely think it's worth doing your sessions and auth outside of your application stack so you have flexibility moving forward.
If you do an internal subrequest in NGINX it doesn't have the overhead of a network connection. Managing pub/sub, auth and sessions from within NGINX really helps keep latency and resource usage down.
If you keep sessions and auth there, you can use a security definer function and prevent any privilege escalation (even with arbitrary SQL queries!) with a properly configured database and RLS rules. The nice thing is that if your stack isn't monolithic, or it's polyglot, you can use a single real-time, sessions, and auth mechanism separate from your app code (it's really great; it just feels right).

If you want to bind SSR to a "materialized view" (pre-rendered web assets, not the database kind; though if you want automatically refreshing materialized views, I've done that too, but it requires the entire "table" to be rewritten, which will thrash your WAL, so it's better to write a routine that only rewrites the rows that have changed, which is actually trivial the way I've done it), these patterns make it so you never touch your database except to do mutations.

If you're in an event-only system and want to do CQRS, or you're into mostly append-only tables, this is as close as I could get us to developer nirvana. That said, if I had a bit more time, I believe you could do all of this in Tarantool and get really great performance too... However, my goal prior to discovering Tarantool was to release a Lapidus-type plugin for each language so you could keep your data in memory (whether that be Node, php-fpm, or some kind of shared memory slab).
I'm definitely interested in using a shared solution rather than reinventing the wheel! At the moment I don't have time to join a working group on this, though; my priority right now is to get v4 out.
What's the quickest path to have subscription support? Websockets in a plugin?
This looks quite relevant: https://github.com/socketio/socket.io-redis
We'd also want to make sure the session token is carried over: https://stackoverflow.com/questions/4753957/socket-io-authentication. This means we'd likely want to send our auth token as a cookie (header auth won't work with sessions). Might as well make it HttpOnly and SameSite while we're at it (if not by default, then with flags). There's a ticket for exactly this. I'm working on #501 next week and will send a PR with what I come up with; @benjie has been very helpful in guiding me there.
You can use socket.handshake.headers to extract the authorization header; websockets send standard HTTP headers as part of the negotiation before upgrading to the websocket protocol (see the sketch below). Cookies come with their own set of CSRF and similar issues, so if we go that way we will need to ensure those issues are addressed. (Not that JWTs aren't problematic! But I'm planning to deprecate them in v4 and replace them with a largely compatible yet more secure alternative. Still in the research phase with this.)
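(A sketch of that, assuming socket.io; `verifyToken` is a placeholder for whatever JWT/session validation is in use.)

const http = require('http');
const socketIo = require('socket.io');

const server = http.createServer();
const io = socketIo(server);

io.use((socket, next) => {
  // Standard HTTP headers are available from the upgrade negotiation:
  const auth = socket.handshake.headers['authorization'] || '';
  verifyToken(auth.replace(/^Bearer /, '')) // placeholder validation helper
    .then(user => {
      socket.user = user; // make the identity available to later handlers
      next();
    })
    .catch(() => next(new Error('unauthorized')));
});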
Cookies do not add risk of CSRF if implemented correctly: using SameSite policies and putting your API on a subdomain, for example. Moreover, HttpOnly cookies protect the JWT from XSS attacks. There are also double-submit tokens, which the API generates and the client submits twice: once in the body of the request, and once in the HttpOnly, SameSite signed cookie that the API set. This means an attacker triggering a CSRF could potentially include the double-submit token in the body of their request, but getting it into a signed, SameSite cookie is going to be very difficult.
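(A sketch of the double-submit check described above, assuming Express with cookie-parser; the route and the `csrfToken` names are illustrative.)

const express = require('express');
const cookieParser = require('cookie-parser');

const app = express();
app.use(express.json());
app.use(cookieParser(process.env.COOKIE_SECRET)); // secret enables req.signedCookies

app.post('/graphql', (req, res, next) => {
  // The token arrives twice: once in the request body, once in the signed
  // HttpOnly cookie the API set earlier. Both must be present and match.
  const fromBody = req.body.csrfToken;
  const fromCookie = req.signedCookies.csrfToken;
  if (!fromBody || fromBody !== fromCookie) {
    return res.status(403).json({ error: 'CSRF check failed' });
  }
  next(); // hand off to the GraphQL handler
});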
It's not uncommon to run APIs on a separate subdomain (or even domain) and allow multiple parties to interact with them; this is in part why we have the
The goal here is to have PostGraphile generate the "subscription" GraphQL schema as well as create and maintain the subscription channel. Moreover, we need a way for a trigger in the database to emit an event to the browser clients. For super-simple custom subscriptions, I'd be fine with pushing a simple eventName down a pipe and letting the client side fetch what it needs with a separate query after that. Certainly not where we want to end up, but it might be a quick and easy stop-gap until we get there.
Yeah, that's where I plan to start: just generic subscription infrastructure with a field name like "subscribe" that you can pass a string key to, and optionally a "publish" mutation that does the reverse for peer-to-peer pub/sub. I'm not sure how to limit the subscription messages to only the people who are meant to be able to see them; I was thinking of running the subscription name through a Postgres function at the time of subscription for a yes/no, and then revalidating periodically (at a user-defined period, which could be set to never). Another alternative I'm considering, and in fact leaning towards, is writing the subscriptions to a subscription table (id, topic_name, user_id, subscribed_since) such that the system (a trigger or whatever) could delete the row from the table, which effectively revokes access to that subscription (see the sketch below). That way a logout/kick from channel/password reset/unfriend could take immediate effect. Does anyone want to weigh in on this?
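(A sketch of the table-based check, assuming the `pg` package and the columns proposed above; the schema and table names are placeholders.)

// Before delivering an event on a topic, confirm the subscription row
// still exists; deleting the row (logout, kick, password reset) revokes
// the subscription immediately.
async function canDeliver(pgPool, topicName, userId) {
  const { rows } = await pgPool.query(
    `select 1
       from app_private.subscriptions
      where topic_name = $1
        and user_id = $2`,
    [topicName, userId]
  );
  return rows.length > 0;
}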
I like the table idea, which might also be able to help through RLS. I also feel a Postgres function to "direct" the message (on publish) would be useful: something custom so I can say "msg came from user 2 who is in room 3, so everyone in room 3 gets this message" or "msg came from admin, allow it" vs "non-admin said 'start' but is not authorized".
Potentially actively kill the websocket connections periodically (e.g. every 15 minutes or so) to force re-authentication.
Problem: what if the event is informing us that the node was deleted? I think in that case the event would need to come along with
Early on in one of my projects I was using RethinkDB changefeeds for realtime updates. I eventually went back to Postgres and decided to manually trigger subscription events via GraphQL mutations. I came across an interesting article where someone re-implemented changefeeds in Postgres and actually improved performance; there are links to the code in the article. I figured this might be helpful, as it seems like a proven and well-tested application of some of the ideas proposed here. I'll admit the implementation details are a little over my head, but I'm excited to see where this all ends up. Great work!
I haven't read this entire thread, so there might be duplicate information and/or ideas that have already been shot down.
I'll add more comments as they come to me.
@jmealo Do you have some links that I can read up on as to the reliability guarantees of pg_listen/pg_notify? I'm having trouble finding relevant info. I'd like to look over what's already out there before going through the source lol. Thanks! :)
Probably the best place to read about them is the docs. Postgres only sends notifications to currently connected clients; it does not cache them, so if a client loses its connection it will not receive the notifications sent while it was disconnected (see the reconnect sketch below).
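(A sketch of what that implies for clients, using the `pg` package: treat any reconnect as "notifications may have been missed" and re-fetch state rather than assume continuity. The channel name is assumed trusted; don't interpolate untrusted input into LISTEN.)

const { Client } = require('pg');

async function listenForever(channel, onEvent, onReconnect) {
  const client = new Client();
  await client.connect();
  client.on('notification', msg => onEvent(msg.payload));
  client.on('error', () => {
    // The connection dropped: any NOTIFYs sent in the meantime are gone
    // for good, so reconnect and then let the caller re-fetch state.
    listenForever(channel, onEvent, onReconnect).then(onReconnect);
  });
  await client.query(`LISTEN ${channel}`);
}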
Ah OK, so it's pertaining to Postgres behavior by design. I misinterpreted and thought the implication was that it sometimes failed to deliver notifications to actively connected clients.

My personal preference is bridging from Postgres to Redis or RabbitMQ to queue up the notifications. If going with Redis, then nchan would be a great option (provided that delivery is reliable). One potential issue is that nchan isn't really designed for clients to PUB/SUB to messages from Redis alone (they want you to use the HTTP API). A positive to this approach, though, is that it's integrated into NGINX, which means HA is easier because it's decoupled: if an upstream server like PostGraphile goes down, the client won't lose the WebSocket connection.

With respect to RabbitMQ, I'm actually working on a plugin that subscribes to notifications from Postgres and publishes them into a RabbitMQ exchange. I currently like this approach, but we'll see how it goes once I'm done with it and have had a chance to see the pros and cons.

I use PUB/SUB for both server-side task queues and client notifications. The former requires guarantees while the latter does not. I have both RabbitMQ and Redis clusters going, but the former has more flexibility and reliability for me (I don't want it to drop task events, but client events aren't mandatory), so I'm using that as the underlying service for message storage and delivery. Anyway, food for thought.
I'm not aware of any issues with LISTEN/NOTIFY other than those pointed out in the manual. I know in 2010 they were discussing issues where notifications were not sent across the whole cluster (only to the DB node you were connected to), but I've yet to determine whether this is still the case in 2018. Hopefully they've solved it by now, and since the manual doesn't seem to mention it (or at least I've not found a mention yet) I'm inclined to believe this is the case until proven otherwise. (I'm planning to test this for myself at some point.) Here's the work queue I use with PostgreSQL / PostGraphile; it uses LISTEN/NOTIFY to get near-instant notification of new jobs: https://gist.github.com/benjie/839740697f5a1c46ee8da98a1efac218
Yup, that mailing list thread was the only thing that popped up when I was Googling. I'm also going to talk to some of the committers tomorrow for further feedback. I've seen your gist before and it looks pretty good :) One issue for me, though, is that it requires extensions, which usually can't be installed onto hosted providers. The best article I've read thus far on using Postgres as a job queue digs into the innards of the implementation and shows benchmarks. I'm using Celery with RabbitMQ for my task queues and Redis for result storage and caching. I'm not sure what the best approach to PostGraphile subscriptions is going to end up being, but it should generally be independent of the storage system.
Both pgcrypto and uuid-ossp are pretty standard extensions available on all the Postgres hosting providers I've looked at (RDS, Heroku, Google, Azure, etc). Though, come to look at it, I don't think it actually needs pgcrypto, and it only uses uuid in one place: to generate a unique name for the queue if one is not given. Both extensions could be removed with a minimum of effort; thanks for the feedback! The post you reference is from 2015, before SKIP LOCKED (added in PostgreSQL 9.5) was available.
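(A sketch of the SKIP LOCKED pattern, assuming the `pg` package and a hypothetical `jobs` table with a `locked_at` column: each worker atomically claims one unclaimed job without blocking on its peers.)

const { Pool } = require('pg');
const pool = new Pool();

async function claimJob() {
  const { rows } = await pool.query(`
    update jobs
       set locked_at = now()
     where id = (
       select id from jobs
        where locked_at is null
        order by id
          for update skip locked
        limit 1
     )
    returning *
  `);
  return rows[0]; // undefined when no job is currently available
}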
@benjie SKIP LOCKED is a godsend; it helps reduce the complexity :) What's the general status on subscriptions thus far? I figure a quick status update would be good to have; the comments on this issue are getting pretty long and some are pretty dated, so it'll save some time. I'm open to contributing if I can, as it's something that would be useful for my own use-cases in the coming few months. I mentioned in #523 that I'm currently using STOMP for push events, but I'd love to use that as a secondary option and have the primary option be GraphQL subscriptions. I'm currently looking at potentially implementing a custom PubSub provider for graphql-subscriptions.
I'm thinking about using gun instead of socket.io for sending messages from the db to the browser and react-native. Does anyone have any experience with this?
I am a bit confused here -- from what I am reading, 'gun' IS a DB. It's a graph DB, to be exact, so I'm not sure what gun and postgraphile have to do with each other? You would use gun or postgraphile (or even both, if it makes sense depending on the data), but never gun WITH postgraphile? Am I missing something here?
I believe that the latest versions of PostGraphile and PostgreSQL now have the primitives required to implement this sensibly. I'm swamped at work right now, but I'm cobbling together a prototype; I'll share a simple solution should I find one.
Cool 👍 Chat with me on Discord about it and I'll help you make it a server plugin so you can work on it without forking 🤘
@benjie Is there any way I can use subscriptions with apollo-server? I am using postgraphile-apollo-server with the NestJS framework; it works great for queries and mutations, however I am not sure how to use subscriptions. I can write my own custom subscription, but I would like to do it via a postgraphile plugin.
@JeetChaudhari I don't have enough experience with Apollo Server to know for sure; but have you tried creating a Subscription extension with makeExtendSchemaPlugin? Maybe it Just Works ™️? https://www.graphile.org/postgraphile/make-extend-schema-plugin/

const { makeExtendSchemaPlugin, gql } = require("graphile-utils");

module.exports = makeExtendSchemaPlugin({
  typeDefs: gql`
    extend type Subscription {
      testSubscription: Int
    }
  `,
  resolvers: {
    Subscription: {
      testSubscription: {
        subscribe: () => {
          // return async iterator here
        },
        resolve: d => {
          console.log(d);
          return d;
        },
      },
    },
  },
});
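(For the "return async iterator here" placeholder, one option, an assumption on my part rather than anything prescribed above, is the in-memory PubSub from the graphql-subscriptions package.)

const { PubSub } = require("graphql-subscriptions");
const pubsub = new PubSub();

// Publish an incrementing integer once a second so the subscription has
// something to emit; a real plugin would publish from LISTEN/NOTIFY etc.
let n = 0;
setInterval(() => pubsub.publish("test", ++n), 1000);

// ...and inside the plugin above:
//   subscribe: () => pubsub.asyncIterator("test"),

Note the in-memory PubSub only spans a single server process; once you scale out you'd want a Redis-backed implementation instead.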
My current approach is to utilize Apollo schema stitching with link composition. I have a thin root GraphQL server that stitches together other upstream APIs (both GraphQL and REST). One of those servers is an Apollo GraphQL server dedicated to subscriptions. The root server uses link composition (a.k.a. splitting, sketched below) to detect whether the incoming request is a subscription and, if so, routes it to the upstream subscription server. Regular queries and mutations go to PostGraphile (or other relevant servers).
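(A sketch of that routing with apollo-link's split; the URIs are placeholders.)

const { split } = require('apollo-link');
const { HttpLink } = require('apollo-link-http');
const { WebSocketLink } = require('apollo-link-ws');
const { getMainDefinition } = require('apollo-utilities');

const httpLink = new HttpLink({ uri: 'https://api.example.com/graphql' });
const wsLink = new WebSocketLink({
  uri: 'wss://subscriptions.example.com/graphql',
  options: { reconnect: true },
});

// Subscriptions go to the dedicated subscription server over websockets;
// queries and mutations go to PostGraphile over HTTP.
const link = split(
  ({ query }) => {
    const def = getMainDefinition(query);
    return def.kind === 'OperationDefinition' && def.operation === 'subscription';
  },
  wsLink,
  httpLink
);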
@benjie I tried, but I always get an error; maybe there is something wrong with my implementation. I will give it a try with the example you provided and let you know. @xorander00 Thank you for showing the approach. I would try to get this done through the plugin, but if it takes too much time I will try the approach you provided.
@benjie Thank you, it worked like a charm! I was referring to https://github.com/nestjs/nest/blob/master/sample/12-graphql-apollo/src/cats/cats.resolvers.ts, and as in that example I wasn't writing a resolve method, only subscribe. Here is my test subscription plugin.
Super happy it worked for you! You may want to remove the console.log; I just added that to help you debug 👍
Super excited to announce that we just released 4.4.0-alpha.0, which includes OSS subscriptions and live queries support. To install, use the 4.4.0-alpha.0 release. I'd love to hear what you think! Come chat in our Discord: http://discord.gg/graphile
Thanks a lot for your hard work 🙂
Finally closing this, one of the oldest open issues in the repo, because 4.4.0 has been out for a while and seems to meet people's needs! 🎉
In #87 I laid out a timeline for PostGraphQL, but I forgot to add subscriptions to that timeline! Once we're done writing lots of documentation for PostGraphQL (step 2 in #87), let's implement subscriptions to make PostGraphQL a viable contender for realtime API needs.

I want to open this issue now because I'm not entirely sure how this would work, and I want help designing the feature. I know GraphQL JS has support for subscriptions, and PostgreSQL has a LISTEN/NOTIFY command for building simple pubsub systems. I'm just not sure how to wire them up. This is especially challenging because (as I understand it) LISTEN/NOTIFY is not statically typed, so events arrive without any type guarantees. To make things even more tricky, in the default configuration NOTIFY payloads must be shorter than 8000 bytes.

Here are my preliminary thoughts on implementation, but let's discuss them!

I've talked to people before who've implemented subscriptions in their GraphQL API, and they said that whenever their server would get an update they would rerun the full GraphQL query. Lee Byron also said Facebook did this, in the Reactiflux Q&A. So we'll do that. This means the 8000 bytes can be a Relay ID or something similar.

PostGraphQL would LISTEN to the postgraphql channel, where we'd expect information about types and a table primary key from NOTIFYs.

But we still need somewhere to define what kinds of subscriptions PostGraphQL knows about at startup time. Do we define these in a JSON file? In CLI arguments? Is there a way to add metadata to the PostgreSQL database itself? Do we create a postgraphql_subscriptions table and expect users to register subscriptions there? Or would a table like that be completely internal?

Thoughts?