Consider using gRPC as an externally exposed API #271
Comments
Are you suggesting a gRPC service that uses the C++ lib as an alternative to the WebSocket service? |
When we started the project, gRPC was not available yet, so we went with a custom protocol. I think it would be great to offer a gRPC-based interface for better integration, though I would see that as an additional layer, like the WebSocket proxy (which can run embedded in the broker or as a separate component). One of the primary goals of the custom binary protocol we came up with was to have the client establish a "session" (either producer or consumer) attached to a topic, perform authentication, and then publish/receive messages as fast as possible, with flow control as the guard rail. E.g., we don't want to perform auth on every message, or to specify the topic name each time (which in many cases can be as long as the data itself). So mixing the "session" with RPC seems a bit complicated. Guaranteeing ordering would also be challenging, as there would be no relation between different publish requests on the same topic. Having said that, I'd be really happy to have a gRPC-based proxy service. Contributions welcome! 😄 It may also be interesting to offer the same interface (or at least a significant portion of it) as GCP Pub/Sub: https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1 |
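To make the "session" argument concrete, here is a minimal Python sketch. It is not Pulsar's actual protocol or client code: the `Session` class, the token check, and the helper functions are all hypothetical, and the topic name is only an example. It shows topic and credentials being sent once when the session opens, after which each publish carries only a sequence id and the payload.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical producer session: topic and token are sent once, at open time."""
    topic: str
    token: str
    authenticated: bool = False
    next_sequence_id: int = 0
    sent: list = field(default_factory=list)

    def open(self, valid_tokens: set) -> None:
        # Authentication happens a single time, when the session is established.
        if self.token not in valid_tokens:
            raise PermissionError("authentication failed")
        self.authenticated = True

    def publish(self, payload: bytes) -> int:
        # Each publish carries only a sequence id and the payload,
        # not the topic name or the credentials.
        if not self.authenticated:
            raise RuntimeError("session is not open")
        seq = self.next_sequence_id
        self.next_sequence_id += 1
        self.sent.append((seq, payload))
        return seq


def per_message_overhead(topic: str, token: str, payloads: list) -> int:
    """Bytes of topic + token repeated if every request were self-contained."""
    return len(payloads) * (len(topic.encode()) + len(token.encode()))


def session_overhead(topic: str, token: str) -> int:
    """Bytes of topic + token when they are sent once per session."""
    return len(topic.encode()) + len(token.encode())


if __name__ == "__main__":
    session = Session("persistent://tenant/ns/my-topic", "secret")
    session.open({"secret"})
    for body in (b"a", b"b", b"c"):
        session.publish(body)
    print(per_message_overhead(session.topic, session.token, [b"a", b"b", b"c"]))
    print(session_overhead(session.topic, session.token))
```

With small payloads and long topic names, the repeated per-request metadata can easily outweigh the data, which is the concern raised above.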
Having gRPC as an alternative to the Web Socket service would be awesome. gRPC has built-in support for bidirectional streams - so basically it could be seen as a session (it's using HTTP/2 streams), no? Guaranteeing the message order shouldn't be a problem then. I think it's also important to know if pulsar is going to loosen the message ordering constraints (at least for such an "additional" layer) because it would make things easier (for use cases where ordering is not important - in the same way as GCP pub/sub doesn't guarantee any order). |
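A rough sketch of why a single bidirectional stream keeps publishes ordered: the handler below takes an iterator of requests and yields one reply per request, which is similar in shape to how stream-in/stream-out handlers are written in gRPC's Python API. It does not use the gRPC library, and the function name, the tuple layout, and the "out_of_order" reply are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple


def ack_in_order(requests: Iterable[Tuple[int, bytes]]) -> Iterator[Tuple[int, str]]:
    """Stand-in for a stream handler: one reply per request, in arrival order."""
    expected = 0
    for sequence_id, _payload in requests:
        if sequence_id != expected:
            # Requests on one stream arrive in the order they were written,
            # so a gap here would point at a client-side bug.
            yield (sequence_id, "out_of_order")
            continue
        expected += 1
        yield (sequence_id, "ack")


if __name__ == "__main__":
    stream = [(0, b"a"), (1, b"b"), (2, b"c")]
    for reply in ack_in_order(stream):
        print(reply)
```

Because all publishes for the session travel on the same stream, the server sees them in write order, unlike independent unary RPCs, which have no relation to one another.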
Agreed. I believe ordering is not a problem with gRPC bidirectional streaming. It is actually much easier, and fun, to use gRPC bidirectional streaming. I think the most interesting piece here is to add a GCP Pub/Sub proxy with the gRPC protocol.
In the context of a "shared" subscription, the message ordering constraints are already relaxed. That said, you can use |
Whole clusters run on gRPC, e.g. k8s + Istio, and it's well supported by proxies like Envoy. You could think about clients talking to their users' topics directly, authenticated at the ingress, e.g. by Envoy filters (see https://www.envoyproxy.io/docs/envoy/latest/configuration/http_filters/grpc_web_filter for ideas). As for GCP Pub/Sub, I'm not convinced. Making one thing look like another is very often a large effort and a bad fit once you look into the details. When Google changes the API, do you follow? |
I would love to see this! I can understand about pub/sub. It really doesn't serve the same purpose as Pulsar or any distributed log. Different use cases. |
+1. Supporting gRPC would give access to a lot more clients, integration with reactive frameworks (like RxJava or Reactor), provide application-level flow control, etc... |
gRPC can establish a session for bidirectional streaming! So IMO it could totally be used as the base protocol for Pulsar. The definition would look something like:
service Pulsar {
  rpc exchange(stream BaseCommand) returns (stream BaseCommand);
}
Instead of passing auth info via
That said, I have started the work on a gRPC proxy. Consumption and production are working. I need to clean it up, then I'll do a PR. |
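For anyone wondering what a single `exchange` stream of command envelopes might look like on the server side, here is a small Python sketch. It is only an illustration: the `Command` dataclass is a simplified stand-in for a BaseCommand-style message, the handler is a plain generator rather than real gRPC code, and the type tags and the "ERROR" reply are assumptions, not the proxy's actual behaviour.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Command:
    """Simplified stand-in for a BaseCommand-style envelope: a type tag plus optional fields."""
    type: str
    topic: Optional[str] = None
    payload: Optional[bytes] = None


def exchange(commands: Iterable[Command]) -> Iterator[Command]:
    """Dispatch each incoming command on its type tag; state lives for the whole stream."""
    connected = False
    topic = None
    for cmd in commands:
        if cmd.type == "CONNECT":
            connected = True
            yield Command("CONNECTED")
        elif not connected:
            # Nothing else is accepted before the stream has connected.
            yield Command("ERROR")
        elif cmd.type == "PRODUCER":
            topic = cmd.topic  # the topic is bound once for the stream
            yield Command("PRODUCER_SUCCESS", topic=topic)
        elif cmd.type == "SEND" and topic is not None:
            yield Command("SEND_RECEIPT", topic=topic)
        else:
            yield Command("ERROR")


if __name__ == "__main__":
    incoming = [
        Command("CONNECT"),
        Command("PRODUCER", topic="my-topic"),
        Command("SEND", payload=b"hello"),
    ]
    print([reply.type for reply in exchange(incoming)])
```

The connection state and the topic binding live in the handler's local variables for the lifetime of the stream, which is what makes the stream behave like a session.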
And if I'm not mistaken BookKeeper uses gRPC internally, so it would be coherent to make it the base protocol in Pulsar also. |
Look forward to your PR.
gRPC is only used for BookKeeper's table service; the ledger service still uses a custom protocol. But I agree with you: gRPC has a very rich ecosystem, and it is a good direction to go in general. |
@cbornet The reason we haven't used gRPC is that it wasn't available when we started, so we went with a custom protocol over Protocol Buffers. After that, migrating the internal protocol was a big step. |
Yes. That's a very good reason indeed. Maybe in the future 😄 . I can understand there are bigger priorities. |
Any progress likely on this ? |
@mickdelaney Pulsar has a Python client, and a dotnet client is under development. Does that meet your requirements? Or is gRPC your preferred option? |
Hi, So we use Kafka at the moment. Confluent provides dotnet & Python clients, based on librdkafka which The reality is that it's very expensive to maintain all these language drivers, so you get differences and features that are still coming down the line. For example, the schema/Avro support for Kafka varies significantly across languages, Java being very different from, say, C#. So teams using these drivers have to rely on different semantics and create different approaches to dealing with things like schemas, which increases costs. You also have to think about the teams providing the drivers, and the costs they have in maintaining them. It's not easy. So if there's a possibility that gRPC will fit the semantics of Pulsar's protocol, it seems to me that it's a win for everyone; the Pulsar team in particular can focus their attention on making the gRPC layer first class. Thanks... |
We have a real-time messaging system implemented in Erlang, and are looking at Pulsar as a pub/sub / queue message broker. Unfortunately, that means implementing our own client lib, with tons of features, on top of the binary / protobuf protocol. Having gRPC support would have helped greatly. |
@mickdelaney @TC-oKozlov thank you for your input. Just to understand the requirements a bit more: are you expecting a gRPC-based proxy, or the Pulsar broker protocol exposed in gRPC? These lead to two different approaches. A gRPC-based proxy means providing a much simpler protocol than the current broker protocol, so it is easy to have gRPC clients in different languages. But it has its own limitations and drawbacks, such as another network hop, and some features might be hard to support. Exposing the Pulsar broker protocol in gRPC can solve the problem of handling wire-level request and response encoding and decoding. However, the challenge of implementing a Pulsar client is not about wire-level encoding and decoding. It is more about the logic within a Pulsar client, such as flow control, topic lookup, and error handling. So we would still face the same challenges that the current Pulsar client faces. It is probably even worse than implementing a language client wrapper around the Pulsar C/C++ client, because implementing a wrapper is much simpler and less error-prone than re-implementing flow control, topic lookup, and error handling in each language. I would like to collect more requirements for gRPC to understand what the right approach is for solving the problem here. |
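As an illustration of the kind of client-side logic that remains no matter how the wire format is encoded, here is a small Python sketch of permit-style consumer flow control. It is not the actual Pulsar client implementation: the class name, the queue size, and the policy of refilling once half the queue has been consumed are assumptions chosen for the example.

```python
from collections import deque


class PermitFlowControl:
    """Hypothetical consumer-side flow control: the broker may push one message per permit."""

    def __init__(self, receiver_queue_size: int):
        self.receiver_queue_size = receiver_queue_size
        self.queue = deque()
        self.outstanding_permits = 0
        self.consumed_since_grant = 0

    def initial_grant(self) -> int:
        # On subscribe, grant enough permits to fill the local queue.
        self.outstanding_permits = self.receiver_queue_size
        return self.receiver_queue_size

    def on_message(self, msg) -> None:
        # Every pushed message uses up one permit.
        if self.outstanding_permits == 0:
            raise RuntimeError("broker sent a message without a permit")
        self.outstanding_permits -= 1
        self.queue.append(msg)

    def receive(self):
        """Return (message, permits_to_grant); refill once half the queue is consumed."""
        msg = self.queue.popleft()
        self.consumed_since_grant += 1
        grant = 0
        if self.consumed_since_grant >= max(1, self.receiver_queue_size // 2):
            grant = self.consumed_since_grant
            self.outstanding_permits += grant
            self.consumed_since_grant = 0
        return msg, grant


if __name__ == "__main__":
    fc = PermitFlowControl(receiver_queue_size=4)
    fc.initial_grant()
    for i in range(4):
        fc.on_message(i)
    print(fc.receive())
    print(fc.receive())
```

Every language client would need an equivalent of this bookkeeping, plus topic lookup and error handling, whether the commands travel over the custom binary framing or over gRPC.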
I think moving to gRPC for the Pulsar clients would have some benefits. For instance, it already handles flow control and bidirectional streaming. For those who want to write native clients, that's one less layer to develop. |
@sijie thanks for the detailed feedback. I was thinking of the former, my thinking being that it would at least remove some of the concerns in maintaining the various language-level clients. |
Since v2.7.0 has been released, you can now use the gRPC protocol handler which implements PIP59. |
New pre-release with full transaction support : https://github.com/cbornet/pulsar-grpc/releases/tag/v1.0.0-20201206-rc |
@cbornet can you provide some guidance regarding how to use the grpc protocol? I only saw binary protocol in http://pulsar.apache.org/docs/en/develop-binary-protocol/ . |
Closed as answered by #271 (comment). New questions or issues can be created separately. |
(cherry picked from commit 5ed7dd1)
gRPC (http://grpc.io) has ready-made clients for Java, C++, Go, Python, etc., so Yahoo Pulsar would not need to reimplement efficient clients in every language. (The currently exposed WebSocket interface does not support all the methods provided by the protobuf-based protocol, has lower performance, and requires a separate WebSocket connection per topic publisher/consumer.)