-
Notifications
You must be signed in to change notification settings - Fork 439
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement resource based routing feature #2535
Conversation
Hi @AVaksman. Can you explain what the benefit and use-case of this is? Was it requested by the spanner team or something you believe would be useful? Happy new year! |
Requested by spanner team.
|
This PR is also uses FieldMask feature exposed in PR #2530 |
And a Happy New Year! |
Thanks @AVaksman, I've taken a few minutes to understand what we're doing here now. :) I have one rather large concern which we should discuss before fully reviewing. The implementation you have here will add 1 RPC call to each instantiation of the library. Since PHP doesn't have long-running processes and every request requires instantiating the library, that adds up to a lot of additional RPCs. I'd venture to guess that it would more than offset the gains from using a more optimized endpoint. We need to find a way to cache this across requests to prevent having to make API calls every time the client is created. edit: Also, I see the |
In the proposed implementation the instantiation of |
Right, but in high-traffic applications, if you have 1000 visitors, that means 1000 instantiations of the client and 1000 additional RPCs that this change would add. I find it unlikely that the benefits of this change outweigh the added cost. What we need is to cache the Here's what I'm thinking: Accept an additional configuration option on @dwsupplee do you have any thoughts here? |
This seems like a reasonable thing to implement. I agree with @jdpedrie the network overhead here might end up negating some of the usefulness of the feature, however. Using a cache seems reasonable, or I could also see providing an environment variable which hosts a list of endpoint uri's a user could determine and provide themselves. Whatever we land on, it is definitely desirable to avoid always needing to make this network request. Do these endpoints change regularly, and that necessitates the network request? Or can we reliably pre-determine them? |
Thanks @jdpedrie and @dwsupplee for your inputs here.
Let me try to understand the problem first. So you're saying that a customer using the client would instantiate a new client for every request and then close it at the end of the request, rather than keeping it around, therefore for every request, we would need to make two RPC calls. I'm a bit confused by this since I imagined a customer application would create a single client connection at the start of the application and then reuse it for serving multiple requests assuming that all the requests are made to the same database. I believe that's why we initialize a session pool with multiple sessions so you can serve concurrent requests. Let me know if I misunderstood something.
In any case, I'm not opposed to using a cache. That was actually the original design but some of the client lib owners for other languages said that a cache is unnecessary given that a client connection is only made at the start of an application so it's just one request to figure out the endpoint and then the client is reused for the lifetime of the application. But maybe the PHP implementation is different to those of other languages. If a cache makes more sense for PHP, then I agree that we should implement it. I don't think the actual endpoints will be changing frequently. But the design from the backend team is for the client to rely on what the backend provides as the endpoint rather than statically declaring the endpoints. So I don't think the idea of an environment variable would work here. We also don't want users to have to decide on which endpoint is appropriate because the backend is in the best position to decide that based on the location of their data. If users decide they want to use a particular endpoint, they can do so by just specifying it in the options when they init the client so no env variable is necessary in that case. |
Sorry, let me define terms here. In this case, I'm speaking of an HTTP request to a PHP application. Since PHP does not keep state across multiple HTTP requests, each time a user visits the website, we will construct a new PHP instance, along with all the objects (such as the Spanner client) it needs. That means that each user request will add one additional RPC to the total required for the program to communicate with Cloud Spanner. PHP is different from other languages like Java or Go in that programs written in those language stay alive and listen for incoming HTTP requests. PHP shuts down after each request, or at the very least, does not preserve state. So we couldn't keep the endpoints in memory across multiple requests as one could with a language with a different execution model. |
Thanks @jdpedrie, I understand the issue much better. In hindsight, I should've realized that. I talked to @AVaksman and decided on the following design:
I hope that makes sense. Let me know if you have any questions. @AVaksman will work on these changes over the next week. |
@@ -141,7 +141,9 @@ public function __construct(array $config = []) | |||
'projectIdRequired' => true | |||
]; | |||
|
|||
$this->connection = new Grpc($this->configureAuthentication($config)); | |||
$config['enableCaching'] = 'true' == strtolower(getenv('GOOGLE_CLOUD_ENABLE_RESOURCE_BASED_ROUTING')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be GOOGLE_CLOUD_SPANNER_ENABLE_RESOURCE_BASED_ROUTING
@@ -65,6 +66,17 @@ public function setUp() | |||
]); | |||
} | |||
|
|||
public function testResourceCachingEnvVar() | |||
{ | |||
$this->assertTrue(putenv("GOOGLE_CLOUD_ENABLE_RESOURCE_BASED_ROUTING=true")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be GOOGLE_CLOUD_SPANNER_ENABLE_RESOURCE_BASED_ROUTING
What's the status of this PR is relation to the recent deprecation of the field in question? |
|
I mostly agree but the backend team did ask us to keep the PRs open for now in case they change their minds. If it's ok, I'd like to keep it open for a couple of weeks and then I'll check back with them on whether it's fine to close it. But if you think having the PR is creating too much noise, we can close it and @AVaksman can push a new branch if we need to reopen. |
Sure, we can leave it open for the time being! |
We have decided to implement this functionality on the server side so we no longer need to add this support on the client side. |
Implement resource based routing.
endpointUri
SpannerClient
using instance specificendpointUri
asapiEndpoint
SpannerClient
(from pool) for all data calls of the said Instance.