feat: add pooled grpc transport #748

Merged
merged 134 commits into from
Apr 24, 2023
Changes from 92 commits
8c5fd7d
moved previous client to deprecated, added files for new client
daniel-sanche Mar 3, 2023
393860c
added table api skeleton
daniel-sanche Mar 3, 2023
11b3493
fixed missing files
daniel-sanche Mar 3, 2023
1ac0b7a
updated import paths
daniel-sanche Mar 3, 2023
cf981d8
updated unit tests
daniel-sanche Mar 3, 2023
5ea2bc3
updated system tests
daniel-sanche Mar 3, 2023
84fd9c3
ran blacken
daniel-sanche Mar 3, 2023
43b17dd
added additional unimplemented files
daniel-sanche Mar 3, 2023
32e8e45
improved __init__
daniel-sanche Mar 3, 2023
65cb219
stricter type checks
daniel-sanche Mar 3, 2023
c8b8a5a
ran blacken
daniel-sanche Mar 3, 2023
ff835ec
removed sample implementation from BigtableExceptionGroup
daniel-sanche Mar 3, 2023
3207ef5
fixed circular import issues
daniel-sanche Mar 4, 2023
3b3d720
added deprecation warning
daniel-sanche Mar 4, 2023
fa29ba1
updated warning messages
daniel-sanche Mar 4, 2023
4e792a5
fixed lint issues
daniel-sanche Mar 4, 2023
35e8a58
added submodule for gapic fork
daniel-sanche Mar 6, 2023
dfab801
pulled in upstream proto updates
daniel-sanche Mar 6, 2023
72f7d0e
added pooled transport class
daniel-sanche Mar 6, 2023
be3de7a
added functions to get and replace channels in pool
daniel-sanche Mar 6, 2023
f6c7f36
added client init implementation
daniel-sanche Mar 6, 2023
44e76c1
added channel management to table object
daniel-sanche Mar 6, 2023
c4d537e
refactoring channel management
daniel-sanche Mar 7, 2023
99a49a4
ping new channel before replacement
daniel-sanche Mar 7, 2023
bd4fb5e
made channel pool public
daniel-sanche Mar 7, 2023
6b13c29
moved channel refresh logic into shared client
daniel-sanche Mar 8, 2023
60841c5
call ping and warm on register_instance
daniel-sanche Mar 8, 2023
0a3086c
fixed typo
daniel-sanche Mar 8, 2023
08c3c42
renamed function
daniel-sanche Mar 8, 2023
05e10cd
added comments
daniel-sanche Mar 8, 2023
895093f
removed TypeAlis annotation
daniel-sanche Mar 8, 2023
1684274
changed sequence import
daniel-sanche Mar 8, 2023
75d276a
updated warning tests
daniel-sanche Mar 8, 2023
92752c0
updated doc snippet imports
daniel-sanche Mar 8, 2023
79c82c3
updated docs
daniel-sanche Mar 8, 2023
f3b7fbd
disabled coverage for skeleton code
daniel-sanche Mar 8, 2023
aa37a31
fixed cover change
daniel-sanche Mar 8, 2023
17d731f
adjusted coverage setup
daniel-sanche Mar 9, 2023
a0a5c57
ran blacken
daniel-sanche Mar 9, 2023
64a05d8
changed cover value in noxfile
daniel-sanche Mar 9, 2023
741147d
updated fork
daniel-sanche Mar 9, 2023
005900c
added pool transport to tests
daniel-sanche Mar 9, 2023
d58fc74
Merge branch 'fresh-client' into add_new_transport
daniel-sanche Mar 9, 2023
9983e18
fixed issues in tests
daniel-sanche Mar 9, 2023
bfeb546
got gapic tests passing
daniel-sanche Mar 9, 2023
dba7a3c
reworked the client to take instance at init
daniel-sanche Mar 10, 2023
3ae6722
moved background setup back into client init, with warning if no async
daniel-sanche Mar 11, 2023
b9f2b0d
improved comment
daniel-sanche Mar 11, 2023
e483370
Merge branch 'v3' into add_new_transport
daniel-sanche Mar 14, 2023
136a8fe
Merge branch 'main' into v3
daniel-sanche Mar 15, 2023
af86f6b
Merge branch 'v3' into add_new_transport
daniel-sanche Mar 15, 2023
cbe7062
moved instance tracking back into Table
daniel-sanche Mar 16, 2023
909f889
fixed lint issues
daniel-sanche Mar 16, 2023
5efa0ac
fixed blacken
daniel-sanche Mar 16, 2023
2b322e2
Merge branch 'v3' into add_new_transport
daniel-sanche Mar 23, 2023
d4904d7
changed generation arguments
daniel-sanche Mar 23, 2023
6917593
improved close functionality
daniel-sanche Mar 24, 2023
7b5ecbb
updated submodule
daniel-sanche Mar 24, 2023
c0616dd
reverted some generated changes
daniel-sanche Mar 24, 2023
983d4c7
reverted to protoc generation
daniel-sanche Mar 24, 2023
c19658a
got tests passing
daniel-sanche Mar 24, 2023
c2d0da0
added new test file for client
daniel-sanche Mar 25, 2023
8b54a30
set up transport in client
daniel-sanche Mar 25, 2023
0dd981b
implemented tests for underlying transport
daniel-sanche Mar 25, 2023
b0ecd3c
added some tests
daniel-sanche Mar 25, 2023
e47551f
added manage channels tests
daniel-sanche Mar 28, 2023
96d526b
added more tests
daniel-sanche Mar 28, 2023
e997892
client needs active event loop; reorganized tests around that
daniel-sanche Mar 28, 2023
5c86f57
reordered some things
daniel-sanche Mar 28, 2023
3bc4131
simpified client setup
daniel-sanche Mar 28, 2023
3c4e0b6
added test for veneer headers
daniel-sanche Mar 28, 2023
197bf95
added super init test
daniel-sanche Mar 28, 2023
9d8122b
added subclass generator to gapic template
daniel-sanche Mar 28, 2023
2632b70
finished table tests
daniel-sanche Mar 28, 2023
d4e052b
ran blacken
daniel-sanche Mar 28, 2023
1aa694b
got tests working
daniel-sanche Mar 28, 2023
e2d4bd5
fixed type
daniel-sanche Mar 28, 2023
a91362f
reverted rest client
daniel-sanche Mar 28, 2023
d80a8c0
fixed rest tests
daniel-sanche Mar 28, 2023
b888ee8
converted tests to pytest
daniel-sanche Mar 29, 2023
94c1187
added client closure
daniel-sanche Mar 29, 2023
4c02e6c
ran blacken
daniel-sanche Mar 29, 2023
d65b432
use paramaterize in tests
daniel-sanche Mar 29, 2023
4b63d87
improved some tests
daniel-sanche Mar 29, 2023
4ccc421
went back to init without event loop raising warning
daniel-sanche Mar 29, 2023
8001240
removed __del__
daniel-sanche Mar 29, 2023
7c9cea7
changed warning type
daniel-sanche Mar 29, 2023
3bbebea
ran blacken
daniel-sanche Mar 29, 2023
8bff9d0
improved task naming
daniel-sanche Mar 29, 2023
4ae2146
fixed style issues
daniel-sanche Mar 29, 2023
b9dc2f7
got 3.7 tests working
daniel-sanche Mar 30, 2023
19036d8
fixed style issue
daniel-sanche Mar 30, 2023
8a22d15
implement pool as custom grpc channel
daniel-sanche Mar 31, 2023
38e5662
did some restructuring
daniel-sanche Apr 1, 2023
5155800
got some tests working
daniel-sanche Apr 1, 2023
522f7fa
improved tests
daniel-sanche Apr 2, 2023
74029c9
updated template
daniel-sanche Apr 3, 2023
7f2be30
got tests passing
daniel-sanche Apr 3, 2023
2b044ce
removed metadata
daniel-sanche Apr 4, 2023
1743098
added sleep between swwapping and closing channels
daniel-sanche Apr 4, 2023
e5fa4b6
ran blacken
daniel-sanche Apr 4, 2023
8955ec5
got tests working
daniel-sanche Apr 4, 2023
002bc5f
fixed lint issue
daniel-sanche Apr 4, 2023
65f0d2f
fixed tests
daniel-sanche Apr 4, 2023
dbf19c9
holds a gapic client instead of inherits from it
daniel-sanche Apr 5, 2023
9f3e0c5
added comment
daniel-sanche Apr 5, 2023
a0620ea
added random noise to refresh intervals
daniel-sanche Apr 5, 2023
1486d5a
Merge branch 'v3' into add_new_transport
daniel-sanche Apr 6, 2023
28d5a7a
fixed lint issues
daniel-sanche Apr 6, 2023
70fbff9
reduced size of template by making subclass
daniel-sanche Apr 7, 2023
383d8eb
reverted unintentional gapic generation changes
daniel-sanche Apr 7, 2023
018fe03
updated submodule
daniel-sanche Apr 7, 2023
f0403e7
changed warning stack level
daniel-sanche Apr 13, 2023
cb23d32
update docstring
daniel-sanche Apr 19, 2023
bc31ab8
update docstring
daniel-sanche Apr 19, 2023
f54dfde
fix typo
daniel-sanche Apr 19, 2023
46cfc49
docstring improvements
daniel-sanche Apr 19, 2023
573bbd1
made creating table outside loop into error
daniel-sanche Apr 19, 2023
4f2657d
make tables own active instances, and remove instances when tables close
daniel-sanche Apr 19, 2023
59955be
added pool_size and channels as public properties
daniel-sanche Apr 19, 2023
377a8c9
fixed typo
daniel-sanche Apr 19, 2023
8a29898
simplified pooled multicallable
daniel-sanche Apr 20, 2023
50aa5ba
ran blacken
daniel-sanche Apr 20, 2023
42a52a3
associate ids with instances, instead of Table objects
daniel-sanche Apr 20, 2023
50dc608
reverted pooled multicallable changes
daniel-sanche Apr 20, 2023
b116755
pass scopes to created channels
daniel-sanche Apr 21, 2023
ec5eb07
added basic ping system test
daniel-sanche Apr 21, 2023
55cdcc2
keep both the names and ids in table object
daniel-sanche Apr 21, 2023
9e3b411
pull project details out of env vars
daniel-sanche Apr 21, 2023
ab43138
restructured test_client
daniel-sanche Apr 21, 2023
cb1884d
changed how random is mocked
daniel-sanche Apr 21, 2023
9a89d74
ran black
daniel-sanche Apr 21, 2023
1e62c71
added system test code to create instance if not present
daniel-sanche Apr 24, 2023
d70c685
ran black
daniel-sanche Apr 24, 2023
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "gapic-generator-fork"]
path = gapic-generator-fork
url = git@github.com:googleapis/gapic-generator-python.git
1 change: 1 addition & 0 deletions gapic-generator-fork
Submodule gapic-generator-fork added at 1a5660
268 changes: 258 additions & 10 deletions google/cloud/bigtable/client.py
@@ -15,12 +15,28 @@

from __future__ import annotations

from typing import Any, AsyncIterable, TYPE_CHECKING
from typing import cast, Any, Optional, AsyncIterable, Set, TYPE_CHECKING

from google.cloud.client import ClientWithProject
import asyncio
import grpc
import time
import warnings
import sys

from google.cloud.bigtable_v2.services.bigtable.client import BigtableClientMeta
from google.cloud.bigtable_v2.services.bigtable.async_client import BigtableAsyncClient
from google.cloud.bigtable_v2.services.bigtable.async_client import DEFAULT_CLIENT_INFO
from google.cloud.bigtable_v2.services.bigtable.transports.pooled_grpc_asyncio import (
PooledBigtableGrpcAsyncIOTransport,
)
from google.cloud.client import _ClientProjectMixin
from google.api_core.exceptions import GoogleAPICallError


import google.auth.credentials
import google.auth._default
from google.api_core import client_options as client_options_lib


if TYPE_CHECKING:
from google.cloud.bigtable.mutations import Mutation, BulkMutationsEntry
@@ -32,7 +48,7 @@
from google.cloud.bigtable.read_modify_write_rules import ReadModifyWriteRule


class BigtableDataClient(ClientWithProject):
class BigtableDataClient(BigtableAsyncClient, _ClientProjectMixin):
def __init__(
self,
*,
@@ -47,6 +63,8 @@ def __init__(
"""
Create a client instance for the Bigtable Data API

Client must be created within an async run loop context

Args:
project: the project which the client acts on behalf of.
If not passed, falls back to the default inferred
@@ -62,29 +80,227 @@ def __init__(
Client options used to set user options
on the client. API Endpoint should be set through client_options.
metadata: a list of metadata headers to be attached to all calls with this client
Raises:
- RuntimeError if called outside of an async run loop context
- ValueError if pool_size is less than 1
"""
raise NotImplementedError
# set up transport in registry
transport_str = f"pooled_grpc_asyncio_{pool_size}"
transport = PooledBigtableGrpcAsyncIOTransport.with_fixed_size(pool_size)
BigtableClientMeta._transport_registry[transport_str] = transport
Comment on lines +87 to +89
Contributor:

Please avoid making the pool fixed size. It can have negative effects on customers with fluctuating traffic:

  • if the pool size is too small, RPCs will queue in the gRPC client when customer QPS spikes
  • if the pool size is too big for the current traffic, gRPC channels will go idle and add latency whenever an idle connection is picked

Contributor Author:

Do we have any examples of libraries that implement dynamic pooling like you're thinking? What sort of algorithm do you have in mind?

From what I saw, it seems the other libraries generally use fixed pool sizes, so I was hoping to keep it simple for now, and then revisit dynamic sizing after the core client is complete

Contributor:

You can take a look at the java implementation: https://github.com/googleapis/gapic-generator-java/blob/main/gax-java/gax-grpc/src/main/java/com/google/api/gax/grpc/ChannelPool.java#L248

However, I think it's OK to start with a fixed-size pool; just make it extensible so that we can make it dynamic in the future.
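The "fixed size now, extensible later" idea from this thread could be sketched roughly as follows. This is an illustration only, not the transport added in this PR: `RoundRobinPool`, `next_member`, and `resize` are hypothetical names, and the members are arbitrary objects rather than real gRPC channels.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class RoundRobinPool(Generic[T]):
    """Hypothetical pool that hands out its members in round-robin order."""

    def __init__(self, factory: Callable[[], T], size: int) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        self._factory = factory
        self._members: List[T] = [factory() for _ in range(size)]
        self._next = 0

    def __len__(self) -> int:
        return len(self._members)

    def next_member(self) -> T:
        # pick the next member, wrapping around at the end of the list
        member = self._members[self._next % len(self._members)]
        self._next += 1
        return member

    def resize(self, new_size: int) -> List[T]:
        """Grow or shrink the pool.

        Returns the removed members so the caller can close them
        after a grace period.
        """
        if new_size < 1:
            raise ValueError("new_size must be at least 1")
        removed = self._members[new_size:]
        del self._members[new_size:]
        while len(self._members) < new_size:
            self._members.append(self._factory())
        return removed
```

A fixed-size pool simply never calls `resize`; a later dynamic policy could call it from a background task without changing how callers pick a member.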

# set up client info headers for veneer library
client_info = DEFAULT_CLIENT_INFO
client_info.client_library_version = client_info.gapic_version
# parse client options
if type(client_options) is dict:
client_options = client_options_lib.from_dict(client_options)
client_options = cast(
Optional[client_options_lib.ClientOptions], client_options
)
mixin_args = {"project": project, "credentials": credentials}
# support google-api-core <=1.5.0, which does not have credentials
if "credentials" not in _ClientProjectMixin.__init__.__code__.co_varnames:
mixin_args.pop("credentials")
# initialize client
_ClientProjectMixin.__init__(self, **mixin_args)
# raises RuntimeError if called outside of an async run loop context
BigtableAsyncClient.__init__(
self,
transport=transport_str,
credentials=credentials,
client_options=client_options,
client_info=client_info,
)
self.metadata = metadata or []
# keep track of active instances to warm up on channel refresh
self._active_instances: Set[str] = set()
# attempt to start background tasks
self._channel_init_time = time.time()
self._channel_refresh_tasks: list[asyncio.Task[None]] = []
try:
self.start_background_channel_refresh()
Contributor:

It would be nice to wait until all of the channels are warm before letting end users use the client.

Contributor Author:

The problem is we don't have any instances to warm when the client is created, only when we start creating tables

except RuntimeError:
warnings.warn(
f"{self.__class__.__name__} should be started in an "
"asyncio event loop. Channel refresh will not be started",
RuntimeWarning,
)

def start_background_channel_refresh(self) -> None:
"""
Starts a background task to ping and warm each channel in the pool
Raises:
- RuntimeError if not called in an asyncio event loop
"""
if not self._channel_refresh_tasks:
# raise RuntimeError if there is no event loop
asyncio.get_running_loop()
for channel_idx in range(len(self.transport.channel_pool)):
refresh_task = asyncio.create_task(self._manage_channel(channel_idx))
if sys.version_info >= (3, 8):
refresh_task.set_name(
f"{self.__class__.__name__} channel refresh {channel_idx}"
)
self._channel_refresh_tasks.append(refresh_task)

@property
def transport(self) -> PooledBigtableGrpcAsyncIOTransport:
"""Returns the transport used by the client instance.
Returns:
BigtableTransport: The transport used by the client instance.
"""
return cast(PooledBigtableGrpcAsyncIOTransport, self._client.transport)
Contributor:

should this be exposed to end users?

Contributor Author:

This is part of the Client spec used by all GCP libraries. The reason I override it here was to change the type annotation
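Overriding a property only to narrow its return type might look like the sketch below. The class names are hypothetical stand-ins for the generated client and the pooled transport; `typing.cast` has no runtime effect and only informs static type checkers.

```python
from typing import cast


class BaseTransport:
    """Stand-in for a generic transport type."""


class PooledTransport(BaseTransport):
    """Stand-in for a transport that exposes pool-specific attributes."""

    pool_size = 3


class BaseClient:
    def __init__(self, transport: BaseTransport) -> None:
        self._transport = transport

    @property
    def transport(self) -> BaseTransport:
        return self._transport


class PooledClient(BaseClient):
    @property
    def transport(self) -> PooledTransport:
        # same object as in the base class; only the annotation is narrower
        return cast(PooledTransport, self._transport)
```

With the narrower annotation, a type checker accepts `client.transport.pool_size` without further casts at each call site.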


async def close(self, timeout: float = 2.0):
"""
Cancel all background tasks
"""
for task in self._channel_refresh_tasks:
task.cancel()
group = asyncio.gather(*self._channel_refresh_tasks, return_exceptions=True)
await asyncio.wait_for(group, timeout=timeout)
await self.transport.close()
self._channel_refresh_tasks = []

async def __aexit__(self, exc_type, exc_val, exc_tb):
"""
Cleanly close context manager on exit
"""
await self.close()
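The cancel-then-gather pattern used by `close` can be tried in isolation. A minimal sketch, with a hypothetical `_worker` coroutine standing in for the channel refresh tasks:

```python
import asyncio
from typing import List


async def _worker() -> None:
    # stand-in for a long-running background task
    while True:
        await asyncio.sleep(3600)


async def shutdown(tasks: List["asyncio.Task[None]"], timeout: float = 2.0) -> list:
    # request cancellation of every task
    for task in tasks:
        task.cancel()
    # return_exceptions=True collects each CancelledError as a result
    # instead of raising it in the caller
    group = asyncio.gather(*tasks, return_exceptions=True)
    # bound the time spent waiting for the tasks to finish
    return await asyncio.wait_for(group, timeout=timeout)


async def demo() -> list:
    tasks = [asyncio.create_task(_worker()) for _ in range(3)]
    await asyncio.sleep(0)  # give the tasks a chance to start
    return await shutdown(tasks)
```

Running `asyncio.run(demo())` returns one `asyncio.CancelledError` instance per task, so the caller is not interrupted by the cancellations it requested.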

async def _ping_and_warm_instances(
self, channel: grpc.aio.Channel
) -> list[GoogleAPICallError | None]:
"""
Prepares the backend for requests on a channel

Pings each Bigtable instance registered in `_active_instances` on the client

Args:
channel: grpc channel to ping
Returns:
- sequence of results or exceptions from the ping requests
"""
ping_rpc = channel.unary_unary(
"/google.bigtable.v2.Bigtable/PingAndWarmChannel"
)
tasks = [ping_rpc({"name": n}) for n in self._active_instances]
return await asyncio.gather(*tasks, return_exceptions=True)
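With `return_exceptions=True`, `asyncio.gather` returns failures as values in the result list rather than raising the first one, which is why the return type above mixes exceptions and `None`. A minimal sketch, where `fake_ping` is a hypothetical stand-in for the real ping RPC:

```python
import asyncio
from typing import List, Optional


async def fake_ping(name: str) -> None:
    # hypothetical stand-in for a ping-and-warm request
    if "bad" in name:
        raise RuntimeError(f"ping failed for {name}")
    return None


async def ping_all(names: List[str]) -> List[Optional[BaseException]]:
    # results come back in the same order as the awaitables passed in
    return await asyncio.gather(
        *(fake_ping(n) for n in names), return_exceptions=True
    )
```

One failing instance therefore does not prevent the other instances from being pinged; the caller can inspect the list and decide what to log or retry.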

async def _manage_channel(
self,
channel_idx: int,
refresh_interval: float = 60 * 45,
grace_period: float = 60 * 15,
) -> None:
"""
Background coroutine that periodically refreshes and warms a grpc channel

The backend will automatically close channels after 60 minutes, so
`refresh_interval` + `grace_period` should be < 60 minutes

Runs continuously until the client is closed

Args:
channel_idx: index of the channel in the transport's channel pool
refresh_interval: interval before initiating refresh process in seconds
grace_period: time to allow previous channel to serve existing
requests before closing, in seconds
"""
first_refresh = self._channel_init_time + refresh_interval
next_sleep = max(first_refresh - time.time(), 0)
if next_sleep > 0:
# warm the current channel immediately
channel = self.transport.channel_pool[channel_idx]
await self._ping_and_warm_instances(channel)
# continuously refresh the channel every `refresh_interval` seconds
while True:
await asyncio.sleep(next_sleep)
# prepare new channel for use
new_channel = self.transport.create_channel(
self.transport._host,
credentials=self.transport._credentials,
scopes=self.transport._scopes,
ssl_credentials=self.transport._ssl_channel_credentials,
quota_project_id=self.transport._quota_project_id,
options=[
("grpc.max_send_message_length", -1),
("grpc.max_receive_message_length", -1),
],
)
await self._ping_and_warm_instances(new_channel)
# cycle channel out of use, with long grace window before closure
start_timestamp = time.time()
await self.transport.replace_channel(channel_idx, grace_period, new_channel)
# subtract the time spent waiting for the channel to be replaced
next_sleep = refresh_interval - (time.time() - start_timestamp)
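The sleep arithmetic in `_manage_channel` can be worked through with numbers. This is a sketch under one assumption: that `replace_channel` returns only after the grace period has elapsed (its body is not shown in this diff). The helper names are hypothetical.

```python
def first_sleep(channel_init_time: float, refresh_interval: float, now: float) -> float:
    """Seconds to wait before the first refresh, never negative."""
    return max(channel_init_time + refresh_interval - now, 0)


def sleep_after_swap(refresh_interval: float, swap_duration: float) -> float:
    """Seconds to wait after a swap, counting the swap time toward the interval."""
    return refresh_interval - swap_duration


REFRESH_INTERVAL = 60 * 45  # default from the diff, in seconds
GRACE_PERIOD = 60 * 15  # default from the diff, in seconds

# a refresh task started 60 seconds after the channels were created
initial = first_sleep(
    channel_init_time=1000.0, refresh_interval=REFRESH_INTERVAL, now=1060.0
)
# assumption: the swap blocks for about one grace period
following = sleep_after_swap(REFRESH_INTERVAL, GRACE_PERIOD)
```

Here `initial` is 2640 seconds and `following` is 1800 seconds. Under the stated assumption, consecutive swaps start roughly `refresh_interval` seconds apart, plus the time needed to create and warm the new channel.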

async def register_instance(self, instance_id: str):
"""
Registers an instance with the client, and warms the channel pool
for the instance
The client will periodically refresh the grpc channel pool used to make
requests, and new channels will be warmed for each registered instance
Channels will not be refreshed unless at least one instance is registered
"""
instance_name = self.instance_path(self.project, instance_id)
if instance_name not in self._active_instances:
self._active_instances.add(instance_name)
if self._channel_refresh_tasks:
# refresh tasks already running
# call ping and warm on all existing channels
for channel in self.transport.channel_pool:
await self._ping_and_warm_instances(channel)
else:
# refresh tasks aren't active. start them as background tasks
self.start_background_channel_refresh()

async def remove_instance_registration(self, instance_id: str) -> bool:
"""
Removes an instance from the client's registered instances, to prevent
warming new channels for the instance

If instance_id is not registered, returns False

Args:
instance_id: id of the instance to remove
Returns:
- True if instance was removed
"""
instance_name = self.instance_path(self.project, instance_id)
try:
self._active_instances.remove(instance_name)
return True
except KeyError:
return False

def get_table(
self, instance_id: str, table_id: str, app_profile_id: str | None = None
self,
instance_id: str,
table_id: str,
app_profile_id: str | None = None,
metadata: list[tuple[str, str]] | None = None,
) -> Table:
"""
Return a Table instance to make API requests for a specific table.
Returns a table instance for making data API requests

Args:
instance_id: The ID of the instance that owns the table.
instance_id: The Bigtable instance ID to associate with this client
instance_id is combined with the client's project to fully
specify the instance
table_id: The ID of the table.
app_profile_id: (Optional) The app profile to associate with requests.
https://cloud.google.com/bigtable/docs/app-profiles
metadata: a list of metadata headers to be attached to all calls with this client
"""
raise NotImplementedError
return Table(self, instance_id, table_id, app_profile_id, metadata)


class Table:
"""
Main Data API surface

Table object maintains instance_id, table_id, and app_profile_id context, and passes them with
Table object maintains table_id, and app_profile_id context, and passes them with
each call
"""

@@ -94,8 +310,40 @@ def __init__(
instance_id: str,
table_id: str,
app_profile_id: str | None = None,
metadata: list[tuple[str, str]] | None = None,
):
raise NotImplementedError
"""
Initialize a Table instance

Must be created within an async run loop context

Args:
instance_id: The Bigtable instance ID to associate with this client
instance_id is combined with the client's project to fully
specify the instance
table_id: The ID of the table.
app_profile_id: (Optional) The app profile to associate with requests.
https://cloud.google.com/bigtable/docs/app-profiles
metadata: a list of metadata headers to be attached to all calls with this client
Raises:
- RuntimeError if called outside of an async run loop context
"""
self.client = client
self.instance = instance_id
self.table_id = table_id
self.app_profile_id = app_profile_id
self.metadata = metadata or []
# raises RuntimeError if called outside of an async run loop context
try:
self._register_instance_task = asyncio.create_task(
self.client.register_instance(instance_id)
Contributor:

Should there be a deregister call as well?

Contributor Author:

I'm not sure I understand. This task is needed because we can't call await from __init__, so we schedule it as a background task instead
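Scheduling async work from a synchronous `__init__` can be sketched as below. `Registry` and `Handle` are hypothetical stand-ins for the client and `Table`; the only real API used is `asyncio.create_task`, which requires a running event loop.

```python
import asyncio
from typing import Set, Tuple


class Registry:
    """Stand-in for the client that tracks registered instance names."""

    def __init__(self) -> None:
        self.names: Set[str] = set()

    async def register(self, name: str) -> None:
        await asyncio.sleep(0)  # simulate awaiting a warm-up request
        self.names.add(name)


class Handle:
    """Stand-in for a table object created inside a running event loop."""

    def __init__(self, registry: Registry, name: str) -> None:
        # __init__ cannot use await, so the coroutine is scheduled
        # as a background task on the running loop instead
        self.register_task = asyncio.create_task(registry.register(name))


async def demo() -> Tuple[Set[str], Set[str]]:
    registry = Registry()
    handle = Handle(registry, "instance-a")
    before = set(registry.names)  # the task has not run yet
    await handle.register_task  # callers may await it when they need it done
    return before, registry.names
```

Keeping a reference to the task, as the diff does with `_register_instance_task`, lets later code await the registration before relying on it.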

)
except RuntimeError:
warnings.warn(
"Table should be created in an asyncio event loop."
" Instance will not be registered with client for refresh",
RuntimeWarning,
)
Contributor:

why not bubble up this error?

Contributor Author (@daniel-sanche, Apr 3, 2023):

I've gone back and forth on it

The question comes down to whether we should support creating these classes outside async context:

  • if there's not an active event loop, we can't start our background channel management coroutines at init time, but everything else should work
  • I don't think it's a best practice to enforce only initializing objects with an active event loop, so I opted to raise a warning instead. Users can always manually start the coroutines in an async context later
  • That said, I'm not sure what the use-cases are for initializing outside of an async context, so I'm not sure how important this is to support
  • I've considered using a background thread for channel management, which should avoid this problem, but I don't think that's possible due to grpc channel limitations. And it would definitely be cleaner if we could only use one concurrency method here as well

Let me know if you have thoughts
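The warn-instead-of-raise option discussed in this thread could be sketched as follows. `try_start_background` and `noop` are hypothetical names; `asyncio.get_running_loop` raises `RuntimeError` when no event loop is running in the current thread.

```python
import asyncio
import warnings
from typing import Any, Callable, Coroutine, Optional


async def noop() -> str:
    return "done"


def try_start_background(
    coro_factory: Callable[[], Coroutine[Any, Any, Any]]
) -> Optional["asyncio.Task[Any]"]:
    """Start a background task if a loop is running; otherwise warn."""
    try:
        # raises RuntimeError when called outside of a running event loop
        asyncio.get_running_loop()
    except RuntimeError:
        warnings.warn(
            "no running event loop; background task was not started",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
    # the coroutine object is only created once a loop is known to exist
    return asyncio.create_task(coro_factory())


async def demo() -> str:
    task = try_start_background(noop)
    assert task is not None
    return await task
```

Passing a factory rather than a coroutine object is a design choice of this sketch: it avoids creating a coroutine that would never be awaited when no loop is available.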


async def read_rows_stream(
self,
15 changes: 8 additions & 7 deletions google/cloud/bigtable_v2/services/bigtable/async_client.py
@@ -807,8 +807,8 @@ async def ping_and_warm(

Args:
request (Optional[Union[google.cloud.bigtable_v2.types.PingAndWarmRequest, dict]]):
The request object. Request message for client
connection keep-alive and warming.
The request object. Request message for client connection
keep-alive and warming.
name (:class:`str`):
Required. The unique name of the instance to check
permissions for as well as respond. Values are of the
@@ -1027,8 +1027,9 @@ def generate_initial_change_stream_partitions(

Args:
request (Optional[Union[google.cloud.bigtable_v2.types.GenerateInitialChangeStreamPartitionsRequest, dict]]):
The request object. NOTE: This API is intended to be
used by Apache Beam BigtableIO. Request message for
The request object. NOTE: This API is intended to be used
by Apache Beam BigtableIO. Request
message for
Bigtable.GenerateInitialChangeStreamPartitions.
table_name (:class:`str`):
Required. The unique name of the table from which to get
@@ -1126,9 +1127,9 @@ def read_change_stream(

Args:
request (Optional[Union[google.cloud.bigtable_v2.types.ReadChangeStreamRequest, dict]]):
The request object. NOTE: This API is intended to be
used by Apache Beam BigtableIO. Request message for
Bigtable.ReadChangeStream.
The request object. NOTE: This API is intended to be used
by Apache Beam BigtableIO. Request
message for Bigtable.ReadChangeStream.
table_name (:class:`str`):
Required. The unique name of the table from which to
read a change stream. Values are of the form
5 changes: 5 additions & 0 deletions google/cloud/bigtable_v2/services/bigtable/client.py
@@ -53,6 +53,7 @@
from .transports.base import BigtableTransport, DEFAULT_CLIENT_INFO
from .transports.grpc import BigtableGrpcTransport
from .transports.grpc_asyncio import BigtableGrpcAsyncIOTransport
from .transports.pooled_grpc_asyncio import PooledBigtableGrpcAsyncIOTransport
from .transports.rest import BigtableRestTransport


@@ -67,6 +68,7 @@ class BigtableClientMeta(type):
_transport_registry = OrderedDict() # type: Dict[str, Type[BigtableTransport]]
_transport_registry["grpc"] = BigtableGrpcTransport
_transport_registry["grpc_asyncio"] = BigtableGrpcAsyncIOTransport
_transport_registry["pooled_grpc_asyncio"] = PooledBigtableGrpcAsyncIOTransport
_transport_registry["rest"] = BigtableRestTransport

def get_transport_class(
@@ -380,6 +382,9 @@ def __init__(
transport (Union[str, BigtableTransport]): The
transport to use. If set to None, a transport is chosen
automatically.
NOTE: "rest" transport functionality is currently in a
beta state (preview). We welcome your feedback via an
issue in this library's source repository.
Contributor:

Is there any way to opt out of rest for the data client?

Contributor Author:

Yes, I believe it's configured in the yaml somewhere. We can address that in a future PR

@igorbernstein2 Can you say more about that? We have gotten some customer feedback in other product areas that REST transport is important to them in some use cases, so we've typically preferred to include the option when we can and let users choose which they require with sensible defaults. Are you suggesting we disallow it entirely? (I see the previous cl/458022604 where this was disallowed, but I'm wondering if those are still relevant.)

Contributor Author:

@meredithslota the rest transport doesn't support asyncio, so I think it makes sense to hard-code the transport to use our pooled transport for at least the async layer.

When it comes time to add the synchronous layer, we could support rest if customers find it useful, but it just wouldn't be able to make use of the same channel management optimizations that we're adding to the grpc side. Maybe @igorbernstein2 has some more context though

client_options (Optional[Union[google.api_core.client_options.ClientOptions, dict]]): Custom options for the
client. It won't take effect if a ``transport`` instance is provided.
(1) The ``api_endpoint`` property can be used to override the