From e9e16a33854d9d635600e5a12bf471f451f2c711 Mon Sep 17 00:00:00 2001 From: Carlos de la Guardia Date: Tue, 25 Jun 2019 03:23:46 -0500 Subject: [PATCH 1/6] initial version of migration guide --- MIGRATION_NOTES.md | 9 +- docs/index.rst | 1 + docs/migrating.rst | 225 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 229 insertions(+), 6 deletions(-) create mode 100644 docs/migrating.rst diff --git a/MIGRATION_NOTES.md b/MIGRATION_NOTES.md index 88a255f0..1a5dfa28 100644 --- a/MIGRATION_NOTES.md +++ b/MIGRATION_NOTES.md @@ -142,15 +142,12 @@ with context as client.context(): strings (entity_pb2.Value.string_value). At read time, a `StringProperty` will accept either a string or blob value, so compatibility is maintained with legacy databases. -- Instances of google.appengine.datastore.datastore_query.Order have been - replaced by a simple list of field names for ordering. -- The QueryOptions class from google.cloud.ndb.query, has been reimplemented, +- The QueryOptions class from google.cloud.ndb.query, has been reimplemented, since google.appengine.datastore.datastore_rpc.Configuration is no longer available. It still uses the same signature, but does not support original Configuration methods. -- Because google.appengine.datastore.datastore_query.Order is no longer - available, the `order` parameter for the query.Query constructor has been - replaced by a list or tuple. +- Because google.appengine.datastore.datastore_query.Order is no longer + available, the ndb.query.PropertyOrder class has been created to replace it. - Transaction propagation is no longer supported. This was a feature of the older Datastore RPC library which is no longer used. 
Starting a new transaction when a transaction is already in progress in the current context diff --git a/docs/index.rst b/docs/index.rst index c98c661c..7bec793e 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -19,6 +19,7 @@ blobstore metadata stats + migrating This is a Python 3 version of the `ndb` client library for use with `Google Cloud Datastore `_. diff --git a/docs/migrating.rst b/docs/migrating.rst new file mode 100644 index 00000000..9058b2d9 --- /dev/null +++ b/docs/migrating.rst @@ -0,0 +1,225 @@ +###################################### +Migrating from Python 2 version of NDB +###################################### + +While every attempt has been made to keep compatibilty with the previous +version of `ndb`, there are fundamental differences at the platform level, +which have made necessary in some cases to depart from the original +implementation, and sometimes even to remove exisitng functionality +altogether. + +Because one of the main objectives of this rewrite was to be able to use `ndb` +independently from Google App Engine, the legacy APIs from GAE cannot be +depended upon. Also, any environment and runtime variables and resources will +not be available when running outside of GAE. This means that many `ndb` APIs +that depended on GAE have been changed, and many APIs that accessed GAE +resources directly have been dropped. + +Aside from this, there are many differences between the Datastore APIs +provided by GAE and those provided by the newer Google Cloud Platform. These +diffeences have required some code and API changes as well. + +Finally, in many cases, new features of Python 3 have eliminated the need for +some code, particularly from the old `utils` module. + +If you are migrating code, these changes can generate some confusion. This +document will cover the most common migration issues. 
+ +Setting up a connection +======================= + +The most important difference from the previous `ndb` version, is that the new +`ndb` requires the use of a client to set up a runtime context for a project. +This is necessary because `ndb` can now be used in any Python environment, so +we can no longer assume it's running in the context of a GAE request. + +The `ndb` client uses ``google.auth`` for authentication, which is how APIs in +Google Cloud Platform work. The client can take a `credentials` parameter or +get the credentials using the `GOOGLE_APPLCATION_CREDENTIALS` environment +variable, which is the recommended option. + +After instantiating a client, it's necessary to establish a runtime context, +using the ``Client.context`` method. All interactions with the database must +be within the context obtained from this call:: + + from google.cloud import ndb + + client = ndb.Client() + + with context as client.context(): + do_something_with_ndb() + +Note that the example above is assumming the google credentials are set in +the environment. + +Keys +==== + +There are some methods from the ``key`` module that are not implemented in +this version of `ndb`: + + - Key.from_old_key. + - Key.to_old_key. + +Properties +========== + +There are various small changes in some of the model properties that might +trip you up when migrating code. Here are some of them, for quick reference: + +- The `BlobProperty` constructor only sets `_compressed` if explicitly + passed. The original set `_compressed` always. +- In the exact same fashion the `JsonProperty` constructor only sets + `_json_type` if explicitly passed.] +- Similarly, the `DateTimeProperty` constructor only sets `_auto_now` and + `_auto_now_add` if explicitly passed. +- `TextProperty(indexed=True)` and `StringProperty(indexed=False)` are no + longer supported. 
+- The `Property()` constructor (and subclasses) originally accepted both + `unicode` and `str` (the Python 2 versions) for `name` (and `kind`) but now + only accept `str`. + +QueryOptions and Query Order +============================ + +The QueryOptions class from ``google.cloud.ndb.query``, has been reimplemented, +since ``google.appengine.datastore.datastore_rpc.Configuration`` is no longer +available. It still uses the same signature, but does not support original +Configuration methods. + +Similarly,b ecause ``google.appengine.datastore.datastore_query.Order`` is no +longer available, the ``ndb.query.PropertyOrder`` class has been created to +replace it. + +MessageProperty and EnumProperty +================================ + +These properties, from the ``ndb.msgprop`` module, depend on the Google +Protocol RPC Library, or `protorpc`, which is not an `ndb` dependency. For +this reason, they are not part of this version of `ndb`. + +Tasklets +======== + +When writing a `tasklet`, it is no longer necessary to raise a Return +exception for returning the result. A normal return can be used instead:: + + @ndb.tasklet + def get_cart(): + cart = yield CartItem.query().fetch_async() + return cart + +Note that "raise Return(cart)" can still be used, but it's not recommended. + +There are some methods from the ``tasklet`` module that are not implemented in +this version of `ndb`, mainly because of changes in how an `ndb` context is +created and used in this version: + + - add_flow_exception. + - make_context. + - make_default_context. + - QueueFuture. + - ReducedFuture. + - SerialQueueFuture. + - set_context. + - toplevel. + +Utils +===== + +The previous version of `ndb` included an ``ndb.utils`` module, which defined +a number of methods that were mostly used internally. Some of those have been +made obsolete by new Python 3 features, while others have been discarded due +to implementation differences in the new `ndb`. 
+ +Possibly the most used utility from this module outside of `ndb` code, is the +``positional`` decorator, which declares that only the first `n` arguments of +a function or method may be positional. Python 3 can do this using keyword-only +arguments. What used to be written as:: + + @utils.positional(2) + def function1(arg1, arg2, arg3=None, arg4=None) + pass + +Will be written like this in the new version:: + + def function1(arg1, arg2, *, arg3=None, arg4=None) + pass + +Exceptions +========== + +App Engine's legacy exceptions are no longer available, but `ndb` provides +shims for most of them, which can be imported from the `ndb.exceptions` +package, like this:: + + from ndb.exceptioms import BadRequestError, BadArgumentError + +Datastore API +============= + +There are many differences bewteen the current Datastore API and the legacy App +Engine Datastore. In most cases, where the public API was generally used, this +should not be a problem. However, if you relied in your code on the private +Datastore API, the code that does this will probably need to be rewritten. +Specifically, any function or method that dealt directly with protocol buffers +will no longer work. The Datastore `.protobuf` definitions have changed +significantly from the public API used by App Engine to the current published +API. Additionally, this version of NDB mostly delegates to +`google.cloud.datastore` for parsing data returned by RPCs, which is a +significant internal refactoring. + +Default Namespace +================= + +In the previous version, ``google.appengine.api.namespacemanager`` was used +to determine the default namespace when not passed in to constructors which +require it, like ``Key``. 
In this version, the client class can be instantiated +with a namespace, which will be used as the default whenever it's not included +in the constuctor or method arguments that expect a namespace:: + + from google.cloud import ndb + + client=ndb.Client(namespace="my namespace") + + with context as client.context(): + key = ndb.Key("SomeKind", "SomeId") + +In this example, the key will be created under the namespace `my namespace`, +because that's the namespace passed in when setting up the client. + +Django Middleware +================= + +The Django middleware that was part of the GAE version of `ndb` has been +discontinued and is no longer available in current `ndb`. The middleware +basically took care of setting the context, which can be accomplished on +modern Django with a simple class middleware, similar to this:: + + from google.cloud import ndb + + class NDBMiddleware(object): + def __init__(self, get_response): + self.get_response = get_response + client = ndb.Client() + self.ndb_context = client.context() + + def __call__(self, request): + request.ndb_context = self.ndb_context + response = self.get_response(request) + return response + +The ``__init__`` method is called only once, during server start, so it's a +good place to create and store an `ndb` context. The ``__call__`` method will +be called once for every request, so we add our ndb context to the request +there, before the response is processed. The context will then be available in +view and template code. + +Another way to get an `ndb` context into a request, would be to use a `context +processor`, but those are functions called for every request, which means we +would need to initialize the client and context on each request, or find +another way to initialize and get the initial context. + +Note that the above code, like other `ndb` code, assumes the presence of the +`GOOGLE_APPLCATION_CREDENTIALS` environment variable when the client is +created. 
See Django documentation for details on setting up the environment. From 7618c0d42fef49d148f58fff2d26e9425132af84 Mon Sep 17 00:00:00 2001 From: Carlos de la Guardia Date: Tue, 25 Jun 2019 03:30:30 -0500 Subject: [PATCH 2/6] minor update --- docs/migrating.rst | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/docs/migrating.rst b/docs/migrating.rst index 9058b2d9..73759b28 100644 --- a/docs/migrating.rst +++ b/docs/migrating.rst @@ -61,6 +61,16 @@ this version of `ndb`: - Key.from_old_key. - Key.to_old_key. +Models +====== + +There are some methods from the ``model`` module that are not implemented in +this version of `ndb`. This is because getting the indexes relied on GAE +context functionality: + + - get_indexes. + - get_indexes_async. + Properties ========== From a685ec76badf9e8398448ff2e18781f1c5552442 Mon Sep 17 00:00:00 2001 From: Carlos de la Guardia Date: Thu, 27 Jun 2019 21:17:55 -0500 Subject: [PATCH 3/6] fix typos --- docs/migrating.rst | 18 +++++++++--------- docs/spelling_wordlist.txt | 4 ++++ 2 files changed, 13 insertions(+), 9 deletions(-) diff --git a/docs/migrating.rst b/docs/migrating.rst index 73759b28..d8d8dbb5 100644 --- a/docs/migrating.rst +++ b/docs/migrating.rst @@ -2,10 +2,10 @@ Migrating from Python 2 version of NDB ###################################### -While every attempt has been made to keep compatibilty with the previous +While every attempt has been made to keep compatibility with the previous version of `ndb`, there are fundamental differences at the platform level, which have made necessary in some cases to depart from the original -implementation, and sometimes even to remove exisitng functionality +implementation, and sometimes even to remove existing functionality altogether. Because one of the main objectives of this rewrite was to be able to use `ndb` @@ -17,7 +17,7 @@ resources directly have been dropped. 
Aside from this, there are many differences between the Datastore APIs provided by GAE and those provided by the newer Google Cloud Platform. These -diffeences have required some code and API changes as well. +differences have required some code and API changes as well. Finally, in many cases, new features of Python 3 have eliminated the need for some code, particularly from the old `utils` module. @@ -49,7 +49,7 @@ be within the context obtained from this call:: with context as client.context(): do_something_with_ndb() -Note that the example above is assumming the google credentials are set in +Note that the example above is assuming the google credentials are set in the environment. Keys @@ -97,7 +97,7 @@ since ``google.appengine.datastore.datastore_rpc.Configuration`` is no longer available. It still uses the same signature, but does not support original Configuration methods. -Similarly,b ecause ``google.appengine.datastore.datastore_query.Order`` is no +Similarly, because ``google.appengine.datastore.datastore_query.Order`` is no longer available, the ``ndb.query.PropertyOrder`` class has been created to replace it. @@ -134,8 +134,8 @@ created and used in this version: - set_context. - toplevel. -Utils -===== +ndb.utils +========= The previous version of `ndb` included an ``ndb.utils`` module, which defined a number of methods that were mostly used internally. Some of those have been @@ -168,7 +168,7 @@ package, like this:: Datastore API ============= -There are many differences bewteen the current Datastore API and the legacy App +There are many differences between the current Datastore API and the legacy App Engine Datastore. In most cases, where the public API was generally used, this should not be a problem. However, if you relied in your code on the private Datastore API, the code that does this will probably need to be rewritten. 
@@ -186,7 +186,7 @@ In the previous version, ``google.appengine.api.namespacemanager`` was used to determine the default namespace when not passed in to constructors which require it, like ``Key``. In this version, the client class can be instantiated with a namespace, which will be used as the default whenever it's not included -in the constuctor or method arguments that expect a namespace:: +in the constructor or method arguments that expect a namespace:: from google.cloud import ndb diff --git a/docs/spelling_wordlist.txt b/docs/spelling_wordlist.txt index e1f311b5..c883c82e 100644 --- a/docs/spelling_wordlist.txt +++ b/docs/spelling_wordlist.txt @@ -6,6 +6,7 @@ Appengine appengine Args args +async auth backend Blobstore @@ -62,6 +63,7 @@ prefetch protobuf proxied QueryOptions +reimplemented RequestHandler runtime schemas @@ -80,6 +82,7 @@ tasklet Tasklets tasklets timestamp +toplevel Transactionally unary unicode @@ -89,6 +92,7 @@ unpickling urlsafe username UTF +utils webapp websafe validator From 3138a6425fe1931e2f6e121f6f944215fcfc91f5 Mon Sep 17 00:00:00 2001 From: Carlos de la Guardia Date: Mon, 1 Jul 2019 22:10:53 -0500 Subject: [PATCH 4/6] review fixes and contributions --- docs/migrating.rst | 91 +++++++++++++++++++++++++++++--------- docs/spelling_wordlist.txt | 2 + 2 files changed, 71 insertions(+), 22 deletions(-) diff --git a/docs/migrating.rst b/docs/migrating.rst index d8d8dbb5..3c0d77e0 100644 --- a/docs/migrating.rst +++ b/docs/migrating.rst @@ -8,12 +8,10 @@ which have made necessary in some cases to depart from the original implementation, and sometimes even to remove existing functionality altogether. -Because one of the main objectives of this rewrite was to be able to use `ndb` -independently from Google App Engine, the legacy APIs from GAE cannot be -depended upon. Also, any environment and runtime variables and resources will -not be available when running outside of GAE. 
This means that many `ndb` APIs -that depended on GAE have been changed, and many APIs that accessed GAE -resources directly have been dropped. +One of the main objectives of this rewrite was to enable `ndb` for use in any +Python environment, not just Google App Engine. As a result, many of the `ndb` +APIs that relied on GAE environment and runtime variables, resources, and +legacy APIs have been dropped. Aside from this, there are many differences between the Datastore APIs provided by GAE and those provided by the newer Google Cloud Platform. These @@ -46,10 +44,37 @@ be within the context obtained from this call:: client = ndb.Client() - with context as client.context(): + with client.context() as context: do_something_with_ndb() -Note that the example above is assuming the google credentials are set in +The context is not thread safe, so for threaded applications, you need to +generate one context per thread. This is particularly important for web +applications, where the best practice would be to generate a context per +request. + +The following code shows how to use the context in a threaded application:: + + import threading + from google.cloud import datastore + from google.cloud import ndb + + client = ndb.Client() + + class Test(ndb.Model): + name = ndb.StringProperty() + + def insert(input_name): + with client.context(): + t = Test(name=input_name) + t.put() + + thread1 = threading.Thread(target=insert, args=['John']) + thread2 = threading.Thread(target=insert, args=['Bob']) + + thread1.start() + thread2.start() + +Note that the examples above are assuming the google credentials are set in the environment. Keys @@ -61,6 +86,9 @@ this version of `ndb`: - Key.from_old_key. - Key.to_old_key. +These methods were used to pass keys to and from the db Datastore API, which is +no longer supported. + Models ====== @@ -80,11 +108,12 @@ trip you up when migrating code. 
Here are some of them, for quick reference: - The `BlobProperty` constructor only sets `_compressed` if explicitly passed. The original set `_compressed` always. - In the exact same fashion the `JsonProperty` constructor only sets - `_json_type` if explicitly passed.] + `_json_type` if explicitly passed. - Similarly, the `DateTimeProperty` constructor only sets `_auto_now` and `_auto_now_add` if explicitly passed. - `TextProperty(indexed=True)` and `StringProperty(indexed=False)` are no - longer supported. + longer supported. That is, TextProperty can no longer be indexed, whereas + StringProperty is always indexed. - The `Property()` constructor (and subclasses) originally accepted both `unicode` and `str` (the Python 2 versions) for `name` (and `kind`) but now only accept `str`. @@ -156,6 +185,9 @@ Will be written like this in the new version:: def function1(arg1, arg2, *, arg3=None, arg4=None) pass +Note that this could change if Python 2.7 support is added at some point, which +is still a possibility. + Exceptions ========== @@ -163,7 +195,7 @@ App Engine's legacy exceptions are no longer available, but `ndb` provides shims for most of them, which can be imported from the `ndb.exceptions` package, like this:: - from ndb.exceptioms import BadRequestError, BadArgumentError + from ndb.exceptions import BadRequestError, BadArgumentError Datastore API ============= @@ -172,7 +204,18 @@ There are many differences between the current Datastore API and the legacy App Engine Datastore. In most cases, where the public API was generally used, this should not be a problem. However, if you relied in your code on the private Datastore API, the code that does this will probably need to be rewritten. -Specifically, any function or method that dealt directly with protocol buffers + +Specifically, the old NDB library included some undocumented APIs that dealt +directly with Datastore protocol buffers. These APIs will no longer work. 
+Rewrite any code that used the following classes, properties, or methods: + + - ModelAdapter + - Property._db_get_value, Property._db_set_value. + - Property._db_set_compressed_meaning and + Property._db_set_uncompressed_meaning. + - Model._deserialize and Model._serialize. + - model.make_connection. + will no longer work. The Datastore `.protobuf` definitions have changed significantly from the public API used by App Engine to the current published API. Additionally, this version of NDB mostly delegates to @@ -192,7 +235,7 @@ in the constructor or method arguments that expect a namespace:: client=ndb.Client(namespace="my namespace") - with context as client.context(): + with client.context() as context: key = ndb.Key("SomeKind", "SomeId") In this example, the key will be created under the namespace `my namespace`, @@ -211,24 +254,28 @@ modern Django with a simple class middleware, similar to this:: class NDBMiddleware(object): def __init__(self, get_response): self.get_response = get_response - client = ndb.Client() - self.ndb_context = client.context() + self.client = ndb.Client() def __call__(self, request): - request.ndb_context = self.ndb_context - response = self.get_response(request) + context = self.client.context() + request.ndb_context = context + with context: + response = self.get_response(request) return response The ``__init__`` method is called only once, during server start, so it's a -good place to create and store an `ndb` context. The ``__call__`` method will -be called once for every request, so we add our ndb context to the request -there, before the response is processed. The context will then be available in -view and template code. +good place to create and store an `ndb` client. As mentioned above, the +recommended practice is to have one context per request, so the ``__call__`` +method, which is called once per request, is an ideal place to create it. 
+After we have the context, we add it to the request, right before the response +is processed. The context will then be available in view and template code. +Finally, we use the ``with`` statement to generate the response within our +context. Another way to get an `ndb` context into a request, would be to use a `context processor`, but those are functions called for every request, which means we would need to initialize the client and context on each request, or find -another way to initialize and get the initial context. +another way to initialize and get the initial client. Note that the above code, like other `ndb` code, assumes the presence of the `GOOGLE_APPLCATION_CREDENTIALS` environment variable when the client is diff --git a/docs/spelling_wordlist.txt b/docs/spelling_wordlist.txt index c883c82e..0efb69b5 100644 --- a/docs/spelling_wordlist.txt +++ b/docs/spelling_wordlist.txt @@ -17,6 +17,7 @@ builtin composable Datastore datastore +deserialize deserialized Dict Django @@ -68,6 +69,7 @@ RequestHandler runtime schemas stackable +StringProperty subattribute subclassed subclasses From 38d40a70ac857b84c0c656a82a3042c2e5dfb480 Mon Sep 17 00:00:00 2001 From: Carlos de la Guardia Date: Wed, 3 Jul 2019 19:32:47 -0500 Subject: [PATCH 5/6] quick fixes after second review --- docs/migrating.rst | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/docs/migrating.rst b/docs/migrating.rst index 3c0d77e0..f9f66c5a 100644 --- a/docs/migrating.rst +++ b/docs/migrating.rst @@ -86,8 +86,8 @@ this version of `ndb`: - Key.from_old_key. - Key.to_old_key. -These methods were used to pass keys to and from the db Datastore API, which is -no longer supported. +These methods were used to pass keys to and from the `db` Datastore API, which +is no longer supported (`db` was `ndb`'s predecessor). Models ====== @@ -161,7 +161,6 @@ created and used in this version: - ReducedFuture. - SerialQueueFuture. - set_context. - - toplevel. 
ndb.utils ========= @@ -216,12 +215,6 @@ Rewrite any code that used the following classes, properties, or methods: - Model._deserialize and Model._serialize. - model.make_connection. -will no longer work. The Datastore `.protobuf` definitions have changed -significantly from the public API used by App Engine to the current published -API. Additionally, this version of NDB mostly delegates to -`google.cloud.datastore` for parsing data returned by RPCs, which is a -significant internal refactoring. - Default Namespace ================= From 29d35f9e2a89c10f5509a3f574fba49410fb591d Mon Sep 17 00:00:00 2001 From: Carlos de la Guardia Date: Wed, 10 Jul 2019 23:13:39 -0500 Subject: [PATCH 6/6] more quick fixes --- docs/migrating.rst | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/docs/migrating.rst b/docs/migrating.rst index f9f66c5a..daec6728 100644 --- a/docs/migrating.rst +++ b/docs/migrating.rst @@ -31,10 +31,12 @@ The most important difference from the previous `ndb` version, is that the new This is necessary because `ndb` can now be used in any Python environment, so we can no longer assume it's running in the context of a GAE request. -The `ndb` client uses ``google.auth`` for authentication, which is how APIs in -Google Cloud Platform work. The client can take a `credentials` parameter or -get the credentials using the `GOOGLE_APPLCATION_CREDENTIALS` environment -variable, which is the recommended option. +The `ndb` client uses ``google.auth`` for authentication, consistent with other +Google Cloud Platform client libraries. The client can take a `credentials` +parameter or get the credentials using the `GOOGLE_APPLICATION_CREDENTIALS` +environment variable, which is the recommended option. For more information +about authentication, consult the `Cloud Storage Client Libraries +`_ documentation. After instantiating a client, it's necessary to establish a runtime context, using the ``Client.context`` method.
All interactions with the database must @@ -50,7 +52,9 @@ be within the context obtained from this call:: The context is not thread safe, so for threaded applications, you need to generate one context per thread. This is particularly important for web applications, where the best practice would be to generate a context per -request. +request. However, please note that for cases where multiple threads are used +for a single request, a new context should be generated for every thread that +will use the `ndb` library. The following code shows how to use the context in a threaded application::