Check _RAW_BSON_DOCUMENT_MARKER at CodecOptions.document_class first #324
Off the top of my head I can't think of a problem with re-ordering those checks, but the bson module instantiates the document class, so document_class can't be an instance. Can you clarify what you're trying to do? |
If you mean
I'm writing an ORM for MongoDB that uses PyMongo as its driver. I want to wrap my decoding results with data structures other than
{
    // Several levels of metadata (like cursor id, namespace, request result) wrapping my real data
    "cursor": {
        "firstBatch": [
            // Here the real data starts
            // First document
            {
                "_id": ObjectId("59256c8a5e49a721938935a3"),
                "str_field": "Hello",
                "int_field": 42,
            },
            // Second document
            {
                "_id": ObjectId("59256c8a5e49a721938935a4"),
                "str_field": "Hello world",
                "int_field": 24,
            }
            // Real data ended, now some more metadata
        ],
        "id": 0,
        "ns": "test_db.test_col"
    },
    "ok": 1
}
Let's say I want to wrap my documents with a class called
{
    # it's a dict
    "cursor": {
        "firstBatch": [
            <__main__.Model with _id=59256c8a5e49a721938935a3>,
            <__main__.Model with _id=59256c8a5e49a721938935a4>
        ],
        "id": 0,
        "ns": "test_db.test_col"
    },
    "ok": 1
}
My idea is to use a custom BSONDecoder that also keeps some context between
import bson
from bson.codec_options import CodecOptions, _RAW_BSON_DOCUMENT_MARKER


class Model(object):
    # Model class implementation
    pass


class CustomBSONDecoder(object):
    _type_marker = _RAW_BSON_DOCUMENT_MARKER

    def __init__(self, model_class):
        self.model_class = model_class  # It stores my custom decoding type
        self.depth = 0  # Current nesting depth of the document being decoded

    def __call__(self, data, codec_options):
        object_size = bson._UNPACK_INT(data[:4])[0] - 1
        if self.depth == 2:
            # If the depth is 2, we are at the document data level
            container = self.model_class()
        else:
            # Otherwise use a regular dictionary
            container = dict()
        self.depth += 1
        for key, value, pos in bson._iterate_elements(data, 4, object_size, codec_options):
            container[key] = value
        self.depth -= 1
        # NOTE: do_wrap and unwrap() are not defined anywhere in this snippet
        return container.unwrap() if do_wrap else container


codec_options = CodecOptions(document_class=CustomBSONDecoder(Model))
If |
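The depth-counter idea described above can be sketched without any BSON internals. This is only an illustration under stated assumptions: it walks an already-decoded response rather than raw BSON bytes, and `DepthAwareBuilder`, `target_depth`, and the dict-based `Model` are hypothetical names, not part of PyMongo. The `_id` values are shortened placeholder strings.

```python
class Model(dict):
    """Hypothetical model type; a dict subclass so items can be set by key."""

    def __repr__(self):
        return "<Model with _id=%s>" % self.get("_id")


class DepthAwareBuilder:
    """Rebuilds nested documents, using model_class at one nesting depth."""

    def __init__(self, model_class, target_depth=2):
        self.model_class = model_class
        self.target_depth = target_depth  # assumed depth of the real documents
        self.depth = 0  # state shared between nested build() calls

    def build(self, raw):
        # Pick the container type from the current nesting depth.
        container = self.model_class() if self.depth == self.target_depth else {}
        self.depth += 1
        try:
            for key, value in raw.items():
                container[key] = self._convert(value)
        finally:
            # Restore the depth even if a nested conversion raises.
            self.depth -= 1
        return container

    def _convert(self, value):
        if isinstance(value, dict):
            return self.build(value)
        if isinstance(value, list):
            # Lists do not add a document level; their items are built in place.
            return [self._convert(item) for item in value]
        return value


response = {
    "cursor": {
        "firstBatch": [
            {"_id": "a3", "str_field": "Hello", "int_field": 42},
            {"_id": "a4", "str_field": "Hello world", "int_field": 24},
        ],
        "id": 0,
        "ns": "test_db.test_col",
    },
    "ok": 1,
}

builder = DepthAwareBuilder(Model)
result = builder.build(response)
```

Here the top-level response and the `cursor` wrapper stay plain dicts, while each entry of `firstBatch` is built as a `Model`. Restoring the counter in a `finally` block keeps the instance reusable after a failed decode.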
Sorry, I still don't understand how this pull request solves your problem.
This change doesn't allow document_class to be an instance. The issubclass call will still fail if the class doesn't inherit from RawBSONDocument, and an instance of RawBSONDocument isn't callable:
|
It turns this:
{
    "foo": "bar",
    "nested": {
        "data": [
            {"a": 1, "b": 2},
            {"a": "Hello", "b": "world!"},
        ]
    }
}
To this (note
{
    'foo': 'bar',
    'nested': {
        'data': [
            <Model a=1, b=2>,
            <Model a=Hello, b=world!>
        ]
    }
} |
@behackett this PR will fix my current problem (passing a class instance to BSON decoder). I think there could be a further step: to treat |
Thanks for your contribution, and for your patience while waiting for review. Though the change doesn't appear to make any practical difference to existing applications, it does mean we can never change the internals of raw document support without breaking your application. Raw documents are meant to get raw BSON into and out of MongoDB. Using them to implement a codec mapping is a clever hack, but it's still a hack. Historically we've provided SON manipulators to achieve your goal, but they are currently deprecated. We intend to deliver a replacement solution in a future release. That work is currently slated for the next release, but I can't guarantee it will make it. |
It seems to me that SON manipulators are not the best option because they hurt performance. It is better to put decoded data into the needed data structures on the fly. I think it would be handy if I could tell
Is there any reason why |
Hi there.
I'm trying to implement custom BSON decoding logic that decodes data into different object types, and I need to store some decoder state at different recursion levels. I think it would be handy if CodecOptions allowed document_class to be either a class or an instance that has a special _type_marker attribute. To that end, I've reordered the conditions, because the issubclass call fails when it is called on a non-class object.
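Why the order of the checks matters can be shown in plain Python. The sketch below is not PyMongo's actual validation code: `RAW_MARKER`, `StatefulDecoder`, and both `validate_*` helpers are hypothetical stand-ins. It relies only on two language facts: `issubclass()` raises `TypeError` when its first argument is not a class, and `or` short-circuits.

```python
from collections.abc import MutableMapping

# Hypothetical stand-in for bson's private raw-document marker value.
RAW_MARKER = 101


class StatefulDecoder:
    """Hypothetical callable decoder; instances (not the class) get passed around."""

    _type_marker = RAW_MARKER

    def __call__(self, data, codec_options):
        return {}


def has_raw_marker(obj):
    # Works for classes and instances alike, since getattr accepts both.
    return getattr(obj, "_type_marker", None) == RAW_MARKER


def validate_subclass_first(document_class):
    # issubclass() runs first and raises TypeError for a non-class argument.
    return issubclass(document_class, MutableMapping) or has_raw_marker(document_class)


def validate_marker_first(document_class):
    # The marker check short-circuits, so issubclass() never sees the instance.
    return has_raw_marker(document_class) or issubclass(document_class, MutableMapping)


decoder = StatefulDecoder()

try:
    validate_subclass_first(decoder)
    subclass_first_ok = True
except TypeError:
    subclass_first_ok = False

marker_first_ok = validate_marker_first(decoder)
```

With the marker check first, an instance carrying `_type_marker` passes validation, while ordinary mapping classes such as `dict` are still accepted by either ordering.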