jackson serializer and deserializer for UUID #155

Closed
chakwok opened this issue Nov 21, 2019 · 17 comments

Comments

@chakwok

chakwok commented Nov 21, 2019

My Kotlin program works with a collection whose id field is of type UUID (subtype 4) rather than LUUID (subtype 3). The field is represented as java.util.UUID in the Kotlin program.

The problem is that when I retrieve the object back from the db, I get an org.bson.types.Binary object. It makes sense, but I want it to be a `java.util.UUID` object in the program.

Is it possible to override the existing object mapper to support such a transformation?

I have these two helper functions to convert between UUID and Binary objects:
        fun toStandardBinaryUUID(uuid: UUID): Binary {...}
        fun fromStandardBinaryUUID(binary: Binary): UUID {...}
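
The bodies of the two helpers are elided above. As a rough sketch of the byte-level conversion they would need, assuming the subtype 4 payload is simply the 16 bytes of the UUID in big-endian order: the names `uuidToStandardBytes` and `uuidFromStandardBytes` are hypothetical, and only the JDK is used so the snippet runs on its own.

```kotlin
import java.nio.ByteBuffer
import java.util.UUID

// Hypothetical helper: 16-byte big-endian layout, as assumed for BSON binary subtype 0x04.
fun uuidToStandardBytes(uuid: UUID): ByteArray =
    ByteBuffer.allocate(16)                    // ByteBuffer is big-endian by default
        .putLong(uuid.mostSignificantBits)
        .putLong(uuid.leastSignificantBits)
        .array()

// Hypothetical helper: inverse of uuidToStandardBytes.
fun uuidFromStandardBytes(bytes: ByteArray): UUID {
    require(bytes.size == 16) { "a UUID needs exactly 16 bytes, got ${bytes.size}" }
    val buffer = ByteBuffer.wrap(bytes)
    // Arguments are evaluated left to right: first the high 8 bytes, then the low 8 bytes.
    return UUID(buffer.long, buffer.long)
}

fun main() {
    val original = UUID.fromString("701e0e3b-34a9-4b23-8fba-799dae1803a0")
    val bytes = uuidToStandardBytes(original)
    println(bytes.joinToString("") { "%02x".format(it) }) // 701e0e3b34a94b238fba799dae1803a0
    println(uuidFromStandardBytes(bytes) == original)     // true
}
```

With the BSON library on the classpath, the byte array could then be wrapped as `org.bson.types.Binary(BsonBinarySubType.UUID_STANDARD, bytes)`, and `Binary.getData()` would give the bytes back on the way out.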

Thanks

@zigzago
Member

zigzago commented Nov 21, 2019

I assume you use the default KMongo version, with Jackson mapping.

I suggest you register a new Jackson Module with the KMongoConfiguration#registerBsonModule method; the module should define a custom UUID serializer and deserializer.

HTH

@chakwok
Author

chakwok commented Nov 22, 2019

Thanks for your reply @zigzago

Just to clarify: by registering a custom serializer and deserializer, will I have to add a custom serializer and deserializer for every class (and subclass) that has a uuid field?

Sorry if that doesn't sound right, I only started using KMongo and the Jackson library yesterday.

For example, I have a Signal class that has the uuid field:

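import com.fasterxml.jackson.core.JsonGenerator
import com.fasterxml.jackson.databind.SerializerProvider
import com.fasterxml.jackson.databind.ser.std.StdSerializer
import java.io.IOException

// Signal and toStandardBinaryUUID are defined elsewhere in the project.
class SignalSerializer(t: Class<Signal>) : StdSerializer<Signal>(t) {
    @Throws(IOException::class)
    override fun serialize(signal: Signal, jsonGenerator: JsonGenerator,
                           serializerProvider: SerializerProvider?) {
        jsonGenerator.writeStartObject()
        jsonGenerator.writeFieldName("signalId")
        jsonGenerator.writeObject(toStandardBinaryUUID(signal.signalId)) // signalId is of type UUID

        // write other (key, value) pairs here
        jsonGenerator.writeStringField("content", signal.content)
        // ...

        jsonGenerator.writeEndObject()
    }
}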

@chakwok
Author

chakwok commented Nov 22, 2019

I just realized that I can simply define a custom UUID serializer and deserializer, and that already solves the problem.

Would you consider adding such a module as the default behavior? Currently java.util.UUID is stored as LUUID in the db. The subtype 0x03 LUUID may have interoperability issues when interpreted by different languages. It is suggested to use subtype 0x04 UUID instead.
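
To make the interoperability concern concrete, here is a small sketch comparing two byte layouts for the same UUID. It assumes that the legacy Java encoding for subtype 0x03 writes each 8-byte half in little-endian order, while subtype 0x04 keeps the bytes in the order of the canonical string. The function names are hypothetical and only the JDK is used.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.util.UUID

fun hex(bytes: ByteArray): String = bytes.joinToString("") { "%02x".format(it) }

// Assumed subtype 0x04 layout: both halves big-endian, same order as the string form.
fun standardLayout(uuid: UUID): ByteArray =
    ByteBuffer.allocate(16).order(ByteOrder.BIG_ENDIAN)
        .putLong(uuid.mostSignificantBits)
        .putLong(uuid.leastSignificantBits)
        .array()

// Assumed legacy Java layout for subtype 0x03: each 8-byte half is byte-reversed.
fun javaLegacyLayout(uuid: UUID): ByteArray =
    ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN)
        .putLong(uuid.mostSignificantBits)
        .putLong(uuid.leastSignificantBits)
        .array()

fun main() {
    val uuid = UUID.fromString("00112233-4455-6677-8899-aabbccddeeff")
    println(hex(standardLayout(uuid)))   // 00112233445566778899aabbccddeeff
    println(hex(javaLegacyLayout(uuid))) // 7766554433221100ffeeddccbbaa9988
}
```

A reader that expects one layout but receives the other would reconstruct a different UUID from the same 16 bytes, which is the kind of mismatch the subtype 0x04 recommendation is meant to avoid.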

@zigzago
Member

zigzago commented Nov 23, 2019

A PR would be welcome :)

zigzago changed the title from "Overriding the default Object Mapping to auto transform a field" to "jackson serializer and deserializer for UUID" on Nov 23, 2019

@chakwok
Author

chakwok commented Nov 23, 2019

Sure, I will give it a try.

I am thinking about extending the default Jackson UUIDSerializer and UUIDDeserializer to support mapping from both UUID subtype 0x03 and 0x04, plus a configurable setting that lets users choose which format is written to the db (defaulting to 0x03 so it won't affect existing projects). Does that sound right to you?

By the way, I am not able to parse the org.bson.types.Binary object back when deserializing. Can you please have a look for me here?

zigzago added a commit that referenced this issue Nov 24, 2019
@zigzago
Member

zigzago commented Nov 24, 2019

Question answered - but I'm not sure it is a good idea to store a UUID as binary rather than as a string, because querying a string is easier in the mongo shell ;)
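
For comparison, the string-based alternative could be sketched as a Jackson module like the one below. This is an illustration only: the class names and `uuidAsStringModule` are hypothetical, and the round trip is shown with a plain `ObjectMapper` rather than KMongo's BSON mapper. In a KMongo project the module would presumably be handed to the `KMongoConfiguration.registerBsonModule` method mentioned earlier in the thread.

```kotlin
import com.fasterxml.jackson.core.JsonGenerator
import com.fasterxml.jackson.core.JsonParser
import com.fasterxml.jackson.databind.DeserializationContext
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.SerializerProvider
import com.fasterxml.jackson.databind.deser.std.StdDeserializer
import com.fasterxml.jackson.databind.module.SimpleModule
import com.fasterxml.jackson.databind.ser.std.StdSerializer
import java.util.UUID

// Hypothetical serializer: writes the canonical 36-character string form.
class UuidAsStringSerializer : StdSerializer<UUID>(UUID::class.java) {
    override fun serialize(value: UUID, gen: JsonGenerator, provider: SerializerProvider) {
        gen.writeString(value.toString())
    }
}

// Hypothetical deserializer: parses the string form back into a UUID.
class UuidAsStringDeserializer : StdDeserializer<UUID>(UUID::class.java) {
    override fun deserialize(p: JsonParser, ctxt: DeserializationContext): UUID =
        UUID.fromString(p.valueAsString)
}

// One module covers every class that has a UUID field; no per-class serializer is needed.
fun uuidAsStringModule(): SimpleModule =
    SimpleModule("UuidAsString")
        .addSerializer(UUID::class.java, UuidAsStringSerializer())
        .addDeserializer(UUID::class.java, UuidAsStringDeserializer())

fun main() {
    val mapper = ObjectMapper().registerModule(uuidAsStringModule())
    val id = UUID.fromString("701e0e3b-34a9-4b23-8fba-799dae1803a0")
    val json = mapper.writeValueAsString(id)
    println(json)
    println(mapper.readValue(json, UUID::class.java) == id)
}
```

The same module shape would apply to a binary encoding; only the bodies of `serialize` and `deserialize` would change.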

@chakwok
Author

chakwok commented Nov 25, 2019

Yeah, I get your concern, especially since subtype 0x04 is not natively supported in several languages. But storing as UUID seems to be the right choice where scalability is a concern.

uuid as UUID() vs string
It's worth emphasizing that storing as UUID will save significant space in the database. Your index size will be reduced by nearly half when storing as BinData vs String, and the document size will also be reduced. This means that you will be able to keep more index data and more document data in memory (cache), and save space on disk.
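
The "nearly half" figure can be sanity-checked with a little arithmetic on the encoded value sizes. The sketch below assumes the BSON value layouts of a string (int32 length, UTF-8 bytes, trailing NUL) and of binary data (int32 length, one subtype byte, payload). The function names are hypothetical, and real index sizes also depend on the storage engine, so this is only a per-value estimate.

```kotlin
import java.util.UUID

// BSON string value: 4-byte length prefix + UTF-8 bytes + 1 trailing NUL byte.
fun bsonStringValueSize(text: String): Int =
    4 + text.toByteArray(Charsets.UTF_8).size + 1

// BSON binary value: 4-byte length prefix + 1 subtype byte + payload bytes.
fun bsonBinaryValueSize(payloadLength: Int): Int =
    4 + 1 + payloadLength

fun main() {
    val uuid = UUID.fromString("701e0e3b-34a9-4b23-8fba-799dae1803a0")
    val asString = bsonStringValueSize(uuid.toString()) // 36 characters of text
    val asBinary = bsonBinaryValueSize(16)              // 128 bits of payload
    println("string: $asString bytes, binary: $asBinary bytes") // string: 41 bytes, binary: 21 bytes
}
```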

@dpacierpnik

Hello, we are also interested in this fix (we also have to store UUID as subtype 0x04). Are there any updates on this issue? Is there any progress?

@chakwok
Author

chakwok commented May 19, 2020

@dpacierpnik
It seems that the issue has been addressed in the MongoDB Java driver 3.12. A custom deserializer may not be necessary?

Improved interoperability when using the native UUID type
The driver now supports setting the BSON binary representation of java.util.UUID instances via a new UuidRepresentation property on MongoClientSettings. The default representation has not changed, but it will in the upcoming 4.0 major release of the driver. Applications that store UUID values in MongoDB can use this setting to easily control the representation in MongoDB without having to register a Codec in the CodecRegistry.

See MongoClientSettings.getUuidRepresentation for details.

@marcdejonge
Contributor

So in the latest version of the MongoDB client you can set the UUID type to legacy, so that it stays compatible. But we really want to use the new standard representation. I've already implemented a serializer and deserializer that work in my project. I'll go and see if I can create a PR for this project. I think that these settings should be part of the KMongoConfiguration, right?

@chakwok
Author

chakwok commented May 19, 2020

@marcdejonge @dpacierpnik
What I was trying to say is that the conversion from java.util.UUID to MongoDB's BSON UUID (subtype 4) is supported natively by the MongoDB driver, starting from driver version 3.12:

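import com.mongodb.ConnectionString
import com.mongodb.MongoClientSettings
import com.mongodb.client.MongoClients
import org.bson.Document
import org.bson.UuidRepresentation
import java.util.UUID

fun main() {
    // STANDARD makes the driver write java.util.UUID as BSON binary subtype 4
    val client = MongoClients.create(
        MongoClientSettings.builder()
            .uuidRepresentation(UuidRepresentation.STANDARD)
            .applyConnectionString(ConnectionString("mongodb://localhost:27017"))
            .build()
    )

    val col = client.getDatabase("test").getCollection("test")
    col.insertOne(Document("uuid", UUID.randomUUID()))
}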
/* Inserted Document
{
    "_id" : ObjectId("5ec3d176a6017e19fc5b2c5b"),
    "uuid" : UUID("701e0e3b-34a9-4b23-8fba-799dae1803a0")
}
*/

@marcdejonge
Contributor

The MongoDB driver isn't the problem. The problem is the custom bson4jackson library, which translates custom objects (which can contain UUIDs) to BSON. That library doesn't actually support the standard representation.

@marcdejonge
Contributor

Maybe this pull request also makes clear the changes I'd love to see. I do have a problem: I find it difficult to find the right place to configure these things. I'd love to rethink the way configuration is handled in KMongo, because it's not really extensible right now. Unless I'm overlooking some code, any hints on how to improve this?

@chakwok
Author

chakwok commented May 20, 2020

@zigzago

@zigzago
Member

zigzago commented May 20, 2020

@marcdejonge Thank you for the PR.

Does KMongoConfiguration.registerBsonModule meet your needs? It lets you register any custom Jackson module. Or do you need another kind of configuration?

@marcdejonge
Contributor

marcdejonge commented May 20, 2020

Sorry, I wasn't totally clear. Right now the KMongoConfiguration is an object (basically a singleton) that already loads the bsonMapper during init. The way I've solved it for now is by making it possible to load a new bsonMapper with a different configuration, but I would prefer that this configuration be done while configuring the client. Even better would be some KMongoClientBuilder that makes it explicit that you configure things first and then create the client from that. I think something like that could be done in the future, but unfortunately it would be a breaking change. If you want, I could make a proposal for how this could look, but I would need a bit more information about which KMongo-specific configuration exists now. Is it only this KMongoConfiguration object?

@zigzago
Member

zigzago commented May 21, 2020

Ok :). I've just created the ticket #206 with some info about this topic. For the UUID, I'm going to add a small additional class to help configure the UuidRepresentation - it will close #155
