Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Are you interested in using the official Apache.Avro NuGet package? #1033

Closed
blachniet opened this issue Aug 18, 2019 · 23 comments
Closed

Are you interested in using the official Apache.Avro NuGet package? #1033

blachniet opened this issue Aug 18, 2019 · 23 comments

Comments

@blachniet
Copy link

Are you interested in switching to the official Apache.Avro NuGet package?

At the time that the Confluent fork was created (resulting in the very popular Confluent.Apache.Avro package) there was no official Avro NuGet package and the project did not target .NET Standard. Both of those problems, and a few others, have been addressed in the official project.

If this is something you are interested in, I will figure out what else the official project might be missing that your fork has.

@mhowlett
Copy link
Contributor

yes - we'd much prefer to not maintain our own fork. in addition to targeting .net standard and making the nuget package, we just fixed/changed a few minor things IIRC (small number of loc). we also made a nuget tool package for avrogen.

Getting switched over to the official package is just a matter of prioritization. I didn't see much urgency until the official repo started to diverge significantly, or we wanted to prioritize significant improvements (it'd probably be a bad idea to do that in a fork). logical types is the most frequent request.

i'll try to find time to enumerate the changes we made and add them in another comment in this issue.

@mhowlett
Copy link
Contributor

ok, i believe this is a complete list of changes below (none are especially big) - it would be good to address each one by one and awesome to not have to worry about a diverging fork!

confluentinc/avro#16
note: some of our customers and also microsoft require this. also relevant: https://www.pedrolamas.com/2018/09/11/start-strong-naming-your-assemblies/

confluentinc/avro#11

confluentinc/avro#7

confluentinc/avro#6

in: confluentinc/avro#5 the changes to lang/csharp/src/apache/main/CodeGen/CodeGen.cs

confluentinc/avro#2 (probably not important?)

confluentinc/avro#1 (potentially something you might not like)

@blachniet
Copy link
Author

blachniet commented Aug 20, 2019

Thank you for this list! I'll go through these and ensure we have answers for these in the official package. I'll update this comment as I make progress.

@blachniet
Copy link
Author

I'm trying to understand the context around confluentinc/avro#11. Do you happen to know what code was calling ObjectCreator.FindType with a name that contained IList or Nullable? Was this something within the Avro library that was calling this method, or was this something in other code using the Avro library?

I'm asking this because so far I haven't seen a path within Avro code that would call this method this way. I want to build a unit test, but I want to make sure I fully understand the original problem first.

@blachniet
Copy link
Author

Ok, so I found how calls from SpecificReader could contain IList, but specific reader handles stripping the extras (getTargetType). So, I don't think that piece of code was causing the original problem.

Again, I just want to make sure I understand the original problem. Any insight you can provide is appreciated. Thanks!

@mhowlett
Copy link
Contributor

This was in response to a customer issue:

We are getting the following deserialization error when using C# classes generated from AvroGen 1.7.7.5 (https://www.nuget.org/packages/Confluent.Apache.Avro.AvroGen/1.7.7.5)

Unable to find type System.Collections.Generic.IList`1[com.tm.archtics.mobilink.Column] in all loaded assemblies in field rows in field inserts

The schema in question (anonymized by hand, so hopefully i didn't muck anything up): https://gist.github.com/mhowlett/3a2975c68cc02f2539abc779a8af42ec

@blachniet
Copy link
Author

I was unable to generate C# classes from the provided gist due to "ResultSet" (3 instances) and "ColumnMeta" (1 instance) being undefined. If you could provide an updated gist, I would greatly appreciate it. In the meantime, I will continue digging to see if I can reproduce the problem.

@blachniet
Copy link
Author

Nevermind, I'm pretty sure I figured how to reproduce the problem. It can be reproduced with a list of lists and a list of nullables. I'll push up unit tests showing the problem soon.

@blachniet
Copy link
Author

While investigating this I encountered some related issues which documented in AVRO-2522. However, I'm not confident that I've solved the original issue reported by your customer. I should have fixes for AVRO-2522 up for review sometime this week, but I may need an updated example from the customer issue that we can successfully generate C# classes from (if possible) to ensure those issues have been addressed fully.

@mhowlett
Copy link
Contributor

mhowlett commented Sep 3, 2019

Thanks for your efforts! here's that schema again somewhat less aggressively sanitized: https://gist.github.com/mhowlett/1d7518c23c8f5462a6d1ebbe81cbffb6

Last week, another one of our customers spoke to us about a new .NET Avro implementation they've built and are using in preference to the Apache one: https://github.com/randomravings/avro-dotnet

I'm going to schedule some time to take a bit of a closer look (will probably make & experiment with some Kafka serdes that use it). Superficially this looks pretty good to me - interested to know what you think.

@dstelljes
Copy link

We (C.H. Robinson, another Confluent customer) also just open sourced an in-house Avro implementation: https://github.com/ch-robinson/dotnet-avro

If not totally relevant here, it might be useful for comparative purposes. We use a similar strategy, generating serde functions with expression trees.

@blachniet
Copy link
Author

Hey all,

I'm not surprised to see that others have either made forks or created their own ground-up alternatives. The official C# bindings went quite a while without much love. 😞

I imagine that whichever one Confluent chooses as their default, someone will create forks of Confluent.SchemaRegistry.Serdes that work with the other implementations.

I have not reviewed the others yet, so I cannot provide objective guidance as to which is better. The official bindings we're written a while ago, and therefore do not currently take advantage of some new language features and best practices. I hope to change some of that in the future, but that's where we stand now.

@dstelljes
Copy link

For our part, we'd happily switch back to the official C# implementation as soon as it's able to map to POCOs—having an API that "just works" was our primary motivation. Glad to hear that development is happening on the official implementation again 😃

@mkellogg
Copy link
Contributor

Would this include updating to the latest spec (1.9.*)? We'd love to be able to better support enum schema evolution with default token values (for when the token produced isn't known by the consumer).

Could some of that maybe even get merged into the Confluent fork? I realize this could be construed as wasted effort if the goal is to get back to the official apache.avro package. But, if that's a longer term goal, maybe getting some of these features in the short term might be worth it?

@mhowlett
Copy link
Contributor

From @confluentinc 's point of view, we'd like to move away from our fork and reference the official apache packages. @blachniet - I think we're really close on that, what's status on the deltas? Maybe we can target v1.3 of Confluent.Kafka for this move? It would be fitting as there's a number of other avro fixes in it.

I've now been made aware of 4 separate implementations, or significant changes to the apache distro (two listed in this thread). Like @blachniet, I haven't reviewed these and can't objectively comment on overall quality, though I expect there's some good work here. We're wary of making one of these our official dependency because the long term maintenance story isn't as good. We hope this work can be utilized somehow however. One way is linking to alternate 'unofficial' serdes from our repo (even making these ourselves). Another is monitoring the situation and help get some of this work adopted into the apache distro.

We currently don't have the resources to adopt and start maintaining an avro implementation (though it's not out of the question in the future).

One final note: Something that is likely in the not too distant future is first class support for other serialization formats across Confluent Platform that have better support in .NET. Avro is going to continue to be an important serialization format in CP however as it's well matched to the use case.

@blachniet
Copy link
Author

I've updated my checklist to indicate that we have addressed confluentinc/avro#7 in the v1.9.1 release.

I still need to determine whether our changes in AVRO-2522 resolved the issues y'all saw in confluentinc/avro#11. I'll try to determine that today.

@blachniet
Copy link
Author

Quick update: AVRO-2522 did not resolve the issues seen in confluentinc/avro#11. I'm trying to address this in AVRO-2606.

@mhowlett
Copy link
Contributor

thanks for your work on this @blachniet !

@YairHalberstadt
Copy link
Contributor

It looks like AVRO-2606 has been resolved.
Are you interested in progressing this?

@mhowlett
Copy link
Contributor

mhowlett commented Jan 7, 2020

we will switch to the official avro dependency the next time they do a release.

@narmontas
Copy link

Just a reminder that apache/avro released 1.9.2 :)

@mhowlett
Copy link
Contributor

I just opened the PR: #1187 , it'll be in 1.4. strictly speaking this is a breaking change because support for net452 has been dropped, however we debated it an decided the change should go in v1.4 rather than waiting for v2.0.

@mhowlett
Copy link
Contributor

merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants