Hello,
First of all, thanks for the great library. We've been using it for a number of years now on a large-scale project, and it's been working great.
Recently I ported another service, which encodes/decodes Avro messages, from Scala to Go using hamba/avro, and noticed a difference in the OCF files produced by these two logically similar services. Even though the OCF files are valid and contain the same data, the consumer of the OCF (in our case, BigQuery) was not always able to read the files produced by the Go service, failing with an error that explains nothing.
After a long investigation and debugging, it turned out that the difference was in the way the two services encode arrays and maps.
According to the Avro specification, encoded array and map blocks are prefixed with either one or two long values: the block count (mandatory) and the block size in bytes (optional). The official Java implementation of the Avro library writes a single long value (the block count), while hamba/avro always writes both (a negative block count followed by the block size in bytes). Adjusting hamba/avro to write only the block count fixed the issue.
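For illustration only, here is a minimal Go sketch of the two block-header layouts the specification allows. The writeLong helper and the literal item bytes are my own for the example, not code from hamba/avro:

```go
package main

import (
	"bytes"
	"fmt"
)

// writeLong encodes an int64 using Avro's zig-zag + variable-length long format.
func writeLong(buf *bytes.Buffer, v int64) {
	u := uint64((v << 1) ^ (v >> 63)) // zig-zag encode
	for u >= 0x80 {
		buf.WriteByte(byte(u) | 0x80)
		u >>= 7
	}
	buf.WriteByte(byte(u))
}

func main() {
	// Three already-encoded long items (1, 2, 300) -- 4 bytes in total.
	items := []byte{0x02, 0x04, 0xd8, 0x04}

	// Form 1: positive block count only (what the official Java encoder emits).
	var countOnly bytes.Buffer
	writeLong(&countOnly, 3) // block count
	countOnly.Write(items)
	writeLong(&countOnly, 0) // end-of-array marker

	// Form 2: negative block count followed by the block size in bytes
	// (what hamba/avro currently emits).
	var countAndSize bytes.Buffer
	writeLong(&countAndSize, -3)                // negative count signals that a byte size follows
	writeLong(&countAndSize, int64(len(items))) // block size in bytes
	countAndSize.Write(items)
	writeLong(&countAndSize, 0)

	fmt.Printf("count only:   % x\n", countOnly.Bytes())
	fmt.Printf("count + size: % x\n", countAndSize.Bytes())
}
```

Both forms decode to the same array; the difference is only whether the optional byte-size long is present after the count, which is apparently what trips up BigQuery's decoder.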
While this is not a bug report, and the problem lies solely with whatever decoder BigQuery uses (we are reporting it to them right now), I think it would be nice for hamba/avro to support both encodings.
This could be achieved with one more configuration option (defaulting to the current behavior); a rough sketch is below. The change is trivial and can be found in my fork along with docs and tests; if this proposal is considered legitimate, I can create a PR.
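To make the shape of the proposal concrete, here is how such an option might be exposed. The field name is purely hypothetical and does not exist in hamba/avro today; the actual naming and implementation are in the fork:

```go
package main

import "github.com/hamba/avro/v2"

func main() {
	// Hypothetical: a config flag that makes the encoder write count-only
	// array/map block headers, matching the official Java implementation.
	// The commented-out field name is illustrative only.
	api := avro.Config{
		// DisableBlockSizeHeader: true,
	}.Freeze()
	_ = api
}
```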
Cheers!