vendor+sql: update apd, change str representation of DECIMAL #17029

maddyblue · 2017-07-14T20:35:03Z

apd has had various improvements and API changes:

ToStandard is now Text('f')
ToIntegral renamed
Modf can now take nil, easing the implementation of trunc

Additionally, change our default decimal -> string implementation to
use the scientific notation (use exponents when needed) instead of the
so-called standard (a name I made up) representation. The 'standard'
representation was used to match Postgres. It simply never prints
exponents. This causes two problems: large exponents are converted
into lots of zeros, and it is no longer possible to extract the
actual precision from the number if there are zeros appended to the
right, because it's not clear if those were in the original value or
were appended during the string conversion. The change to scientific
notation fixes both of these problems, allowing for smaller numbers
to be printed and allowing users to always know exactly how many
zeros were in their values.

There is a possibility that this change causes problems to users of
decimals if their string -> decimal parser doesn't support exponents. I
think we should allow users to report that problem and attempt to
deal with it if it comes up. Mostly I think it'll be just fine,
because pretty much everything that deals with decimals should know
about exponents.

This change allows for the exception in copy_in_test.go to be removed,
since trailing zeros are handled correctly now.

This also allows us to easily in the future allow session settings to
control decimal formatting, since all decimal formatting operations
now go through the Go fmt.Formatter interface, and thus accept the
standard e, f, and g verbs.

cockroach-teamcity · 2017-07-14T20:35:08Z

This change is

knz · 2017-07-14T20:46:04Z

@jordanlewis plz confirm this is unlikely to cause orm breakage

knz · 2017-07-14T20:46:10Z

code LGTM

jordanlewis · 2017-07-14T21:24:40Z

Hmm, this patch makes me nervous, even though I agree with all of your rationale for making this the new default. This is exactly the kind of change that strikes me as very likely to break clients.

Even though the new output format is valid as an input format for CockroachDB and Postgres, I think that there are almost certainly clients that expect the output format to be the same as Postgres's, and this will upset them.

What about the pgwire binary format? Did that change, or is that the same as before?

At the bare minimum we need to add tests for this to the various client tests (java_test, c_test etc).

RaduBerinde · 2017-07-14T23:49:52Z

I'm particularly worried about cases where people use decimals with things that are always just (potentially large) integers. Whatever is consuming those results might not like 1E+1 instead of 10.

Review status: 0 of 13 files reviewed at latest revision, all discussions resolved, all commit checks successful.

Comments from Reviewable

maddyblue · 2017-07-15T02:20:19Z

I've added a test for 1e1 to java_test.go, and it passes. This did not change the pgwire binary representation, just text.

I am sympathetic to these possible problems, but I think we should move forward just like we have with the other differences we have to postgres. If it causes too many client issues we can change it back or add a session setting.

Review status: 0 of 14 files reviewed at latest revision, all discussions resolved.

Comments from Reviewable

knz · 2017-07-15T15:25:21Z

haha, you got me there. You're right. I checked with both MySQL and Postgres and both allow the scientific notation for decimal/numeric. Let's put this through, and see if users complain.

LGTM

jordanlewis · 2017-07-15T16:35:56Z

Wait, I am still uncomfortable. I think it is a mistake to land this in 1.1 without more significant testing on our side. Thanks for adding the Java test. We need to check the C libpq, js Ruby and python too. Sorry for the high burden of proof but I strongly believe the onus falls on us to validate this, not our users.

maddyblue · 2017-07-16T00:47:55Z

How would you suggest I do the C version? Is there a standard bigdecimal library to use? There's no example code in c_test.go to use as a template like there was for java.

apd has had various improvements and API changes: - ToStandard is now Text('f') - ToIntegral renamed - Modf can now take nil, easing the implementation of trunc Additionally, change our default decimal -> string implementation to use the scientific notation (use exponents when needed) instead of the so-called standard (a name I made up) representation. The 'standard' representation was used to match Postgres. It simply never prints exponents. This causes two problems: large exponents are converted into lots of zeros, and it is no longer possible to extract the actual precision from the number if there are zeros appended to the right, because it's not clear if those were in the original value or were appended during the string conversion. The change to scientific notation fixes both of these problems, allowing for smaller numbers to be printed and allowing users to always know exactly how many zeros were in their values. There is a possibility that this change causes problems to users of decimals if their string -> decimal parser doesn't support exponents. I think we should allow users to report that problem and attempt to deal with it if it comes up. Mostly I think it'll be just fine, because pretty much everything that deals with decimals should know about exponents. This change allows for the exception in copy_in_test.go to be removed, since trailing zeros are handled correctly now. This also allows us to easily in the future allow session settings to control decimal formatting, since all decimal formatting operations now go through the Go fmt.Formatter interface, and thus accept the standard e, f, and g verbs.

maddyblue · 2017-07-17T02:05:18Z

I've added a nice python test. I also added a less nice ruby test. The ruby test is totally boring because, by default, the pg package just returns everything as a string. You have to manually instruct it to use a type decoder. If we do that, then it barfs on the OID of numeric (so, nothing to do with this change). Trivial to get it working by assigning the decoder to be something that can parse those strings, so I don't think it's a useful test. A useful test would be to hook up a full rails thing, which might have these decoders defined, and see what that does. (If someone wants that, I'll need help doing it since it's out of my familiarity.)

Adding a C test is similar to the ruby one (i.e., not useful): you just have to choose a decimal parsing library that knows how to parse exponents, there's no built in thing (see https://stackoverflow.com/questions/32651069/c-libpq-get-float-value-from-numeric).

knz · 2017-07-17T13:34:25Z

For C the most common is GMP/MPFR can you check (eg via their docs) whether they support the scientific syntax? -- Sent from my Android device with K-9 Mail. Please excuse my brevity.

jordanlewis · 2017-07-17T13:41:34Z

Thanks for adding the tests - I'm starting to feel better about this change.

I have a few more questions. I think that it might be user unfriendly to always print decimals in scientific notation even in the "normal" case of a number with no trailing zeros. Thoughts on that? Maybe we could output an ordinary value in the "normal" case? As a user, I think I would be confused and maybe even annoyed to see SELECT 100::decimal get outputted as 1E+2.

Also, can you elaborate on the problem with ambiguity around trailing zeroes? I don't understand where string conversion would introduce that ambiguity.

maddyblue · 2017-07-17T16:58:38Z

The exponents are not always printed, only when needed. See https://github.com/cockroachdb/apd/blob/master/testdata/base.decTest for the test cases. Also, 100 would not be printed as 1e2 because that doesn't show the full precision. All the string printing algorithm can do is choose to move the decimal within existing digits and adjust the exponent appropriately. But it only does this in some cases (https://github.com/cockroachdb/apd/blob/f03b0011b0efc3437be20fa3979ae89082de8861/format.go#L67). So it already does what you request, and outputs an ordinary value in the normal case.

To illustrate the problem, take the numbers you had above: 100, 1e2. Both are equal to 100, but the first one is more precise. The second is less precise, and using the compare_total algorithm (https://godoc.org/github.com/cockroachdb/apd#Decimal.CmpTotal), which is defined in the decimal spec, is not equal to 100, because it doesn't have exactly the same precision.

It's like saying: "I'm 1 mile away" and "I'm 1.00000" miles away. If users provide that precision, it should be kept and correctly displayed.

maddyblue · 2017-07-17T17:23:22Z

GMP supports exponents (https://gmplib.org/manual/Assigning-Floats.html#Assigning-Floats) in their mpf_set_str function.

knz · 2017-07-17T17:37:23Z

thanks.

jordanlewis · 2017-07-17T17:42:43Z

Got it - thanks for the tutorial. LGTM in that case.

I wonder if it's worth introducing a session setting for this kind of thing at some point down the line, to permit customizable output.

maddyblue · 2017-07-17T17:54:45Z

Yes, I so want decimal session settings. To control: formatting, rounding, context precision during arithmetic operations. Seems like that + distsql would make cockroach a really nice platform to do distributed scientific computing or something.

jseldess · 2017-07-19T10:00:00Z

@mjibson, can you help me understand any updates needed to the user-facing DECIMAL type docs?

maddyblue · 2017-07-19T18:03:38Z

@jess-edwards No docs need to be updated, since they don't list the text output format.

maddyblue requested review from knz and RaduBerinde July 14, 2017 20:35

knz changed the title ~~vendor: update apd~~ vendor+sql: update apd, change str representation of DECIMAL Jul 14, 2017

knz requested a review from jordanlewis July 14, 2017 20:45

maddyblue merged commit 1e87710 into cockroachdb:master Jul 17, 2017

maddyblue deleted the update-apd branch July 17, 2017 17:55

rafiss mentioned this pull request Apr 22, 2021

sql/pgwire: fix encoding of decimals with trailing 0s #64022

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vendor+sql: update apd, change str representation of DECIMAL #17029

vendor+sql: update apd, change str representation of DECIMAL #17029

maddyblue commented Jul 14, 2017

cockroach-teamcity commented Jul 14, 2017

knz commented Jul 14, 2017

knz commented Jul 14, 2017

jordanlewis commented Jul 14, 2017

RaduBerinde commented Jul 14, 2017

maddyblue commented Jul 15, 2017

knz commented Jul 15, 2017

jordanlewis commented Jul 15, 2017

maddyblue commented Jul 16, 2017

maddyblue commented Jul 17, 2017

knz commented Jul 17, 2017 via email

jordanlewis commented Jul 17, 2017

maddyblue commented Jul 17, 2017

maddyblue commented Jul 17, 2017

knz commented Jul 17, 2017

jordanlewis commented Jul 17, 2017

maddyblue commented Jul 17, 2017

jseldess commented Jul 19, 2017

maddyblue commented Jul 19, 2017

vendor+sql: update apd, change str representation of DECIMAL #17029

vendor+sql: update apd, change str representation of DECIMAL #17029

Conversation

maddyblue commented Jul 14, 2017

cockroach-teamcity commented Jul 14, 2017

knz commented Jul 14, 2017

knz commented Jul 14, 2017

jordanlewis commented Jul 14, 2017

RaduBerinde commented Jul 14, 2017

maddyblue commented Jul 15, 2017

knz commented Jul 15, 2017

jordanlewis commented Jul 15, 2017

maddyblue commented Jul 16, 2017

maddyblue commented Jul 17, 2017

knz commented Jul 17, 2017 via email

jordanlewis commented Jul 17, 2017

maddyblue commented Jul 17, 2017

maddyblue commented Jul 17, 2017

knz commented Jul 17, 2017

jordanlewis commented Jul 17, 2017

maddyblue commented Jul 17, 2017

jseldess commented Jul 19, 2017

maddyblue commented Jul 19, 2017