Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdc: Avro number roundtrip error on changefeed creation #38058

Closed
tim-o opened this issue Jun 6, 2019 · 10 comments · Fixed by #59075
Closed

cdc: Avro number roundtrip error on changefeed creation #38058

tim-o opened this issue Jun 6, 2019 · 10 comments · Fixed by #59075
Assignees
Labels
A-cdc Change Data Capture T-cdc

Comments

@tim-o
Copy link
Contributor

tim-o commented Jun 6, 2019

This Github issue is synchronized with Zendesk:

Ticket ID: #3388
Group: Support
Requester: [email protected]
Organization: Kindred Group
Assignee: Ricardo Rocha
Issue escalated by: Ricardo Rocha

Original ticket description:

Hi, I am trying to enable a changefeed from CRDB to Kafka but I get an error I do not understand. The exact same command works for a less complex table with less data.

Actual SQL:
CREATE CHANGEFEED FOR TABLE tx_engine_si1.tx INTO 'kafka://kafka-0.pt1.d2-dev.aws.kindredgroup.com:20000' WITH confluent_schema_registry = 'http://pt1-schema-registry.d2-dev.aws.kindredgroup.com:80', format = 'experimental_avro'

Result: Failed, with error as follows:
internal error: uncaught error: 1E+2 will not roundtrip at scale 2
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.decimalToRat
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/avro.go:593
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.columnDescToAvroSchema.func19
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/avro.go:263
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.columnDescToAvroSchema.func27
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/avro.go:320
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*avroDataRecord).nativeFromRow
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/avro.go:458
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*avroEnvelopeRecord).BinaryFromRow
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/avro.go:572
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*confluentAvroEncoder).EncodeValue
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/encoder.go:358
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.emitEntries.func1
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/changefeed.go:188
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.emitEntries.func2
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/changefeed.go:236
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*changeAggregator).tick
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/changefeed_processors.go:270
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.(*changeAggregator).Next
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/changefeed_processors.go:252
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.Run
/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/base.go:174
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.(*ProcessorBase).Run
/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:801
github.com/cockroachdb/cockroach/pkg/sql/distsqlrun.(*Flow).startInternal.func1
/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/flow.go:566
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1333

@ricardocrdb ricardocrdb self-assigned this Jun 6, 2019
@ricardocrdb
Copy link

Closing due to inactivity. If you are still having the issue, please feel free to respond to this thread. We want to help!

@hougaardj
Copy link

We have run into the same error when we tried to use the TPCC benchmark to test CDC performance. The changefeed job error'ed out with: 6E+2 will not roundtrip at scale 2.

Init the benchmark:

cockroach workload init tpcc "<cnx string>" --db tpcc --drop --warehouses 300 --split --data-loader IMPORT

Create the change feed:

use tpcc;
CREATE CHANGEFEED FOR TABLE customer,district,history,item,new_order,"order",order_line,stock,warehouse INTO '<kafka url?' WITH format = experimental_avro, confluent_schema_registry = '<schema_url>';

The error seems to arise from: https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/changefeedccl/avro.go#L676

@shermanCRL
Copy link
Contributor

@HonoreDB Would you characterize the difficulty here? Is this a resolvable bug, is there a workaround, or is this a limitation of Avro that can’t be fixed?

@shermanCRL shermanCRL added A-cdc Change Data Capture T-cdc labels Jan 15, 2021
@shermanCRL shermanCRL changed the title Error when creating changefeed cdc: Avro number roundtrip error on changefeed creation Jan 15, 2021
@shermanCRL
Copy link
Contributor

shermanCRL commented Jan 15, 2021

See also #38013 and #37508

@shermanCRL
Copy link
Contributor

We might also consider upgrading the Avro SDK.

@hougaardj
Copy link

It could look like the apd.Decimal, given to decimalToRat(), is just not normalized such that scale and -dec.Exponent do not match.

@HonoreDB
Copy link
Contributor

I like #38013 as a solution here. It's debatable whether this is a bug; I think what's going on is that the Avro type is fixed precision so it can't capture the distinction between 600 and 600.00, so our encoder errors out rather than lose that tiny amount of information. Reasonable people might disagree about whether that's a good idea, but either way falling back to a string encoding would fix more clearly legit issues.

@shermanCRL
Copy link
Contributor

Sounds like the error is a correct, conservative thing to do – we probably don’t want lossy transformations to happen quietly. We might not know a priori if the loss is ’trivial’ from the user’s perspective.

Have we considered a WITH flag like allow_precision_reduction or some such thing? That might be the more practical answer. cc @amruss

@hougaardj
Copy link

6E+2 is also coming from a fixed precision SQL type of the form DECIMAL(n, m), where m=2. However the apd.Decimal does not seem to have been normalized such that the exponent exactly match -m. If Cockroach had loaded the DECIMAL value and scaled it to match -m, then 6E+2 would have been 60000E-2 and decimalToRat() would not have failed.

@hougaardj
Copy link

All your tests also assume that the apd.Decimal is normalized to match the scale

func TestDecimalRatRoundtrip(t *testing.T) {
	defer leaktest.AfterTest(t)()
	defer log.Scope(t).Close(t)

	t.Run(`table`, func(t *testing.T) {
		tests := []struct {
			scale int32
			dec   *apd.Decimal
		}{
			{0, apd.New(0, 0)},
			{0, apd.New(1, 0)},
			{0, apd.New(-1, 0)},
			{0, apd.New(123, 0)},
			{1, apd.New(0, -1)},
			{1, apd.New(1, -1)},
			{1, apd.New(123, -1)},
			{5, apd.New(1, -5)},
		}
		for d, test := range tests {
			rat, err := decimalToRat(*test.dec, test.scale)
			require.NoError(t, err)
			roundtrip := ratToDecimal(rat, test.scale)
			if test.dec.CmpTotal(&roundtrip) != 0 {
				t.Errorf(`%d: %s != %s`, d, test.dec, &roundtrip)
			}
		}
	})

@craig craig bot closed this as completed in b5f4391 Jan 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cdc Change Data Capture T-cdc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants