This repository has been archived by the owner on Dec 28, 2017. It is now read-only.

Integer overflow in some case #142

Open
Novemser opened this issue Nov 15, 2017 · 7 comments

@Novemser
Contributor

SQL:

select A.tp_bigint,B.id_dt from full_data_type_table A join full_data_type_table B on A.id_dt > B.id_dt * 16 where A.tp_bigint = B.id_dt order by A.id_dt

Throws:

Caused by: com.pingcap.tikv.exception.TiClientInternalException: Error reading region
  at com.pingcap.tikv.operation.SelectIterator.readNextRegion(SelectIterator.java:148)
  at com.pingcap.tikv.operation.SelectIterator.hasNext(SelectIterator.java:161)
  at org.apache.spark.sql.tispark.TiRDD$$anon$2.hasNext(TiRDD.scala:75)
  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
  at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
  at org.apache.spark.scheduler.Task.run(Task.scala:99)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: com.pingcap.tikv.exception.SelectException: unknown error Codec(Other(StringError("I64(4355836469450447576) * I64(16) overflow")))
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at com.pingcap.tikv.operation.SelectIterator.readNextRegion(SelectIterator.java:145)
  ... 13 more
Caused by: com.pingcap.tikv.exception.SelectException: unknown error Codec(Other(StringError("I64(4355836469450447576) * I64(16) overflow")))
  at com.pingcap.tikv.region.RegionStoreClient.coprocessorHelper(RegionStoreClient.java:192)
  at com.pingcap.tikv.region.RegionStoreClient.coprocess(RegionStoreClient.java:185)
  at com.pingcap.tikv.operation.SelectIterator.createClientAndSendReq(SelectIterator.java:130)
  at com.pingcap.tikv.operation.SelectIterator.lambda$submitTasks$2(SelectIterator.java:113)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  ... 3 more

Seems there's an overflow issue here.

Note that if we remove the * 16 from the SQL, the above exception is not thrown.
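For reference, the reported product really does exceed the signed 64-bit range; a quick Java check using the value from the error message:

```java
public class OverflowDemo {
    public static void main(String[] args) {
        long v = 4355836469450447576L; // value from the TiKV error above

        // Plain long multiplication silently wraps around:
        System.out.println(v * 16L); // a wrapped, incorrect value

        // Math.multiplyExact detects the overflow instead of wrapping:
        try {
            Math.multiplyExact(v, 16L);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```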

@Novemser Novemser changed the title from "Error comparing integer with bigint in some case" to "Integer overflow in some case" on Nov 15, 2017
@Novemser
Contributor Author

Spark plan:

   :- Project [id_dt#0L, tp_bigint#8L]
   :  +- Filter ((isnotnull(id_dt#0L) && (id_dt#0L > (tp_bigint#8L * 16))) && isnotnull(tp_bigint#8L))

tp_bigint#8L * 16 can overflow, but we didn't validate this filter before pushing it down to TiKV, which caused the above problem.
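A pre-pushdown validation could look roughly like this. This is only a sketch; PushdownGuard and its method names are hypothetical, not part of tispark:

```java
// Hypothetical pre-pushdown check: a "col * constant" filter is only
// pushed down to TiKV when the multiplication cannot overflow.
public final class PushdownGuard {
    // Safe for *every* possible BIGINT column value only when the constant
    // is 0 or 1 (-1 is excluded because Long.MIN_VALUE * -1 overflows).
    static boolean multiplyAlwaysSafe(long constant) {
        return constant == 0L || constant == 1L;
    }

    // Safe for one concrete value: use overflow-checked multiplication.
    static boolean multiplySafeFor(long value, long constant) {
        try {
            Math.multiplyExact(value, constant);
            return true;
        } catch (ArithmeticException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The filter from this issue (constant 16) would not be pushed down:
        System.out.println(multiplyAlwaysSafe(16L));                    // false
        System.out.println(multiplySafeFor(4355836469450447576L, 16L)); // false
    }
}
```

Filters that fail the check would be kept in Spark and evaluated there instead.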

@Novemser
Contributor Author

I think the Spark plan generated here may not be appropriate; a CheckOverflow should have been added to the above filter, as in the following plan:

   :- Project [id_dt#0L, tp_bigint#8L]
   :  +- Filter (((cast(id_dt#0L as decimal(24,2)) > CheckOverflow((cast(cast(tp_bigint#8L as decimal(20,0)) as decimal(22,2)) * 2.22), DecimalType(24,2))) && isnotnull(id_dt#0L)) && isnotnull(tp_bigint#8L))

Related SQL:

select A.tp_bigint,B.id_dt from full_data_type_table A join full_data_type_table B on (A.id_dt > B.id_dt * 12.6) where A.tp_bigint = B.id_dt order by A.id_dt

@birdstorm
Contributor

birdstorm commented Nov 15, 2017

tispark:

scala> testsql.explain
== Physical Plan ==
*Project [id_bigint#1L, id_int#26L]
+- *Sort [id_int#0L ASC NULLS FIRST], true, 0
   +- Exchange rangepartitioning(id_int#0L ASC NULLS FIRST, 200)
      +- *Project [id_bigint#1L, id_int#26L, id_int#0L]
         +- *SortMergeJoin [id_bigint#1L], [id_int#26L], Inner, (id_int#0L > (id_int#26L * 2))
            :- *Sort [id_bigint#1L ASC NULLS FIRST], false, 0
            :  +- Exchange hashpartitioning(id_bigint#1L, 200)
            :     +- TiDB CoprocessorRDD{
 Table: a
 Ranges: Start:[-9223372036854775808], End: [9223372036854775807]
 Columns: [id_int], [id_bigint]
 Filter: Not(IsNull([id_int])), Not(IsNull([id_bigint])), ([id_int] > ([id_bigint] Multiply 2))
}
            +- *Sort [id_int#26L ASC NULLS FIRST], false, 0
               +- Exchange hashpartitioning(id_int#26L, 200)
                  +- TiDB CoprocessorRDD{
 Table: a
 Ranges: Start:[-9223372036854775808], End: [9223372036854775807]
 Columns: [id_int]
 Filter: Not(IsNull([id_int]))
}

spark:

scala> testsql.explain
== Physical Plan ==
*Project [id_bigint#1L, id_int#50]
+- *Sort [id_int#0 ASC NULLS FIRST], true, 0
   +- Exchange rangepartitioning(id_int#0 ASC NULLS FIRST, 200)
      +- *Project [id_bigint#1L, id_int#50, id_int#0]
         +- *SortMergeJoin [id_bigint#1L], [cast(id_int#50 as bigint)], Inner, (id_int#0 > (id_int#50 * 2))
            :- *Sort [id_bigint#1L ASC NULLS FIRST], false, 0
            :  +- Exchange hashpartitioning(id_bigint#1L, 200)
            :     +- *Scan JDBCRelation(a) [numPartitions=1] [id_int#0,id_bigint#1L] PushedFilters: [*IsNotNull(id_int), *IsNotNull(id_bigint)], ReadSchema: struct<id_int:int,id_bigint:bigint>
            +- *Sort [cast(id_int#50 as bigint) ASC NULLS FIRST], false, 0
               +- Exchange hashpartitioning(cast(id_int#50 as bigint), 200)
                  +- *Scan JDBCRelation(a) [numPartitions=1] [id_int#50] PushedFilters: [*IsNotNull(id_int)], ReadSchema: struct<id_int:int>

We missed the cast(id_int#50 as bigint) inside SortMergeJoin; the missing piece is the cast, not CheckOverflow(). @Novemser
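For what it's worth, the cast matters because multiplying in the narrower type wraps before any widening happens. A small Java illustration (the column value is made up):

```java
public class CastBeforeMultiply {
    public static void main(String[] args) {
        int id = 1_500_000_000; // a made-up INT column value near INT's max

        // Multiplying first: the product is computed in 32 bits, wraps,
        // and only then is widened to long -- the damage is already done.
        long wrapped = id * 2;

        // Casting first (what Spark's cast(id_int as bigint) provides):
        // the product is computed in 64 bits and is correct.
        long widened = (long) id * 2;

        System.out.println(wrapped); // a negative, wrapped value
        System.out.println(widened); // 3000000000
    }
}
```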

@ilovesoup
Contributor

Pushing it back to Spark might solve the problem. Or promote it to a larger type and push that down. But this implicit conversion is likely not supported by the old TiKV interface. In any case, we need a check before pushing down, and a fallback for predicates that are not valid. We talked this through this afternoon. @birdstorm
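The "promote it to a larger type" idea can be sketched as follows (illustrative names, not tispark API): evaluate the product in an arbitrary-precision type, then check whether the result still fits in a signed 64-bit BIGINT before using it, falling back to Spark otherwise.

```java
import java.math.BigInteger;

// Sketch only: do the multiply in arbitrary precision, then check whether
// the result is still representable as a signed 64-bit BIGINT.
public final class WideningMultiply {
    static BigInteger multiplyWide(long a, long b) {
        return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
    }

    // True iff v is representable as a signed 64-bit long.
    // BigInteger.bitLength() excludes the sign bit, so <= 63 covers
    // the full range Long.MIN_VALUE..Long.MAX_VALUE.
    static boolean fitsInBigint(BigInteger v) {
        return v.bitLength() <= 63;
    }

    public static void main(String[] args) {
        BigInteger p = multiplyWide(4355836469450447576L, 16L);
        System.out.println(p + " fits in BIGINT: " + fitsInBigint(p));
    }
}
```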

@ilovesoup ilovesoup added the P1 label Nov 21, 2017
@ilovesoup
Contributor

This needs to be fixed after the DAG interface is in place.

@Novemser
Contributor Author

Novemser commented Dec 1, 2017

Another case:

select A.id_dt,A.tp_bigint,B.id_dt from full_data_type_table A join full_data_type_table B on A.id_dt > B.id_dt * 16 where A.tp_bigint = B.id_dt order by A.id_dt, B.id_dt 

Exception:

Caused by: com.pingcap.tikv.exception.SelectException: unknown error Overflow
	at com.pingcap.tikv.region.RegionStoreClient.coprocessorHelper(RegionStoreClient.java:266)

@Novemser Novemser self-assigned this Dec 8, 2017
@Novemser
Contributor Author

Novemser commented Dec 8, 2017

This issue is caused by a bigint overflow in TiKV's computation stage. To prevent it, we could keep bigint calculations in Spark instead of pushing them down to TiKV.

However, the same issue occurs in TiDB and MySQL:
SQL:

select tp_int from full_data_type_table where tp_bigint * 20 > 0

TiDB:

ERROR 1105 (HY000): other error: unknown error Overflow

MySQL:

ERROR 1690 (22003): BIGINT value is out of range in '(`tispark_test`.`full_data_type_table`.`tp_bigint` * 20)'

It seems that neither of them has a fallback path to handle this scenario.

But Spark with JDBC does not push down filters involving calculations that could overflow.
Like this:

== Physical Plan ==
*Project [tp_int#84]
+- *Filter ((tp_bigint#80L * 20) > 0)
   +- *Scan JDBCRelation(tispark_test.full_data_type_table) [numPartitions=1] [tp_int#84,tp_bigint#80L] PushedFilters: [*IsNotNull(tp_bigint)], ReadSchema: struct<tp_int:int>

So here's the question: should we make our behavior consistent with TiDB/MySQL, or with Spark over JDBC? 🤥
