Query wrong result 18.14.8(9) (CH caches ? subquery result) #3410

Closed
den-crane opened this issue Oct 17, 2018 · 16 comments
Labels
bug Confirmed user-visible misbehaviour in official release

Comments

@den-crane
Contributor

den-crane commented Oct 17, 2018

Preparing:

CREATE TABLE dt(tkey Int32) ENGINE = MergeTree order by tuple();
insert into dt values (300000);
CREATE TABLE testx(t Int32, a UInt8) ENGINE = MergeTree ORDER BY tuple();
INSERT INTO testx VALUES (100000, 0);
select count(*) from testx where not a and t < (select tkey from dt);
1
drop table dt;
CREATE TABLE dt(tkey Int32) ENGINE = MergeTree order by tuple();
insert into dt values (0);

BUG:
select count(*) from dw.testx where not a and t < (select tkey from dt);
1
select tkey from dt
0
select count(*) from dw.testx where not a and t < 0
0

So CH does not see that the value in table dt has changed from 300000 to 0.

optimize_move_to_prewhere 0
enable_optimize_predicate_expression 0
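
For anyone rerunning the steps above, the two settings can be applied with SET before the queries. This is a minimal sketch, assuming session-level settings are enough; the report does not say whether they were set in the session or in a profile, and it uses testx without the dw. prefix.

```sql
-- Assumption: both settings are applied per session rather than in a profile.
SET optimize_move_to_prewhere = 0;
SET enable_optimize_predicate_expression = 0;

-- Rerun the query from the BUG section (table name without the dw. prefix).
SELECT count(*) FROM testx WHERE NOT a AND t < (SELECT tkey FROM dt);
```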

@den-crane den-crane changed the title Query wrong result 18.14.8 (CH cashes ? subquery result) Query wrong result 18.14.8(9) (CH cashes ? subquery result) Oct 17, 2018
@den-crane
Contributor Author

den-crane commented Oct 18, 2018

Workaround: move all other conditions to PREWHERE and leave the condition with the subquery in the WHERE section.

select count(*) from testx
prewhere not a
where t < (select tkey from dt);

@den-crane
Contributor Author

select count(*) from dw.testx where not a and t < (select tkey from dt);
┌─count()─┐
│       1 │
└─────────┘
1 rows in set. Elapsed: 0.002 sec.

set compile_expressions = 0;

select count(*) from dw.testx where not a and t < (select tkey from dt);
0 rows in set. Elapsed: 0.002 sec.

@den-crane
Contributor Author

In our system we don't drop/create dt.
It's a ReplacingMergeTree table and its data is different for every query.
The real condition is also more complicated:
time_key < (select time_key from dim_time where time_stamp = (select max(last_record_time) from data_status_current))
dim_time is constant; data_status_current has a different max value every time.

@blinkov blinkov assigned blinkov and alesapin and unassigned blinkov Oct 19, 2018
@blinkov blinkov added bug Confirmed user-visible misbehaviour in official release issue labels Oct 19, 2018
@alesapin
Member

Den, I can't reproduce the problem from your example:
My steps:

:) CREATE TABLE dt(tkey Int32) ENGINE = MergeTree order by tuple();

Ok.

0 rows in set. Elapsed: 0.002 sec. 
:)  insert into dt values (300000);

Ok.

1 rows in set. Elapsed: 0.002 sec.
:)  CREATE TABLE testx(t Int32, a UInt8) ENGINE = MergeTree ORDER BY tuple();

Ok.

0 rows in set. Elapsed: 0.002 sec. 
:)  INSERT INTO testx VALUES (100000, 0);

Ok.

1 rows in set. Elapsed: 0.002 sec.
:)  select count(*) from testx where not a and t < (select tkey from dt);

┌─count()─┐
│       1 │
└─────────┘

1 rows in set. Elapsed: 0.004 sec. 

:) drop table dt;

Ok.

0 rows in set. Elapsed: 0.001 sec. 

:)  CREATE TABLE dt(tkey Int32) ENGINE = MergeTree order by tuple();

Ok.

0 rows in set. Elapsed: 0.003 sec.

:)  insert into dt values (0);

Ok.

1 rows in set. Elapsed: 0.002 sec. 
:)  select count(*) from testx where not a and t < (select tkey from dt);

┌─count()─┐
│       0 │
└─────────┘

1 rows in set. Elapsed: 0.004 sec. 

In the last select query you use the dw.testx table instead of testx; maybe this is the cause of the wrong answer?

About expressions compilation:

  1. Expressions are compiled only after 3 identical expressions occur in queries. If you ran your select fewer than three times, then it's not a compile_expressions problem.
  2. There is a special metric in the system.asynchronous_metrics table called CompiledExpressionCacheCount. Even after I ran this expression three times, the metric doesn't increment. This means the expression is too simple to be compiled.
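
The check in point 2 can be sketched as a query on the metrics table. This is a minimal example, assuming the metric names start with CompiledExpressionCache; run it before and after repeating the select and compare the values.

```sql
-- Assumption: the compiled-expression cache metrics share this name prefix.
SELECT metric, value
FROM system.asynchronous_metrics
WHERE metric LIKE 'CompiledExpressionCache%';
```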

Maybe you can provide a more complex example?

@den-crane
Contributor Author

select name, value from system.settings where changed;

┌─name──────────────────────────────────────┬─value────────────┐
│ connect_timeout_with_failover_ms          │ 1000             │
│ use_uncompressed_cache                    │ 1                │
│ optimize_move_to_prewhere                 │ 0                │
│ load_balancing                            │ nearest_hostname │
│ compile_expressions                       │ 0                │
│ distributed_aggregation_memory_efficient  │ 1                │
│ skip_unavailable_shards                   │ 1                │
│ http_connection_timeout                   │ 15               │
│ empty_result_for_aggregation_by_empty_set │ 1                │
│ max_bytes_before_external_group_by        │ 25374640128      │
│ max_memory_usage                          │ 50749280256      │
└───────────────────────────────────────────┴──────────────────┘

@den-crane
Contributor Author

It's tricky: it also reproduces with engine = Memory, and sometimes it caches a different value.

[Screenshot: screen shot 2018-10-19 at 4 50 28 pm]

@alesapin
Member

Can you provide the output of select * from system.asynchronous_metrics?

@den-crane
Contributor Author

[Screenshot: screen shot 2018-10-19 at 2 53 12 pm]

@den-crane
Contributor Author

SELECT *
FROM system.asynchronous_metrics

┌─metric──────────────────────────────────┬──────value─┐
│ jemalloc.background_thread.run_interval │          0 │
│ jemalloc.background_thread.num_runs     │          0 │
│ jemalloc.background_thread.num_threads  │          0 │
│ jemalloc.retained                       │  557051904 │
│ jemalloc.mapped                         │ 2941521920 │
│ jemalloc.resident                       │ 2835759104 │
│ jemalloc.metadata_thp                   │          0 │
│ jemalloc.metadata                       │   15238536 │
│ jemalloc.allocated                      │ 2706048264 │
│ UncompressedCacheCells                  │        110 │
│ MarkCacheFiles                          │        110 │
│ ReplicasSumInsertsInQueue               │          0 │
│ MarkCacheBytes                          │       1760 │
│ UncompressedCacheBytes                  │    1420133 │
│ CompiledExpressionCacheCount            │          4 │
│ Uptime                                  │        723 │
│ ReplicasMaxQueueSize                    │          0 │
│ ReplicasMaxInsertsInQueue               │          0 │
│ ReplicasMaxMergesInQueue                │          0 │
│ ReplicasMaxRelativeDelay                │          0 │
│ jemalloc.active                         │ 2719969280 │
│ CompiledExpressionCacheBytes            │       4560 │
│ ReplicasSumQueueSize                    │          0 │
│ ReplicasMaxAbsoluteDelay                │          0 │
│ ReplicasSumMergesInQueue                │          0 │
│ MaxPartCountForPartition                │         10 │
└─────────────────────────────────────────┴────────────┘

@alesapin
Member

Yes, reproduced with:

optimize_move_to_prewhere 0
enable_optimize_predicate_expression 0

@den-crane
Contributor Author

Yes, I think optimize_move_to_prewhere = 1 transforms it to:
select count(*) from testx prewhere t < (subquery) where not a

@alesapin
Member

alesapin commented Oct 19, 2018

Yes. This problem seems to be a very special case. The condition t < (subquery) is executed only when the subquery returns a single element.

@alesapin
Member

@den-crane
Contributor Author

den-crane commented Oct 23, 2018

t.me/clickhouse_ru/71285

ch0 :) select eventDate, count() from Events where eventDate = today()-1 group by eventDate settings compile_expressions = 0;

┌──eventDate─┬──count()─┐
│ 2018-10-22 │ 70056042 │
└────────────┴──────────┘

1 rows in set. Elapsed: 0.069 sec. Processed 70.06 million rows, 140.12 MB (1.02 billion rows/s., 2.04 GB/s.)

ch0 :) select eventDate, count() from Events where eventDate = today()-1 group by eventDate settings compile_expressions = 1;

┌──eventDate─┬─count()─┐
│ 2018-10-21 │     758 │
└────────────┴─────────┘

1 rows in set. Elapsed: 0.017 sec. Processed 70.06 million rows, 140.12 MB (4.13 billion rows/s., 8.27 GB/s.)

@alesapin
Member

We disabled compile_expressions by default in 14.10.

@alesapin
Member

Seems to be fixed in master (#3457), but we will test this feature more carefully.


3 participants