tiflash crash with 'Received signal Bus error' during query #8674

Closed
aytrack opened this issue Jan 8, 2024 · 4 comments · Fixed by #8767
Labels: affects-5.4, affects-6.1, affects-6.5, affects-7.1, affects-7.5, component/compute, report/customer, severity/major, type/bug

Comments

aytrack commented Jan 8, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

create table t01 (
  `COL1` enum('^YSQT0]V@9TFN>^WB6G?NG@S8>VYOM;BSC@<BCQ6'),
  `COL2` mediumint(41) DEFAULT NULL,
  `COL3` year(4) DEFAULT NULL,
  KEY `U_M_COL4` (`COL1`,`COL2`),
  KEY `U_M_COL5` (`COL3`,`COL2`)
);
alter table t01 set tiflash replica 1;
insert into t01 values ('^YSQT0]V@9TFN>^WB6G?NG@S8>VYOM;BSC@<BCQ6', -1881752, 1986);
SELECT * FROM t01 where col2 = -1881752 and col2 * -1881752 != 8366212;

2. What did you expect to see? (Required)

The query succeeds.
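
For reference, the single inserted row appears to satisfy both predicates of the WHERE clause, so a successful query would presumably return that row. A minimal Python sketch of the arithmetic follows; the variable names are illustrative and not taken from TiDB or TiFlash, and Python's arbitrary-precision integers say nothing about how TiDB evaluates or type-checks the multiplication.

```python
# Value inserted into COL2 in the reproduction steps.
col2 = -1881752

# Second predicate of the WHERE clause: col2 * -1881752 != 8366212.
product = col2 * -1881752

# Both predicates combined, as in the reported query.
matches = (col2 == -1881752) and (product != 8366212)

print(product)  # 3540990589504
print(matches)  # True
```

The product is far larger than 8366212, so the inequality holds for the inserted row.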

3. What did you see instead (Required)

query_id:4, server_id:289, start_ts:446876954885881857,task_id:1>"] [thread_id=66]
[2024/01/08 15:51:37.895 +08:00] [INFO] [MPPTask.cpp:542] ["task starts running, time cost in schedule: 0 ms, time cost in preprocess: 2 ms"] [source="MPP<gather_id:1, query_ts:1704700297887677000, local_query_id:4, server_id:289, start_ts:446876954885881857,task_id:1>"] [thread_id=66]
[2024/01/08 15:51:37.895 +08:00] [INFO] [SegmentReadTaskScheduler.cpp:66] ["Added, pool_id=3 block_slots=8 segment_count=1 pool_count=1 cost=1.000us do_add_cost=1.000us"] [source="MPP<gather_id:1, query_ts:1704700297887677000, local_query_id:4, server_id:289, start_ts:446876954885881857,task_id:1> table_id=109"] [thread_id=66]
[2024/01/08 15:51:37.898 +08:00] [ERROR] [BaseDaemon.cpp:367] [########################################] [source=BaseDaemon] [thread_id=67]
[2024/01/08 15:51:37.898 +08:00] [ERROR] [BaseDaemon.cpp:368] ["(from thread 34) Received signal Bus error: 10(10)."] [source=BaseDaemon] [thread_id=67]
[2024/01/08 15:51:37.898 +08:00] [ERROR] [BaseDaemon.cpp:427] ["Invalid address alignment."] [source=BaseDaemon] [thread_id=67]
[2024/01/08 15:51:37.901 +08:00] [ERROR] [BaseDaemon.cpp:560] ["\n     0x1a562f18c\t___simple_bprintf [libsystem_platform.dylib+6445871500]"] [source=BaseDaemon] [thread_id=67]
[15:51:37]TiDB root:test>  SELECT * FROM t01 where col2 = -1881752 and col2 * -1881752 != 8366212;
(1105, 'rpc error: code = Unavailable desc = error reading from server: EOF')

4. What is your TiFlash version? (Required)

[15:56:45]TiDB root:test> select type,version,git_hash  from information_schema.cluster_info;
+---------+-------------+------------------------------------------+
| type    | version     | git_hash                                 |
+---------+-------------+------------------------------------------+
| tidb    | 7.6.0-alpha | 9b0fd9ea299266da70456f6e6077ed14bd191cfc |
| pd      | 7.6.0-alpha | c3ad361486ef29fb1a9001f5374788ad741d5616 |
| tikv    | 7.6.0-alpha | e0d70726b332a33e503c3f2addc66b9794303aea |
| tiflash | 7.6.0-alpha | 383e1bd9d3a793f6adb84156639febc68739ea63 |
+---------+-------------+------------------------------------------+
Contributor

JaySon-Huang commented Jan 8, 2024

It looks like something is wrong with the Arrow codec:

[2024/01/08 23:29:01.290 +08:00] [ERROR] [BaseDaemon.cpp:367] [########################################] [source=BaseDaemon] [thread_id=532]
[2024/01/08 23:29:01.290 +08:00] [ERROR] [BaseDaemon.cpp:368] ["(from thread 416) Received signal Segmentation fault(11)."] [source=BaseDaemon] [thread_id=532]
[2024/01/08 23:29:01.290 +08:00] [ERROR] [BaseDaemon.cpp:398] ["Address: 0x7f833743b000"] [source=BaseDaemon] [thread_id=532]
[2024/01/08 23:29:01.290 +08:00] [ERROR] [BaseDaemon.cpp:404] ["Access: read."] [source=BaseDaemon] [thread_id=532]
[2024/01/08 23:29:01.290 +08:00] [ERROR] [BaseDaemon.cpp:410] ["Attempted access has violated the permissions assigned to the memory area."] [source=BaseDaemon] [thread_id=532]
[2024/01/08 23:29:04.383 +08:00] [ERROR] [BaseDaemon.cpp:560] ["
  0x56310990e441    faultSignalHandler(int, siginfo_t*, void*) [tiflash+124552257]
                    libs/libdaemon/src/BaseDaemon.cpp:211
  0x7f872a30dd90    <unknown symbol> [libc.so.6+347536]
  0x56310b656e75    memcpy [tiflash+155258485]
  0x563104333acb    DB::WriteBuffer::write(char const*, unsigned long) [tiflash+34527947]
                    dbms/src/IO/WriteBuffer.h:93
  0x56310ac960c1    DB::TiDBColumn::append(DB::TiDBEnum const&) [tiflash+145031361]
                    dbms/src/Flash/Coprocessor/TiDBColumn.cpp:116
  0x56310aafa935    void DB::flashEnumColToArrowCol<true>(DB::TiDBColumn&, DB::IColumn const*, unsigned long, unsigned long, DB::IDataType const*) [tiflash+143345973]
                    dbms/src/Flash/Coprocessor/ArrowColCodec.cpp:353
  0x56310aaf70b2    DB::flashColToArrowCol(DB::TiDBColumn&, DB::ColumnWithTypeAndName const&, tipb::FieldType const&, unsigned long, unsigned long) [tiflash+143331506]
                    dbms/src/Flash/Coprocessor/ArrowColCodec.cpp:485
  0x56310ac91c16    DB::TiDBChunk::buildDAGChunkFromBlock(DB::Block const&, std::__1::vector<tipb::FieldType, std::__1::allocator<tipb::FieldType>> const&, unsigned long, unsigned long) [tiflash+145013782]
                    dbms/src/Flash/Coprocessor/TiDBChunk.cpp:48
  0x56310ac854cc    DB::StreamingDAGResponseWriter<std::__1::shared_ptr<DB::AsyncMPPTunnelSetWriter>>::encodeThenWriteBlocks() [tiflash+144962764]
                    dbms/src/Flash/Coprocessor/StreamingDAGResponseWriter.cpp:128
  0x56310ae1598a    DB::ExchangeSenderSinkOp::writeImpl(DB::Block&&) [tiflash+146602378]
                    dbms/src/Operators/ExchangeSenderSinkOp.cpp:33
  0x56310ad51814    DB::SinkOp::write(DB::Block&&) [tiflash+145799188]
                    dbms/src/Operators/Operator.cpp:183
  0x56310ad4d9b9    DB::PipelineExec::executeImpl() [tiflash+145783225]
                    dbms/src/Flash/Pipeline/Exec/PipelineExec.cpp:125
  0x56310ad656f6    DB::PipelineTaskBase::runExecute() [tiflash+145880822]
                    dbms/src/Flash/Pipeline/Schedule/Tasks/PipelineTaskBase.h:72
  0x56310ad78a6b    DB::Task::execute() [tiflash+145959531]
                    dbms/src/Flash/Pipeline/Schedule/Tasks/Task.cpp:133
  0x56310ad7dc79    DB::TaskThreadPool<DB::CPUImpl>::handleTask(std::__1::unique_ptr<DB::Task, std::__1::default_delete<DB::Task>>&) [tiflash+145980537]
                    dbms/src/Flash/Pipeline/Schedule/ThreadPool/TaskThreadPoolImpl.h:32
  0x56310ad7d9a3    DB::TaskThreadPool<DB::CPUImpl>::doLoop(unsigned long) [tiflash+145979811]
                    dbms/src/Flash/Pipeline/Schedule/ThreadPool/TaskThreadPool.cpp:82
  0x56310ad7d43b    DB::TaskThreadPool<DB::CPUImpl>::loop(unsigned long) [tiflash+145978427]
                    dbms/src/Flash/Pipeline/Schedule/ThreadPool/TaskThreadPool.cpp:61
  0x56310ad7f166    void* std::__1::__thread_proxy[abi:v15001]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (DB::TaskThreadPool<DB::CPUImpl>::*)(unsigned long), DB::TaskThreadPool<DB::CPUImpl>*, unsigned long>>(void*) [tiflash+145985894]
                    /DATA/disk1/ra_common/tiflash-env-15/sysroot/bin/../include/c++/v1/__functional/invoke.h:359
  0x7f872a358802    start_thread [libc.so.6+653314]"] [source=BaseDaemon] [thread_id=532]

Contributor

yibin87 commented Feb 6, 2024

Reproduced once on the latest version. Trying to reproduce it again.

Contributor

yibin87 commented Feb 6, 2024

Reproduced with a release binary built with ThinLTO on; not reproducible with a debug binary built with ThinLTO off, which is a little strange.

ti-chi-bot closed this as completed in #8767 on Feb 7, 2024
ti-chi-bot pushed a commit that referenced this issue on Feb 7, 2024
ti-chi-bot added the affects-7.5 label on Mar 4, 2024
windtalker added the affects-6.1, affects-6.5, and affects-7.1 labels on Mar 6, 2024
ti-chi-bot pushed a commit that referenced this issue on Mar 6, 2024
ti-chi-bot pushed a commit that referenced this issue on Mar 25, 2024
ti-chi-bot pushed a commit that referenced this issue on Mar 26, 2024
ti-chi-bot pushed a commit that referenced this issue on Mar 26, 2024
ti-chi-bot added the affects-5.4 label on Apr 9, 2024
@seiya-annie

/found customer

ti-chi-bot added the report/customer label on Jun 14, 2024