thrift unmarshall bug in 1.12.3 release #617

Closed
hycdong opened this issue Oct 13, 2020 · 1 comment
Labels
type/bug This issue reports a bug.

@hycdong
Contributor

hycdong commented Oct 13, 2020

Bug Report

Pegasus version

Pegasus Server 1.12.3 (a948e89)

Coredump stack

(gdb) bt
#0  0x00007fbb92c8c1d7 in raise () from /lib64/libc.so.6
#1  0x00007fbb92c8d8c8 in abort () from /lib64/libc.so.6
#2  0x00007fbb92c85146 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fbb92c851f2 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fbb96c0cb9f in dsn::binary_reader::read (this=0x7fbb58330d20, buffer=buffer@entry=0x70b900000 "\340\031\367\256\005", sz=sz@entry=16777216)
    at /home/wutao1/pegasus-release/rdsn/src/core/core/binary_reader.cpp:80
#5  0x00000000005e30e5 in read (len=16777216, buf=0x70b900000 "\340\031\367\256\005", this=0x7fbb58330cc0)
    at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:66
#6  apache::thrift::transport::readAll<dsn::binary_reader_transport> (trans=..., buf=0x70b900000 "\340\031\367\256\005", len=16777216)
    at /home/wutao1/pegasus-release/rdsn/thirdparty/output/include/thrift/transport/TTransport.h:41
#7  0x00000000005e9702 in readAll (len=16777216, buf=<optimized out>, this=<optimized out>)
    at /home/wutao1/pegasus-release/rdsn/thirdparty/output/include/thrift/transport/TTransport.h:121
#8  apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport, apache::thrift::protocol::TNetworkBigEndian>::readStringBody<dsn::blob_string> (
    this=this@entry=0x7fbb58330ce0, str=..., size=16777216) at /home/wutao1/pegasus-release/rdsn/thirdparty/output/include/thrift/protocol/TBinaryProtocol.tcc:445
#9  0x00000000005eb949 in readString<dsn::blob_string> (str=..., this=0x7fbb58330ce0)
    at /home/wutao1/pegasus-release/rdsn/thirdparty/output/include/thrift/protocol/TBinaryProtocol.tcc:408
#10 read (iprot=0x7fbb58330ce0, this=0x7fbb58330fb0) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:458
#11 unmarshall_internal (value=..., iproto=0x7fbb58330ce0) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:583
#12 unmarshall (value=..., iproto=0x7fbb58330ce0) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:594
#13 unmarshall_base<dsn::blob> (val=..., iproto=0x7fbb58330ce0) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:608
#14 dsn::unmarshall_thrift_internal<dsn::blob> (val=..., proto=proto@entry=0x7fbb58330ce0)
    at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:665
#15 0x00000000005efc93 in unmarshall_thrift_binary<dsn::blob> (val=..., reader=...) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization_helper/thrift_helper.h:703
#16 unmarshall<dsn::blob> (fmt=<optimized out>, value=..., reader=...) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization.h:73
#17 dsn::unmarshall<dsn::blob> (msg=msg@entry=0x2d4906b14, val=...) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/cpp/serialization.h:101
#18 0x00000000005effe9 in operator() (r=0x2d4906b14, p=0x414fb800, __closure=0x220a138) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/dist/replication/storage_serverlet.h:27
#19 std::_Function_handler<void (dsn::apps::rrdb_service*, dsn::message_ex*), bool dsn::replication::storage_serverlet<dsn::apps::rrdb_service>::register_async_rpc_handler<dsn::blob, dsn::apps::read_response>(dsn::task_code, char const*, void (*)(dsn::apps::rrdb_service*, dsn::blob const&, dsn::rpc_replier<dsn::apps::read_response>&))::{lambda(dsn::apps::rrdb_service*, dsn::message_ex*)#1}>::_M_invoke(std::_Any_data const&, dsn::apps::rrdb_service*, dsn::message_ex*) (__functor=..., __args#0=0x414fb800, __args#1=0x2d4906b14)
    at /home/wutao1/app/include/c++/4.8.2/functional:2071
#20 0x0000000000619a8c in operator() (__args#1=0x2d4906b14, __args#0=0x414fb800, this=<optimized out>) at /home/wutao1/app/include/c++/4.8.2/functional:2464
#21 handle_request (request=0x2d4906b14, this=0x414fb800) at /home/wutao1/pegasus-release/DSN_ROOT/include/dsn/dist/replication/storage_serverlet.h:80
#22 dsn::apps::rrdb_service::on_request (this=0x414fb800, request=0x2d4906b14) at /home/wutao1/pegasus-release/src/include/rrdb/rrdb.server.h:17
#23 0x00007fbb96ac9ab5 in dsn::replication::replica::on_client_read (this=0x2d070c00, request=request@entry=0x2d4906b14)
    at /home/wutao1/pegasus-release/rdsn/src/dist/replication/lib/replica.cpp:177
#24 0x00007fbb96b3641f in dsn::replication::replica_stub::on_client_read (this=0x2af6000, id=..., request=0x2d4906b14)
    at /home/wutao1/pegasus-release/rdsn/src/dist/replication/lib/replica_stub.cpp:811
#25 0x00007fbb96c4afd9 in dsn::task::exec_internal (this=this@entry=0x2d4906cac) at /home/wutao1/pegasus-release/rdsn/src/core/core/task.cpp:180
#26 0x00007fbb96c5f22d in dsn::task_worker::loop (this=0x2b17130) at /home/wutao1/pegasus-release/rdsn/src/core/core/task_worker.cpp:211
#27 0x00007fbb96c5f3f9 in dsn::task_worker::run_internal (this=0x2b17130) at /home/wutao1/pegasus-release/rdsn/src/core/core/task_worker.cpp:191
#28 0x00007fbb935e4600 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
    at /home/qinzuoyan/git.xiaomi/pegasus/toolchain/objdir/../gcc-4.8.2/libstdc++-v3/src/c++11/thread.cc:84
#29 0x00007fbb940f6dc5 in start_thread () from /lib64/libpthread.so.0
#30 0x00007fbb92d4e73d in clone () from /lib64/libc.so.6

Frame 4

(gdb) f 4
#4  0x00007fbb96c0cb9f in dsn::binary_reader::read (this=0x7fbb58330d20, buffer=buffer@entry=0x70b900000 "\340\031\367\256\005", sz=sz@entry=16777216)
    at /home/wutao1/pegasus-release/rdsn/src/core/core/binary_reader.cpp:80
(gdb) p sz
$1 = 16777216
(gdb) p _remaining_size
$2 = 42

Frame 23

(gdb) f 23
#23 0x00007fbb96ac9ab5 in dsn::replication::replica::on_client_read (this=0x2d070c00, request=request@entry=0x2d4906b14)
    at /home/wutao1/pegasus-release/rdsn/src/dist/replication/lib/replica.cpp:177
(gdb) p *(*request).header
$6 = {hdr_type = 1413892180, hdr_version = 0, hdr_length = 192, hdr_crc32 = 0, body_length = 76, body_crc32 = 0, id = 160, trace_id = 0, 
  rpc_name = "RPC_RRDB_RRDB_GET", '\000' <repeats 30 times>, rpc_code = {local_code = 207, local_hash = 0}, gpid = {_value = {u = {app_id = 7, partition_index = 27}, 
      value = 115964116999}}, context = {u = {is_request = 1, is_forwarded = 0, unused = 0, serialize_format = 1, is_forward_supported = 0, reserved = 0}, context = 65}, from_address = {
    static s_invalid_address = {static s_invalid_address = <same as static member of an already seen type>, _addr = {v4 = {type = 0, padding = 0, port = 0, ip = 0}, group = {type = 0, 
          group = 0}, value = 0}}, _addr = {v4 = {type = 1, padding = 0, port = 41404, ip = 175579414}, group = {type = 1, group = 188526960923574272}, value = 754107843694297089}}, 
  client = {timeout_ms = 0, thread_hash = 55460, partition_hash = 0}, server = {error_name = '\000' <repeats 47 times>, error_code = {local_code = 0, local_hash = 0}}}
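
A side note on reading the header dump: hdr_type is printed as an integer, but it is easier to interpret as four ASCII bytes. The helper below is only an illustrative sketch (tag_to_ascii is a hypothetical name, not an rDSN function) and assumes a little-endian host, so the lowest byte comes first. Applied to 1413892180 it yields "THFT", which, together with serialize_format = 1, suggests the request used the Thrift-framed header; that reading is an assumption and is not confirmed anywhere in this report.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: interpret a 32-bit tag as four ASCII characters,
// lowest byte first (the in-memory order on a little-endian host).
inline std::string tag_to_ascii(std::uint32_t tag)
{
    std::string s(4, '\0');
    for (int i = 0; i < 4; ++i) {
        s[i] = static_cast<char>((tag >> (8 * i)) & 0xFF);
    }
    return s;
}
```

For example, tag_to_ascii(1413892180u) gives the string "THFT".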

Simple analysis
The stack shows that the server received a read request (RPC_RRDB_RRDB_GET) from a user client with body_length = 76 and tried to unmarshall the request body into a blob. The unmarshall went wrong and triggered this coredump: the reader was asked for 16777216 bytes when only 42 remained (frame 4), so the assertion in dsn::binary_reader::read failed. So far I can only tell that the blob size was computed incorrectly; the root cause has not been found yet.
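
For context on the numbers: Thrift's binary protocol encodes a string or binary field as a 4-byte big-endian length followed by that many bytes, and 16777216 is 0x01000000, i.e. the prefix bytes 01 00 00 00. The sketch below is not the rDSN or Thrift code and does not claim to show the root cause. byte_view, read_be_i32 and checked_read_string are hypothetical names. It only illustrates how a length prefix decoded from a malformed or misaligned body can far exceed the bytes that remain, and how checking it against the remaining size turns an assertion failure into a recoverable error.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical cursor over a request body; not an rDSN type.
struct byte_view {
    const unsigned char *data;
    std::size_t remaining;
};

// Decode a 4-byte big-endian length prefix and advance the cursor.
// Returns false if fewer than 4 bytes are left.
inline bool read_be_i32(byte_view &v, std::int32_t &out)
{
    if (v.remaining < 4) {
        return false;
    }
    std::uint32_t u = (static_cast<std::uint32_t>(v.data[0]) << 24) |
                      (static_cast<std::uint32_t>(v.data[1]) << 16) |
                      (static_cast<std::uint32_t>(v.data[2]) << 8) |
                      static_cast<std::uint32_t>(v.data[3]);
    v.data += 4;
    v.remaining -= 4;
    out = static_cast<std::int32_t>(u);
    return true;
}

// Read a length-prefixed string. A negative length, or one larger than the
// bytes still available, is rejected instead of being passed to the reader.
// On rejection the cursor has already moved past the 4-byte prefix.
inline bool checked_read_string(byte_view &v, std::string &out)
{
    std::int32_t len = 0;
    if (!read_be_i32(v, len)) {
        return false;
    }
    if (len < 0 || static_cast<std::size_t>(len) > v.remaining) {
        return false;
    }
    out.assign(reinterpret_cast<const char *>(v.data), static_cast<std::size_t>(len));
    v.data += len;
    v.remaining -= static_cast<std::size_t>(len);
    return true;
}
```

With a body that starts with the bytes 01 00 00 00 and has 42 bytes left after the prefix, checked_read_string returns false rather than requesting 16777216 bytes.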

@levy5307
Contributor

Fixed in #790.
