[YSQL][SQLsmith] Segmentation fault in yb::pggate::YBCPgResetOperationsBuffering() #11371

Open
def- opened this issue Feb 5, 2022 · 0 comments
Labels: area/ysql, kind/bug, kind/failing-test, priority/medium, qa_automation

def- commented Feb 5, 2022

Jira Link: DB-993

Description

Found while looking for PostGIS problems, but it doesn't seem PostGIS-specific. I can't reproduce it; it occurred on 2.11.2.0 on CentOS:

Core was generated by `postgres: yugabyte postgis_reg 127.0.0.1(53780) SELECT                        '.
Program terminated with signal 11, Segmentation fault.
#0  yb::pggate::PgApiImpl::ResetOperationsBuffering (this=0x0) at ../../src/yb/yql/pggate/pggate.cc:1017
#1  0x00007fb865789871 in yb::pggate::YBCPgResetOperationsBuffering () at ../../src/yb/yql/pggate/ybc_pggate.cc:621
#2  0x0000000000a38ca6 in YBResetOperationsBuffering () at ../../../../../../../src/postgres/src/backend/utils/misc/pg_yb_utils.c:1412
#3  0x000000000087f8f3 in yb_exec_query_wrapper (exec_context=exec_context@entry=0x2282000, restart_data=restart_data@entry=0x7fff836fadf0, functor=functor@entry=0x884620 <yb_exec_simple_query_impl>, functor_context=functor_context@entry=0x2282938) at ../../../../../../src/postgres/src/backend/tcop/postgres.c:4424
#4  0x00000000008802ec in yb_exec_simple_query (query_string=query_string@entry=0x2282938 "select  \n  (select id from tm.compoundcurvem4326 limit 1 offset 3)\n     as c0, \n  (select g from tm.tinz limit 1 offset 2)\n     as c1, \n  case when (((cast(null as anyrange) > cast(null as anyrange)) "..., exec_context=exec_context@entry=0x2282000) at ../../../../../../src/postgres/src/backend/tcop/postgres.c:4449
#5  0x0000000000882213 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x227dfe8, dbname=0x2303fe8 "postgis_reg", username=0x2313fe8 "yugabyte") at ../../../../../../src/postgres/src/backend/tcop/postgres.c:5084
#6  0x000000000049e292 in BackendRun (port=0x216c960) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4470
#7  BackendStartup (port=0x216c960) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4136
#8  ServerLoop () at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1754
#9  0x00000000007ea21f in PostmasterMain (argc=argc@entry=23, argv=argv@entry=0x2046000) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1417
#10 0x000000000073588a in PostgresServerProcessMain (argc=23, argv=0x2046000) at ../../../../../../src/postgres/src/backend/main/main.c:234
#11 0x0000000000735a89 in main ()

Query that ran:

select
  (select id from tm.compoundcurvem4326 limit 1 offset 3)
     as c0,
  (select g from tm.tinz limit 1 offset 2)
     as c1,
  case when (((cast(null as anyrange) > cast(null as anyrange))
          or (subq_0.c4 is NULL))
        or (case when subq_0.c5 is not NULL then (select pg_catalog.min(t) from public.g)
               else (select pg_catalog.min(t) from public.g)
               end
             >= (select wkt from public.test_data limit 1 offset 2)
            ))
      and (cast(null as box) <@ case when true then cast(null as box) else cast(null as box) end
          ) then subq_0.c1 else subq_0.c1 end
     as c2,
  case when cast(null as inet) = pg_catalog.inet_client_addr() then subq_0.c8 else subq_0.c8 end
     as c3,
  subq_0.c6 as c4,
  subq_0.c1 as c5,
  subq_0.c4 as c6,
  subq_0.c7 as c7
from
  (select
        ref_0.id as c0,
        ref_0.id as c1,
        (select id from tm.polyhedralsurfacezm limit 1 offset 4)
           as c2,
        ref_0.g as c3,
        ref_0.id as c4,
        ref_0.g as c5,
        ref_0.id as c6,
        ref_0.id as c7,
        ref_0.g as c8
      from
        tm.multipolygonzm4326 as ref_0
      where ((true)
          and (cast(null as float4) >= cast(null as float4)))
        or ((ref_0.gg is NULL)
          and (((select type from public.geography_columns limit 1 offset 2)
                 @@ (select serialized from public.serialize_test limit 1 offset 69)
                )
            or (cast(null as "timestamp") <> cast(null as timestamptz))))
      limit 114) as subq_0
where public.st_zmax(
    cast(case when false then case when subq_0.c3 <= subq_0.c8 then case when cast(null as text) > (select serialized from public.serialize_test limit 1 offset 4)
               then cast(null as box3d) else cast(null as box3d) end
           else case when cast(null as text) > (select serialized from public.serialize_test limit 1 offset 4)
               then cast(null as box3d) else cast(null as box3d) end
           end
         else case when subq_0.c3 <= subq_0.c8 then case when cast(null as text) > (select serialized from public.serialize_test limit 1 offset 4)
               then cast(null as box3d) else cast(null as box3d) end
           else case when cast(null as text) > (select serialized from public.serialize_test limit 1 offset 4)
               then cast(null as box3d) else cast(null as box3d) end
           end
         end
       as box3d)) = pg_catalog.atand(
    cast(cast(nullif(pg_catalog.path_length(
        cast(cast(null as path) as path)),
      case when (subq_0.c0 is NULL)
          or (cast(null as lseg) @ cast(null as box)) then cast(nullif(pg_catalog.pg_notification_queue_usage(),
          (select d from public.knn_cpa_no_index limit 1 offset 1)
            ) as float8) else cast(nullif(pg_catalog.pg_notification_queue_usage(),
          (select d from public.knn_cpa_no_index limit 1 offset 1)
            ) as float8) end
        ) as float8) as float8))
limit 173;

Data export: postgis_reg.sql.zip
Coredump: core.8011.zip

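This is probably caused by another thread handling a signal at the same time; there have already been multiple crashes related to signal handling. In the core dump, thread 2 (an RPC I/O thread) is inside the quickdie signal handler destroying PgApiImpl, while thread 1 calls ResetOperationsBuffering with this=0x0: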

(gdb) info threads
  Id   Target Id         Frame
  3    LWP 8085          pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  2    LWP 8014          syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
* 1    LWP 8011          yb::pggate::PgApiImpl::ResetOperationsBuffering (this=0x0)
    at ../../src/yb/yql/pggate/pggate.cc:1017
(gdb) thread 2
[Switching to thread 2 (LWP 8014)]
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
38	../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007fb86990984b in munmap ()
   from /nfusr/dev-server/dfelsing/yugabyte-2.11.2.0/postgres/../lib/yb-thirdparty/libtcmalloc.so.4
#2  0x00007fb864b2e65a in __free_stacks (limit=limit@entry=41943040) at allocatestack.c:288
#3  0x00007fb864b2e78f in queue_stack (stack=0x7fb84eaad700) at allocatestack.c:312
#4  __deallocate_stack (pd=pd@entry=0x7fb84eaad700) at allocatestack.c:759
#5  0x00007fb864b2f5b9 in __free_tcb (pd=pd@entry=0x7fb84eaad700) at pthread_create.c:243
#6  0x00007fb864b3092c in pthread_join (threadid=140429570529024, thread_return=thread_return@entry=0x0)
    at pthread_join.c:111
#7  0x00007fb865c2512e in yb::ThreadJoiner::Join (this=this@entry=0x7fb84faa60e0)
    at ../../src/yb/util/thread.cc:647
#8  0x00007fb865c2537b in yb::Thread::Join (this=<optimized out>) at ../../src/yb/util/thread.cc:781
#9  0x00007fb85aabee3c in Join (this=<optimized out>) at ../../src/yb/rpc/io_thread_pool.cc:69
#10 yb::rpc::IoThreadPool::Join (this=this@entry=0x2143ee0) at ../../src/yb/rpc/io_thread_pool.cc:100
#11 0x00007fb85aac93f9 in yb::rpc::Messenger::Shutdown (this=0x2143c00) at ../../src/yb/rpc/messenger.cc:219
#12 0x00007fb86579648e in yb::pggate::PgApiImpl::~PgApiImpl (this=0x226b600, __in_chrg=<optimized out>)
    at ../../src/yb/yql/pggate/pggate.cc:268
#13 0x00007fb8657968e1 in yb::pggate::PgApiImpl::~PgApiImpl (this=0x226b600, __in_chrg=<optimized out>)
    at ../../src/yb/yql/pggate/pggate.cc:271
#14 0x00007fb86578781b in yb::pggate::YBCDestroyPgGate () at ../../src/yb/yql/pggate/ybc_pggate.cc:127
#15 0x0000000000a3713c in YBOnPostgresBackendShutdown ()
    at ../../../../../../../src/postgres/src/backend/utils/misc/pg_yb_utils.c:512
#16 0x000000000087c59b in quickdie (postgres_signal_arg=<optimized out>)
    at ../../../../../../src/postgres/src/backend/tcop/postgres.c:2683
#17 <signal handler called>
#18 0x00007fb86426c9f3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#19 0x00007fb865bb1f2f in boost::asio::detail::epoll_reactor::run (this=0x22d00d0, usec=<optimized out>, ops=...)
    at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20211222064200-dd4872fe56-centos7-x86_64-linuxbrew-gcc5/installed/uninstrumented/include/boost/asio/detail/impl/epoll_reactor.ipp:471
#20 0x00007fb85aac0b01 in do_run_one (ec=..., this_thread=..., lock=..., this=0x2044780)
    at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20211222064200-dd4872fe56-centos7-x86_64-linuxbrew-gcc5/installed/uninstrumented/include/boost/asio/detail/impl/scheduler.ipp:385
#21 run (ec=..., this=0x2044780)
    at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20211222064200-dd4872fe56-centos7-x86_64-linuxbrew-gcc5/installed/uninstrumented/include/boost/asio/detail/impl/scheduler.ipp:154
#22 run (this=<optimized out>, ec=...)
    at /opt/yb-build/thirdparty/yugabyte-db-thirdparty-v20211222064200-dd4872fe56-centos7-x86_64-linuxbrew-gcc5/installed/uninstrumented/include/boost/asio/impl/io_context.ipp:70
#23 yb::rpc::IoThreadPool::Impl::Execute (this=<optimized out>) at ../../src/yb/rpc/io_thread_pool.cc:76
#24 0x00007fb865c27705 in operator() (this=0x204ea78)
    at /nfusr/dev-server/dfelsing/yugabyte-2.11.2.0/linuxbrew-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#25 yb::Thread::SuperviseThread (arg=0x204ea20) at ../../src/yb/util/thread.cc:774
#26 0x00007fb864b2f694 in start_thread (arg=0x7fb84faaf700) at pthread_create.c:333
#27 0x00007fb86426c41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
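
The thread 2 backtrace above shows quickdie (the PostgreSQL SIGQUIT handler) running on an I/O pool thread rather than on the main backend thread. One common way to keep a handler off helper threads is to block the signal before spawning them, since new threads inherit the creator's signal mask. The following is a minimal sketch of that idea, assuming POSIX signal masks; it is not YugabyteDB code and not necessarily how this issue should be fixed. `SigquitBlockedHere` and `WorkerStartsWithSigquitBlocked` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <signal.h>
#include <thread>

// Hypothetical helper: reports whether SIGQUIT is blocked in the calling thread.
bool SigquitBlockedHere() {
  sigset_t current;
  sigemptyset(&current);
  // With a null new mask, pthread_sigmask only returns the current mask.
  int rc = pthread_sigmask(SIG_BLOCK, nullptr, &current);
  assert(rc == 0);
  (void)rc;
  return sigismember(&current, SIGQUIT) == 1;
}

// Hypothetical helper: starts a worker with SIGQUIT blocked, then restores
// the caller's mask. Returns what the worker observed about its own mask.
bool WorkerStartsWithSigquitBlocked() {
  sigset_t block, saved;
  sigemptyset(&block);
  sigaddset(&block, SIGQUIT);

  // New threads inherit the creator's signal mask, so block before spawning.
  int rc = pthread_sigmask(SIG_BLOCK, &block, &saved);
  assert(rc == 0);

  bool blocked_in_worker = false;
  std::thread worker([&] { blocked_in_worker = SigquitBlockedHere(); });

  // Restore the caller's original mask so its own SIGQUIT handling is unchanged.
  rc = pthread_sigmask(SIG_SETMASK, &saved, nullptr);
  assert(rc == 0);
  (void)rc;

  worker.join();  // join makes the worker's write visible to this thread
  return blocked_in_worker;
}
```

Under these assumptions, a process-directed SIGQUIT can only be delivered to a thread that does not block it, so a worker started this way would not run the handler. The sketch does not address the separate question of whether tearing down the pggate object from inside a signal handler is safe.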

def- added the kind/bug and area/ysql labels on Feb 5, 2022.
yugabyte-ci added the priority/medium label on Jun 8, 2022.
yugabyte-ci added the kind/enhancement label and removed kind/bug on Aug 22, 2022.
kripasreenivasan added the qa_automation label on Sep 13, 2022.
yugabyte-ci added the kind/bug label and removed kind/enhancement on Sep 13, 2022.
yugabyte-ci added the kind/failing-test label on Oct 12, 2022.